convert tabs to spaces


A tab character moves the text following the tab to the next tab stop (in your example this is always a multiple of 4).
So you have to determine the position of the tab character, calculate the number of spaces needed to the next tab stop and replace the tab character with the resulting number of characters.
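
As a rough sketch of that calculation: the helper name _SpacesToNextStop and its 1-based position argument are assumptions made for illustration, not something taken from the posts in this thread. The tab's position minus one is the number of characters in front of it, and the remainder of that count divided by the tab length tells you how far into the current tab interval you already are.

```autoit
; Hypothetical helper: number of spaces that replace a tab found at the
; 1-based position $iPos when tab stops sit every $iTabLen columns.
Func _SpacesToNextStop($iPos, $iTabLen = 4)
    ; $iPos - 1 characters sit in front of the tab, so that is the current column.
    Return $iTabLen - Mod($iPos - 1, $iTabLen)
EndFunc   ;==>_SpacesToNextStop

ConsoleWrite(_SpacesToNextStop(1) & @CRLF) ; tab at the start of the line: 4 spaces
ConsoleWrite(_SpacesToNextStop(3) & @CRLF) ; two characters before the tab: 2 spaces
ConsoleWrite(_SpacesToNextStop(5) & @CRLF) ; tab sits exactly on a tab stop: a full 4 spaces
```

Note that a tab sitting exactly on a tab stop still expands to a full tab length, never to zero spaces.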

My UDFs and Tutorials:

Spoiler

UDFs:
Active Directory (NEW 2024-07-28 - Version 1.6.3.0) - Download - General Help & Support - Example Scripts - Wiki
ExcelChart (2017-07-21 - Version 0.4.0.1) - Download - General Help & Support - Example Scripts
OutlookEX (2021-11-16 - Version 1.7.0.0) - Download - General Help & Support - Example Scripts - Wiki
OutlookEX_GUI (2021-04-13 - Version 1.4.0.0) - Download
Outlook Tools (2019-07-22 - Version 0.6.0.0) - Download - General Help & Support - Wiki
PowerPoint (2021-08-31 - Version 1.5.0.0) - Download - General Help & Support - Example Scripts - Wiki
Task Scheduler (2022-07-28 - Version 1.6.0.1) - Download - General Help & Support - Wiki

Standard UDFs:
Excel - Example Scripts - Wiki
Word - Wiki

Tutorials:
ADO - Wiki
WebDriver - Wiki

 


35 minutes ago, Jos said:

Understand what I mean by tab-stops?  ( Fixed positions on the line where the tab would jump to)

Oh, this is a completely different view. A very long time ago, when I was in school, the term tab stop was still very familiar to me, but that knowledge has faded over time.

Now I see the difference again between "replace with the tab width" and "replace up to the next tab stop". :D I'm glad about this clarification (and I have to change my article, in which I confused tab width with tab stop). :P

Edited by Professor_Bernd

15 hours ago, Professor_Bernd said:

After all this I realized that the reverse conversion of spaces to tabs can't work.

Not sure... untested, but the position and width of each tab could be stored in an array, which should allow the reverse... maybe :huh2:

#Include <Array.au3>

$sInputString = "This" & Chr(9) & "is" & Chr(9) & "a" & Chr(9) & "nice" & Chr(9) & "longbutnolittle" & Chr(9) & "test"

Local $a[0][0]

ConsoleWrite($sInputString & @CRLF) ; string with tabs
$new = _VirtualTabs($sInputString)
ConsoleWrite($new & @CRLF) ; string with only spaces (no tabs)
_ArrayDisplay($a)


Func _VirtualTabs($sString, $iTabStop = 8)
    StringReplace($sString, Chr(9), Chr(9))
    Local $iNrOfTabs = @extended ; number of tabs in the string (if any)
    Redim $a[$iNrOfTabs][2]
    For $i = 1 To $iNrOfTabs
        $sis = StringInStr($sString, Chr(9))
        $a[$i-1][0] = $sis
        Local $iMod = Mod($sis, $iTabStop)
        $n = (($iMod * -1) + $iTabStop * ($iMod <> 0)) + 1
        $a[$i-1][1] = $n
        $sString = StringReplace($sString, Chr(9), StringFormat('%' & $n & 's', ""), 1)
    Next
    Return $sString
EndFunc   ;==>_VirtualTabs

 


16 hours ago, Professor_Bernd said:

After all this I realized that the reverse conversion of spaces to tabs can't work.

Excuse my poor English. Was my wording unclear? Is this wording better? :think:

Quote

After all this I realized that the reverse conversion, from spaces to tabs, can't work.

What I meant was a function that converts in the opposite direction. If the normal direction is to convert tabs to spaces, then the reverse direction would be to convert spaces to tabs. I think your function is more of an undo function than a SpacesToTabs converter.
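
One way to see why a general spaces-to-tabs conversion is ambiguous: two different inputs can expand to exactly the same text, so the expanded text alone cannot tell you which one you started from. The sketch below uses a hypothetical helper, _ExpandTabs, written only for this demonstration (single line, tab length 4 assumed).

```autoit
; Hypothetical helper for this demonstration only: expands tabs in one line.
Func _ExpandTabs($sLine, $iTabLen = 4)
    Local $sPad, $iCount
    Local $iPos = StringInStr($sLine, @TAB, 1)
    While $iPos
        ; Spaces needed to reach the next tab stop from this tab.
        $iCount = $iTabLen - Mod($iPos - 1, $iTabLen)
        $sPad = ""
        For $i = 1 To $iCount
            $sPad &= " "
        Next
        $sLine = StringReplace($sLine, @TAB, $sPad, 1, 1)
        $iPos = StringInStr($sLine, @TAB, 1)
    WEnd
    Return $sLine
EndFunc   ;==>_ExpandTabs

Local $sWithTab = "a" & @TAB & "b" ; one tab between the letters
Local $sWithSpaces = "a   b"       ; three literal spaces between the letters

; Both inputs expand to "a   b", so the comparison is True.
ConsoleWrite((_ExpandTabs($sWithTab) == _ExpandTabs($sWithSpaces)) & @CRLF)
```

An "undo" only works if the original tab positions were recorded during the conversion, as in the array approach above.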

 


Yes, I understood "reverse conversion" as an undo function. Did you mean something different? If so, I really can't see the point of such a conversion, or what the expected result would be.

Edit
Your English is not poor - mine is  :)

Edited by mikell

18 hours ago, Professor_Bernd said:

... Can I post your code on the German Autoit Forum and name you as the author? If it's ok with you, I'm going to rename the function there a bit.

From
Func _VirtualTabs($sString, $iTabStop = 4)
to
Func TabsToSpaces($sString, $iTabSize = 4) ; or $iTabWidth ?  (see note 2)
Func TabsToSpaces($sString, $iTabStop = 4)

... or whatever you like best, the main thing is that a user can find the function when searching for it.

 

Hi @Professor_Bernd, feel free to use it however you like.

 

18 hours ago, Professor_Bernd said:

Thanks for your good work! 👍

You are welcome. I'm glad you find it useful.

 

18 hours ago, Professor_Bernd said:

... When I see your solution now, "StringFormat" is actually obvious for this task. But I really didn't come up with it. :huh2:

The key here is the Mod() function.
You can read its result as the number of characters already used in the current tab interval; subtracting that number from the tab length gives the characters still free before the next tab stop. Because the position passed to Mod() includes the tab character itself, the code adds 1 to get the number of spaces that replace the @TAB in the string.

StringFormat is used in a somewhat unusual way here, i.e. to generate a string with an arbitrary number of spaces. (For an example of this, have a look at this function I posted some time ago that uses this technique.)

Combining the two, we can easily achieve our goal.

Since an example is better than many words, here is a slightly revised and more explanatory version of the function:

$sInputString = "This" & Chr(9) & "is" & Chr(9) & "a" & Chr(9) & "nice" & Chr(9) & "longbutnolittle" & Chr(9) & "test"

ConsoleWrite($sInputString & @CRLF) ; string with tabs
ConsoleWrite(_TabsToSpaces($sInputString) & @CRLF) ; string with only spaces (no tabs)

Func _TabsToSpaces($sString, $iTabLen = 8)

    StringReplace($sString, Chr(9), Chr(9)) ; this is to "count" the number of @tab in the string
    Local $iNrOfTabs = @extended, $iMod, $iTabPos, $iSpaceLeft ; number of tabs in the string (if any) is in @extended

    ; this loop processes all the @tab
    ; (if there are no tabs the loop is NOT executed at all and script goes directly to the return)
    For $i = 1 To $iNrOfTabs
        $iTabPos = StringInStr($sString, @TAB) ; position of the @tab, chr(9) within the string
        $iMod = Mod($iTabPos, $iTabLen) ; $iMod is the "number of chars already used" (of the total tab len)
        $iSpaceLeft = $iTabLen - $iMod ; $iSpaceLeft are spaces left before the next tabstop
        $iSpaceLeft *= ($iMod <> 0) ; this is same as IF $iMod = 0 then $iSpaceLeft = 0
        $sString = StringReplace($sString, Chr(9), StringFormat('%' & ($iSpaceLeft + 1) & 's', ""), 1)
    Next ; process next @tab
    Return $sString
EndFunc   ;==>_TabsToSpaces

 

Edited by Chimp

 

Chimp

small minds discuss people average minds discuss events great minds discuss ideas.... and use AutoIt....


One small suggestion: replace

$sString = StringReplace($sString, Chr(9), StringFormat('%' & ($iSpaceLeft + 1) & 's', ""), 1)

with

$sString = StringReplace($sString, Chr(9), _StringRepeat(" ", $iSpaceLeft + 1), 1)

Makes it more readable IMO. 


4 hours ago, Nine said:

One small suggestion replace :

$sString = StringReplace($sString, Chr(9), StringFormat('%' & ($iSpaceLeft + 1) & 's', ""), 1)

with

$sString = StringReplace($sString, Chr(9), _StringRepeat(" ", $iSpaceLeft + 1), 1)

Makes it more readable IMO. 

I agree, it makes it more readable; I wrote it that way just for the fun of keeping it compact and using only native AutoIt functions, avoiding the need to #include <String.au3>

 


11 hours ago, mikell said:

Yes I understood "reverse conversion" as an undo function. Did you mean something different ?

In post #1 I have now reworded it, and I hope it is easier to understand: "After all this I realized that the reverse conversion direction (from spaces to tabs) cannot work. (see posting)"

11 hours ago, mikell said:

If so, I really can't see the interest of such a conversion, and what could be the expected result

This is what I meant by "cannot work". ;)


12 hours ago, Chimp said:

I put here a slightly revised and more explanatory version of the function

This is a good thing. Thanks.

There is one more weakness to consider: your code is only practical for small strings. For example, converting a file with more than 4,000 lines takes far too long. "AutoIt3Wrapper.au3" is a good test file because it is full of tabs and has almost 5,000 lines.

I made a copy and converted it with the code. After 5 minutes the conversion was still running and I aborted.

$sInputString = FileRead(".\test files\AutoIt3Wrapper - copy.au3")

; copy the result to the clipboard to paste it into a new Au3.
Global $sRet = _TabsToSpaces($sInputString, 4) & @CRLF ; string with only spaces (no tabs)
ClipPut($sRet)

I'm unavailable for most of today. But if you ever want to work on a solution, I won't stop you. ^_^


497 ms :

#include <String.au3>
#include <File.au3>
#include <MsgBoxConstants.au3> ; for $MB_SYSTEMMODAL

Local $hTimer = TimerInit()
Local $aFileLines
_FileReadToArray("Temp\AutoIt3Wrapper.au3",$aFileLines)
if @error Then Exit MsgBox ($MB_SYSTEMMODAL,"",@error)
For $i = 1 to $aFileLines[0]
  $aFileLines[$i] = _TabsToSpaces($aFileLines[$i])
Next
_FileWriteFromArray("Temp\AutoIt3Wrapper New.au3",$aFileLines,1)
MsgBox ($MB_SYSTEMMODAL,"",TimerDiff($hTimer))

Func _TabsToSpaces($sString, $iTabLen = 8)
  Local $iMod, $iSpaceLeft
  Local $iTabPos = StringInStr($sString, @TAB)
  While $iTabPos
    $iMod = Mod($iTabPos, $iTabLen)
    $iSpaceLeft = $iMod ? $iTabLen - $iMod : 0
    $sString = StringReplace($sString, Chr(9), _StringRepeat(" ", $iSpaceLeft + 1), 1)
    $iTabPos = StringInStr($sString, @TAB)
  WEnd
  Return $sString
EndFunc   ;==>_TabsToSpaces

 


@Nine

I tried and studied your code yesterday. Processing the 5,000-line file takes the blink of an eye. I find the use of _FileReadToArray() and _FileWriteFromArray() very interesting! I first thought about reading the file line by line, converting it line by line, and then merging the lines into a complete text and saving it as a file. But ... your solution is much more elegant!

I also like your _TabsToSpaces code! I find it very readable (after I understood the ternary operator). ;)

Thanks for your good work to you too. 👍
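
As a variation on the per-line idea, here is a minimal sketch that walks each line only once instead of searching it again for every tab. The function name _TabsToSpacesSplit is hypothetical and the code is untimed, so treat it as an illustration rather than a measured improvement. It assumes it is called with a single line, because tab stops are counted from the start of the line.

```autoit
; Hypothetical single-pass variant; name and default tab length are assumptions.
Func _TabsToSpacesSplit($sLine, $iTabLen = 8)
    ; Flag 2 ($STR_NOCOUNT): 0-based array without the element count in [0].
    Local $aParts = StringSplit($sLine, @TAB, 2)
    Local $sOut = $aParts[0], $iPad
    For $i = 1 To UBound($aParts) - 1
        ; Every tab pads the output built so far up to the next tab stop.
        $iPad = $iTabLen - Mod(StringLen($sOut), $iTabLen)
        For $j = 1 To $iPad
            $sOut &= " "
        Next
        $sOut &= $aParts[$i]
    Next
    Return $sOut
EndFunc   ;==>_TabsToSpacesSplit

ConsoleWrite(_TabsToSpacesSplit("This" & @TAB & "is" & @TAB & "a" & @TAB & "test") & @CRLF)
```

Since the length of the output built so far is already the current column, no position search and no +1 correction are needed here.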


  • 2 weeks later...

Speed optimization of Nine's latest code:

In StringInStr() and StringReplace() use CaseSense=1, this is MUCH faster

orig:

Func _TabsToSpaces($sString, $iTabLen = 8)
  Local $iMod, $iSpaceLeft
  Local $iTabPos = StringInStr($sString, @TAB)
  While $iTabPos
    $iMod = Mod($iTabPos, $iTabLen)
    $iSpaceLeft = $iMod ? $iTabLen - $iMod : 0
    $sString = StringReplace($sString, Chr(9), _StringRepeat(" ", $iSpaceLeft + 1), 1)
    $iTabPos = StringInStr($sString, @TAB)
  WEnd
  Return $sString
EndFunc   ;==>_TabsToSpaces

optimised:

Func _TabsToSpaces($sString, $iTabLen = 8)
  Local $iMod, $iSpaceLeft
  Local $iTabPos = StringInStr($sString, @TAB, 1)
  While $iTabPos
    $iMod = Mod($iTabPos, $iTabLen)
    $iSpaceLeft = $iMod ? $iTabLen - $iMod : 0
    $sString = StringReplace($sString, Chr(9), _StringRepeat(" ", $iSpaceLeft + 1), 1, 1)
    $iTabPos = StringInStr($sString, @TAB, 1)
  WEnd
  Return $sString
EndFunc   ;==>_TabsToSpaces

 


To further enhance speed I would create a string with spaces of length $iTabLen before the While loop.
In the While loop I would replace _StringRepeat with StringLeft.
This way you create the padding string only once.

An internal function is always faster than a UDF function.
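
A minimal sketch of that suggestion, applied to the optimised function from the previous post. This is one possible reading of the idea and has not been timed; the variable name $sPad is an assumption.

```autoit
Func _TabsToSpaces($sString, $iTabLen = 8)
    ; Build the padding string once, before the loop ($sPad is a hypothetical name).
    Local $sPad = ""
    For $i = 1 To $iTabLen
        $sPad &= " "
    Next

    Local $iMod, $iSpaceLeft
    Local $iTabPos = StringInStr($sString, @TAB, 1)
    While $iTabPos
        $iMod = Mod($iTabPos, $iTabLen)
        $iSpaceLeft = $iMod ? $iTabLen - $iMod : 0
        ; StringLeft is built in, so #include <String.au3> is no longer needed.
        ; $iSpaceLeft + 1 is at most $iTabLen, so $sPad is always long enough.
        $sString = StringReplace($sString, @TAB, StringLeft($sPad, $iSpaceLeft + 1), 1, 1)
        $iTabPos = StringInStr($sString, @TAB, 1)
    WEnd
    Return $sString
EndFunc   ;==>_TabsToSpaces
```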


1 hour ago, Zedna said:

In StringInStr() and StringReplace() use CaseSense=1, this is MUCH faster

$STR_NOCASESENSE (0) = not case sensitive, using the user's locale (default)
$STR_CASESENSE (1) = case sensitive
$STR_NOCASESENSEBASIC (2) = not case sensitive, using a basic/faster comparison

As I understand it, this corresponds to the following tests

(0): If (substring = "a") Or (substring = "A") Then
(1): If substring = "a" Then
(2): I suspect a binary comparison(?)

Is that correct? And if it is, why are the two functions faster with a case-sensitive comparison when searching for @TAB?

Thanks for your tip! :)


With CaseSense=0 the function GENERALLY has to check all possible case combinations, which is slower (even for digits or special characters, where case has no real meaning).

 

Here is a test script (using digits) that you can play with to try other characters:

$s1 = ""
For $i = 1 To 5000
    $s1 &= "1234567890"
Next
$s2 = $s1

; ***

$i = 0
$start = TimerInit()
While 1
    $i = StringInStr($s1, "1", 0, 1, $i+1)
    If $i = 0 Then ExitLoop
WEnd
ConsoleWrite("StringInStr case=0: " & TimerDiff($start) & @CRLF)

$i = 0
$start = TimerInit()
While 1
    $i = StringInStr($s1, "1", 1, 1, $i+1)
    If $i = 0 Then ExitLoop
WEnd
ConsoleWrite("StringInStr case=1: " & TimerDiff($start) & @CRLF)

; ***

$start = TimerInit()
While 1
    $s1 = StringReplace($s1, "1", "x", 1, 0)
    If @extended = 0 Then ExitLoop
WEnd
ConsoleWrite("StringReplace case=0: " & TimerDiff($start) & @CRLF)

$start = TimerInit()
While 1
    $s2 = StringReplace($s2, "1", "x", 1, 1)
    If @extended = 0 Then ExitLoop
WEnd
ConsoleWrite("StringReplace case=1: " & TimerDiff($start) & @CRLF)


#cs
Result:
StringInStr case=0: 45.8812141335013
StringInStr case=1: 33.798381685618
StringReplace case=0: 12787.2873669099
StringReplace case=1: 698.489838386663
#ce

 

Edited by Zedna

1 hour ago, Professor_Bernd said:

Do you think this is a theoretical difference or a noticeable difference? (I have not tested it yet.)

A UDF needs to be interpreted at runtime. An internal function is already compiled to machine code.
You will notice a difference if you call a function many times in a loop.
