Aeterna Posted November 12, 2008

I have a .txt file:

    B9191
    P274852
    B6262
    P66
    B7142325
    P9649
    B862615
    P42813
    P379443

It looks pretty much like this for 80,000 lines. I want to remove everything that isn't a P or a B while keeping the one-letter-per-line structure. How can I do this? Thanks in advance!

martin Posted November 12, 2008

Say your text is in a file called testreppb.txt, then

    $s = fileread("testreppb.txt")
    $s = StringRegExpReplace($s,"[^P,B,\r]","")
    filewrite("converted.txt",$s)

will write the result to converted.txt.

happyuser Posted November 12, 2008

Be more precise, the question is not clear. Do you want to keep all lines beginning with P or B, or do you want to keep only the B and P characters? Do you want to keep the line structure (CR/LF)? Try something like

    While 1
        $line = FileReadLine($file)
        If @error = -1 Then ExitLoop
        If $line[1]='B' or $line[1]='P' then FileWriteLine($g,$line)
    Wend

It should produce a new file containing only the lines beginning with P or B.

rasim Posted November 12, 2008

Aeterna, try this:

    $file = @ScriptDir & "\test.txt"
    $sRead = FileRead($file)
    $sResult = ""
    $aPars = StringRegExp($sRead, "(?i)([B,P].*)\r\n", 3)
    For $i = 0 To UBound($aPars) - 1
        $sResult &= $aPars[$i] & @CRLF
    Next
    $hFile = FileOpen($file, 2)
    FileWrite($hFile, $sResult)
    FileClose($hFile)

martin Posted November 12, 2008

Well, the OP seemed pretty precise to me: remove everything except P or B, but keep one letter per line. But if we're talking precise, then what is $file and what is $g? And since FileReadLine returns a line of text, $line[1] will cause an error; you mean StringLeft($line,1).

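martin's correction can be sketched as a complete loop. This is a minimal sketch, not happyuser's actual script: the handle names $hIn and $hOut are hypothetical, and the file names are borrowed from martin's earlier example. It keeps whole lines that begin with B or P, which is what the loop under discussion was aiming for.

```autoit
; Hypothetical handles and file names, chosen for illustration only.
Local $hIn = FileOpen("testreppb.txt", 0)   ; 0 = read mode
Local $hOut = FileOpen("converted.txt", 2)  ; 2 = write mode, erases old contents
If $hIn = -1 Or $hOut = -1 Then Exit MsgBox(16, "Error", "Could not open files")

While 1
    Local $sLine = FileReadLine($hIn)
    If @error Then ExitLoop ; -1 at end of file, 1 on a read error
    ; FileReadLine returns a string, so take its first character
    ; with StringLeft instead of indexing it like an array.
    Local $sFirst = StringLeft($sLine, 1)
    ; == compares strings case-sensitively; = would also accept b and p.
    If $sFirst == "B" Or $sFirst == "P" Then FileWriteLine($hOut, $sLine)
WEnd

FileClose($hIn)
FileClose($hOut)
```

Opening both files once and passing handles avoids reopening the input on every FileReadLine call, which matters for an 80,000-line file.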
Richard Robertson Posted November 12, 2008

To me it read as "erase lines that don't have P or B, and make sure that there is only one letter, the rest numbers." I think clarification is important.

Aeterna (author) Posted November 12, 2008

Sorry if there's any confusion. The resulting file should look like this:

    B
    P
    B
    B
    B
    P
    P
    P

and so on. There are a few lines with just numbers; those lines need to be removed completely. And because I'm going to try to parse this text file later, I think all spaces should be removed too. Hope this clarifies it!

ProgAndy Posted November 12, 2008

OK, this should work:

    $s = "B9191" & @CRLF & _
         "P274852" & @CRLF & _
         "B6262" & @CRLF & _
         "P66" & @CRLF & _
         "B7142325" & @CRLF & _
         "P9649" & @CRLF & _
         "B862615" & @CRLF & _
         "s862615" & @CRLF & _
         "P42813" & @CRLF & _
         "P379443" & @CRLF
    $s = StringReplace(StringRegExpReplace($s,"(?s)([^PB\r\n])",""),@CRLF&@CRLF,@CRLF)
    MsgBox(0, '', $s)

Aeterna (author) Posted November 12, 2008

If you read the first line, I said this goes on for about 80,000 lines. So setting the variable like that wouldn't help, right?

ProgAndy Posted November 12, 2008

Well, this is just for testing. Just replace the $s = ... with $s=FileRead("file"):

    $s=FileRead("file")
    $s = StringReplace(StringRegExpReplace($s,"(?s)([^PB\r\n])",""),@CRLF&@CRLF,@CRLF)
    MsgBox(0, '', $s)

Now the only problem could be the maximum string length of 2147483647 characters.

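One caveat with the single StringReplace pass: it replaces non-overlapping pairs of @CRLF, so two or more consecutive numbers-only lines would still leave a blank line behind. A hedged variation, assuming Windows (CRLF) or LF line endings, collapses whole runs of line breaks with a regular expression instead. The output name converted.txt is an assumption borrowed from martin's post, and "file" is the same placeholder used above.

```autoit
Local $s = FileRead("file") ; "file" is a placeholder for the real path
If @error Then Exit MsgBox(16, "Error", "Could not read the input file")

; Step 1: delete every character that is not P, B or part of a line break.
$s = StringRegExpReplace($s, "[^PB\r\n]", "")
; Step 2: collapse any run of line breaks (blank lines) into a single CRLF.
$s = StringRegExpReplace($s, "(\r?\n)+", @CRLF)
; Step 3: drop the line break left at the very start if the first line was emptied.
$s = StringRegExpReplace($s, "^\r\n", "")

Local $hOut = FileOpen("converted.txt", 2) ; 2 = write mode, erases old contents
FileWrite($hOut, $s)
FileClose($hOut)
```

A line that contains several P or B characters keeps all of them with this approach, so it only yields one letter per line when each input line has at most one.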
Valuater Posted November 12, 2008

TESTED OK. Be sure to make a back-up copy of the file... FIRST

    #include <File.au3>
    $File = @ScriptDir & "\Log.txt"
    $x = ""
    While Not @error
        $x += 1
        $Line = FileReadLine($File, $x)
        If @error Then ExitLoop
        _FileWriteToLine($File, $x, StringLeft($Line, 1), 1)
    WEnd

8)

ProgAndy Posted November 12, 2008

This doesn't work with his additional information:

    "There are a few lines with just numbers, those lines need to be removed completely."

Valuater Posted November 12, 2008

OK... NOT TESTED

    #include <File.au3>
    $File1 = @ScriptDir & "\Log.txt"
    $File2 = @ScriptDir & "\New_Log.txt"
    $x = ""
    While Not @error
        $Line = FileReadLine($File1)
        If @error Then ExitLoop
        $Line = StringLeft($Line, 1)
        If IsNumber($Line) Then ContinueLoop
        $x += 1
        If IsString($Line) Then _FileWriteToLine($File2, $x, $Line)
    WEnd

8)

youknowwho4eva Posted November 12, 2008

    #include <File.au3>
    $File1 = @ScriptDir & "\Log.txt"
    $File2 = @ScriptDir & "\New_Log.txt"
    While Not @error
        $Line = FileReadLine($File1)
        If @error Then ExitLoop
        $Line = StringLeft($Line, 1)
        If IsNumber($Line) Then ContinueLoop
        If IsString($Line) Then FileWriteLine($File2, $Line)
    WEnd

I made a small edit. I'm not sure if it's any better or worse, but it looks like it would work better. It looked like yours would be writing over itself and only adding new lines if there was a number in between.

Aeterna (author) Posted November 12, 2008

Thank you guys for all your help, we're getting close, hehe!

Some segments look like this:

    P,1,5,2,0,9,6,9,0
    24 31 6 45
    P,4,9,3,1,x,5,4,x
    P,5,9,9,4,2,1,3,5
    B,8,1,5,3,x,0,1,x

The end result should be:

- 1 letter per line.
- Only B's and P's remain.
- T's, #'s, and ","s should be removed.
- There should be no blank lines.

martin Posted November 12, 2008

This will do what you want, I think, but if there is a line with both P and B it will be replaced by P.

    #include <file.au3>
    Dim $array
    $s = _FileReadToArray("testreppb.txt", $array);read the text file to an array of lines
    $file = FileOpen("converted.txt", 2);open the file to write the results to
    For $n = 1 To $array[0]
        If StringInStr($array[$n], "P") Then
            FileWriteLine($file, "P")
        Else
            If StringInStr($array[$n], "B") Then FileWriteLine($file, "B")
        EndIf
    Next
    FileClose($file)

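A regular-expression variation on the same idea is sketched below, under the assumption that "one letter per line" means the first uppercase P or B that appears on each line. The file names are the ones from martin's script. Unlike the StringInStr version, a line containing both letters yields whichever comes first, and matching is case-sensitive, so a line like s862615 is dropped.

```autoit
Local $sText = FileRead("testreppb.txt")
If @error Then Exit MsgBox(16, "Error", "Could not read testreppb.txt")

; Flag 3 returns an array of all matches of the capture group.
; (?m) lets ^ match at the start of every line; [^PB\r\n]* skips
; anything before the first P or B without crossing a line break.
Local $aHits = StringRegExp($sText, "(?m)^[^PB\r\n]*([PB])", 3)
If @error Then Exit MsgBox(48, "Nothing found", "No line contains P or B")

; Build the output: one captured letter per line, no blank lines.
Local $sOut = ""
For $i = 0 To UBound($aHits) - 1
    $sOut &= $aHits[$i] & @CRLF
Next

Local $hOut = FileOpen("converted.txt", 2) ; 2 = write mode, erases old contents
FileWrite($hOut, $sOut)
FileClose($hOut)
```

Lines with no P or B (numbers only, spaces, blanks) never produce a match, so they vanish from the output without a separate cleanup step.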