SoftWearInGinEar Posted March 16, 2023
Hello to all! I am new to this forum, and still somewhat new to AutoIt. I would like to know how to write a function for my script that does the following: I have a generated output file (.dat) with six columns and an unspecified number of rows. Sometimes I get a zero and/or blank columns (see attached). I would like this function to read through the file and delete the rows with zeros, as well as blank lines, after my script finishes running. Any help would be appreciated!
Attachment: OutputData.dat

jchd Posted March 16, 2023
You can do that with FileRead, StringRegExpReplace and FileWrite. See the help file.

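A minimal sketch of the FileRead / StringRegExpReplace / FileWrite route, under a few assumptions that are not from the thread: fields are separated by single Tabs, a "bad" field is either empty or exactly 0, and the helper name _DeleteBadRowsRegExp is made up for illustration. Note that a title row padded with empty fields would be removed as well.

```autoit
#include <FileConstants.au3>

; Hypothetical helper (not from the thread): rewrites $sPath in place,
; dropping every line that is empty or has an empty or "0" field.
Func _DeleteBadRowsRegExp($sPath)
    Local $sText = FileRead($sPath)
    If @error = 1 Then Return SetError(1, 0, False) ; file could not be read

    ; (?m)               ^ matches at the start of every line
    ; ^[^\r\n]*?         lazily skip to a field boundary on this line
    ; (?:^|\t)           the field starts at line start or after a Tab
    ; 0?                 the field is empty or a lone 0 ...
    ; (?=[\t\r\n]|\z)    ... and ends right here
    ; [^\r\n]*\R?        take the rest of the line and its line break
    Local Const $sPattern = '(?m)^[^\r\n]*?(?:^|\t)0?(?=[\t\r\n]|\z)[^\r\n]*\R?'
    Local $sClean = StringRegExpReplace($sText, $sPattern, '')
    If @error Then Return SetError(2, 0, False) ; pattern error

    Local $hFile = FileOpen($sPath, $FO_OVERWRITE)
    If $hFile = -1 Then Return SetError(3, 0, False)
    FileWrite($hFile, $sClean)
    FileClose($hFile)
    Return True
EndFunc
```

Values such as 0.011% or 7:30:20 are left alone because the 0 has to fill the whole field.
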
Nine Posted March 16, 2023
You could use FileReadToArray to get all lines into an array, then use StringSplit with @TAB to get the fields of each line into another array. From there you would only have to delete the undesired rows and save the result to a new file with _FileWriteFromArray.

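The array route described above could look roughly like this. It is a sketch, not a tested solution for the attached file: the helper names, the $iHeaderRows parameter (leading lines that are always kept) and the rule "a field is bad if it is empty or exactly 0" are assumptions.

```autoit
#include <File.au3>
#include <StringConstants.au3>

; Hypothetical helper: copies $sInPath to $sOutPath without the bad rows.
; The first $iHeaderRows lines are kept unconditionally.
Func _DeleteBadRowsArray($sInPath, $sOutPath, $iHeaderRows = 0)
    Local $aLines = FileReadToArray($sInPath)
    If @error Then Return SetError(1, 0, False)

    Local $iKept = 0
    For $i = 0 To UBound($aLines) - 1
        If $i >= $iHeaderRows And _RowIsBad($aLines[$i]) Then ContinueLoop
        $aLines[$iKept] = $aLines[$i] ; compact the kept rows toward the front
        $iKept += 1
    Next
    If $iKept = 0 Then Return SetError(2, 0, False) ; nothing left to write

    ReDim $aLines[$iKept]
    Return _FileWriteFromArray($sOutPath, $aLines) = 1
EndFunc

; True if the line is empty or any Tab-separated field is empty or "0".
Func _RowIsBad($sLine)
    Local $aFields = StringSplit($sLine, @TAB, $STR_NOCOUNT)
    For $sField In $aFields
        If $sField == '' Or $sField == '0' Then Return True
    Next
    Return False
EndFunc
```

For a file that starts with a title row and a column-name row, a call such as _DeleteBadRowsArray('OutputData.dat', 'OutputData_clean.dat', 2) would keep both header rows (the output file name is only an example).
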
SOLVE-SMART Posted March 16, 2023
Hi @SoftWearInGinEar, welcome to the forum 👋 .
These are the first five rows of your attached OutputData.dat file:
    Run 2023/03/09
    Column 1  Column 2  Column 3  Column 4  Column 5  Column 6
    7:14:34   1.05      250       25        0.011%    162.7
    7:15:11   1.29      250.1     24.98     0.011%    162.7
    7:15:47   1.53      250.2     24.96     0.011%    162.7
Is the first line intended, or did you add it manually? I ask because it's not a valid part of such a CSV format. Please keep this in mind when you start to try out a CSV UDF (like "parseCSV.au3" or others). Or just stick to the suggestions of the previous posters (@jchd and @Nine).
Best regards
Sven

ioa747 Posted March 16, 2023
How many blank lines does OutputData.dat have?

SOLVE-SMART Posted March 16, 2023
Just now, ioa747 said: "the OutputData.dat how many blank lines have it?"
For this specific case, only seven lines match, either on a zero (pattern: [TAB]0[TAB]) or on [TAB][LineEnd] 😉 .
I hope @SoftWearInGinEar will find a way to do this on his own. That's why I didn't come up with a concrete code snippet.
A small but valid RegEx pattern for the file would be: '(\t\t|\t$)'
Best regards
Sven

pixelsearch Posted March 16, 2023 (edited March 16, 2023)
1 hour ago, SOLVE-SMART said: "Is the first line intended or did you add this line manually? I ask because it's not a valid part of such CSV format."
Why is that? If I'm not mistaken, there are 5 Tabs in every line of OP's output, even though the 1st line has only its 1st column filled. The problem would be if some lines did not have 5 Tabs in them, but I didn't find any.
2 hours ago, SoftWearInGinEar said: "I would like this function to read through and delete the the rows with zeros and blank lines (...)"
I don't see any blank line in OP's output (i.e. totally empty), but maybe there could be (?)
The question is: what should be done with the 2nd line below? Will you delete it because the last column is empty? I guess you will, because you don't want any column to be empty (or 0):
2 hours ago, SoftWearInGinEar said: "Sometimes I get a zero and/or blank columns (see attached)."
but it's always good to have OP's confirmation.
Edit: and we should ask too, will you delete it if the value in the last column was... 0, as in this altered line?

ioa747 Posted March 16, 2023
12 minutes ago, SOLVE-SMART said: "For the specific case only seven lines"
In my opinion, the empty lines are 20, 76 and 96 (counting from 0).

SOLVE-SMART Posted March 16, 2023
10 minutes ago, pixelsearch said: "Why is that ? If not mistaken, there are 5 Tabs per line in any lines of OP's output, no matter the 1st line got only its 1st column filled. The problem would be if some lines had not 5"
Good catch @pixelsearch, thanks, I missed this. I saw the second line, which seems to be the CSV header, and immediately wondered about the first one 😅 .
My understanding is/was:
- when a line contains a zero as a column value, delete this line
- when not all columns of a line contain values, delete this line too
Anyway, you're absolutely right about your last question 👍 .
14 minutes ago, pixelsearch said: "[...] will you delete it if the value in last column was... 0, as in this added line ?"
Best regards
Sven

pixelsearch Posted March 16, 2023
You're welcome @SOLVE-SMART
imho, checking for 0 in any column seems a bit radical as a reason to eliminate a row, but surely OP knows better than us.
For example, if you look at column 4, its values go from 25 down to 20.78 in the last row, a slow decreasing process that could reach 0 in a few hours.
If we look at column 1 (which seems to indicate time), it increases from 7:14:34 to 9:59:21 in the last row. If this is a continuous process running 24 hours, then column 4 could show a value of 0 in a few hours... and the line concerned would be deleted when it shouldn't be. End of "script"
For the record, this post was just for fun, as I have no idea what any column means or how much time passes before the output is reset etc... today I'm in a funny mood, for a change!

SOLVE-SMART Posted March 16, 2023 (edited March 16, 2023)
Even though this might not be relevant for the author of the thread (we will see), I am struggling with the correct RegEx pattern. I built some test data lines to prove my RegEx approach, but it's not correct. I assume that the matches marked with red rectangles should not be matched, so how can I exclude them?
    (\t?0$|\t?0\t)
    7:32:13 7.36 0 0.011% 162.7 7:32:13 7.36 d 0 0.011% 0 0 7.36 d 0 0.011% 0 0 7.36 d 0 0.011% 55 0 51:0 7.36 d 0 0.011% 55 0 7:30:20 0 7.36 d 0 0.011% 55 0
I guess you, @pixelsearch, can help me out with it 😊 ?
Also for the record: if this shouldn't be part of the thread because it belongs in a separate one, then please excuse me. I didn't want to drift too far off topic.
Best regards
Sven

pixelsearch Posted March 16, 2023 (edited March 17, 2023)
I always wished we had one thread dedicated to RegEx questions, where everybody could ask a question and be answered in the same place. One day maybe... and I assure you it will be crowded.
Meanwhile, SOLVE-SMART, I'm struggling with the same issue as you because my RegEx knowledge is poor. For example, take this subject, closely related to yours: 6 lines, 5 Tabs per line (30 Tabs total):
    7:32:13  7.36  0     0.011%  162.7   0
    7:32:13  7.36  d     0       0.011%  0
    0        7.36  d     0       0.011%  0
    0        7.36  d     0       0.011%  55
    51:0     7.36  d     0       0.011%  0.001
    7:30:20  0     7.36  d       0       0
First of all, I'm not even sure Tabs survive being pasted into the forum's code box (aren't they transformed to spaces when other users copy and paste the code?), which makes it difficult for other users to test.
If I apply this pattern:
    (?m)(^0)\t
it returns 2 matches (the 2 0's at the beginning of lines, followed by a Tab), great!
    0
    0
If I apply that pattern:
    (?m)\t(0$)
it returns 4 matches (the 4 0's at the end of lines, preceded by a Tab), fantastic!
    0
    0
    0
    0
Why does it return 10 matches when I try (certainly wrongly) to combine both patterns?
    (?m)(^0)\t|\t(0$)
    Row 0 :
    Row 1 : Chr(48)
    Row 2 :
    Row 3 : Chr(48)
    Row 4 : Chr(48)
    Row 5 :
    Row 6 : Chr(48)
    Row 7 : Chr(48)
    Row 8 :
    Row 9 : Chr(48)
What should be changed in this last pattern to return only 6 matches (2 + 4)?
When this question is solved, we could add an alternation (OR, i.e. |) to retrieve the 0's in the middle of each line, when they're preceded by a Tab AND followed by a Tab, something like |\t0\t
Just my 2 poor cts...

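A possible explanation for the 10 entries, offered as a sketch rather than an authoritative answer: with the global-match flag, StringRegExp returns captured groups, not whole matches. Each hit of the second alternative fills group 2, so the unused group 1 comes back as an empty string in front of it. That gives 2 × 1 + 4 × 2 = 10 entries. One way to get 6 is a PCRE branch-reset group (?|...), which numbers the group in both alternatives as group 1. The test data below is rebuilt from the post, with "|" standing in for Tab.

```autoit
#include <StringConstants.au3>

; The 6-line subject from the post; "|" is replaced by @TAB afterwards.
Local $sData = StringReplace( _
        '7:32:13|7.36|0|0.011%|162.7|0' & @LF & _
        '7:32:13|7.36|d|0|0.011%|0' & @LF & _
        '0|7.36|d|0|0.011%|0' & @LF & _
        '0|7.36|d|0|0.011%|55' & @LF & _
        '51:0|7.36|d|0|0.011%|0.001' & @LF & _
        '7:30:20|0|7.36|d|0|0', '|', @TAB)

; Two separately numbered groups: an unset group 1 is returned as "".
Local $aCombined = StringRegExp($sData, '(?m)(^0)\t|\t(0$)', $STR_REGEXPARRAYGLOBALMATCH)

; Branch reset: both alternatives capture into group 1.
Local $aReset = StringRegExp($sData, '(?m)(?|(^0)\t|\t(0$))', $STR_REGEXPARRAYGLOBALMATCH)

ConsoleWrite('two groups:   ' & UBound($aCombined) & @CRLF) ; 10, as reported in the post
ConsoleWrite('branch reset: ' & UBound($aReset) & @CRLF)    ; 6
```

Dropping the capturing groups altogether, as the next post does, avoids the issue too, because StringRegExp then returns the whole matches.
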
pixelsearch Posted March 16, 2023 (edited March 16, 2023)
This should do it:
    (?m)^0\t|(?<=\t)0(?=\t)|\t0$
I tried it first without the positive lookahead / positive lookbehind, with this pattern:
    (?m)^0\t|\t0\t|\t0$
But it failed on the last line of the subject in my preceding post, where the last 2 columns are two 0's separated by a Tab character.
With this "wrong" pattern, \t0\t makes it difficult to grab the last 0 (though it's preceded by a Tab), because the previously grabbed 0 "ate" the Tab following it. The offset is then placed just before the last 0 and not before the last Tab, which is why the last 0 isn't grabbed.
The advantage of positive lookahead / positive lookbehind is this:
"They do not consume characters in the string, but only assert whether a match is possible or not." Jan Goyvaerts (Regular-Expressions)
Edit: @SOLVE-SMART did it solve your example too? Fingers crossed

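The difference between the two patterns can be reproduced on the last line of the test subject alone. This is only an illustrative sketch; the variable names are made up.

```autoit
#include <StringConstants.au3>

; Last line of the test subject: ... d <Tab> 0 <Tab> 0
Local $sLine = '7:30:20' & @TAB & '0' & @TAB & '7.36' & @TAB & 'd' & @TAB & '0' & @TAB & '0'

; \t0\t consumes the Tab after the 5th field, so \t0$ can no longer match.
Local $aConsuming = StringRegExp($sLine, '(?m)^0\t|\t0\t|\t0$', $STR_REGEXPARRAYGLOBALMATCH)

; The lookarounds only assert the Tabs, leaving them available for \t0$.
Local $aLookaround = StringRegExp($sLine, '(?m)^0\t|(?<=\t)0(?=\t)|\t0$', $STR_REGEXPARRAYGLOBALMATCH)

ConsoleWrite('consuming:  ' & UBound($aConsuming) & @CRLF)  ; 2
ConsoleWrite('lookaround: ' & UBound($aLookaround) & @CRLF) ; 3
```
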
SOLVE-SMART Posted March 16, 2023 (edited March 16, 2023)
Thank you very much @pixelsearch for your engagement 🤝 .
Unfortunately, I believe it's still not enough. But this depends entirely on the requirements of @SoftWearInGinEar. Please see the following test data and the matches of your RegEx patterns, plus one RegEx pattern adjusted by me.
💡 Please notice, I had to replace (?m) with (?:) in VSCode to get the RegEx pattern to work. But in the AutoIt code it doesn't matter => both variants lead to the same result.
[Screenshots in spoiler]
First screenshot, 8 matches => without the positive lookahead / positive lookbehind.
Second screenshot, 9 matches => with the positive lookahead / positive lookbehind.
Third screenshot, 13 matches => with an additional check for "double TAB".
I marked the missing cases with red rectangles. But this is only relevant when this assumption is true:
- when a line contains a zero as a column value, match and delete this line
- when not all columns of a line contain values (empty), match and delete this line too
I guess this is just for fun, because we are trying to do it in a robust way with several combinations and so on. For the specific file from the OP, the pattern (\t\t|\t$) is simply enough 😂 .
Best regards
Sven
Attachment: test-data.csv

SOLVE-SMART Posted March 16, 2023
This would be the final, but very ugly, RegEx pattern to match the assumed criteria (see post above) 😀 .
    (?m)^0\t|^\t|(?<=\t)0(?=\t)|\t0$|\t$|\t\t
Best regards
Sven

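As a sketch of how this pattern could be put to work, it can serve as a per-line test. The helper name _LineHasZeroOrBlank is hypothetical, and the function assumes it receives one line without its line break. A completely empty line, or a line consisting of a single 0 with no Tabs, is not covered by the pattern and would need a separate check.

```autoit
; Hypothetical helper: True if a Tab-separated line has an empty or "0" field,
; judged by the pattern from the post above.
Func _LineHasZeroOrBlank($sLine)
    Local Const $sPattern = '(?m)^0\t|^\t|(?<=\t)0(?=\t)|\t0$|\t$|\t\t'
    ; Without a flag, StringRegExp returns 1 on a match and 0 otherwise.
    Return StringRegExp($sLine, $sPattern) = 1
EndFunc

ConsoleWrite(_LineHasZeroOrBlank('7:14:34' & @TAB & '1.05' & @TAB & '250') & @CRLF) ; False
ConsoleWrite(_LineHasZeroOrBlank('7:14:34' & @TAB & '0' & @TAB & '250') & @CRLF)    ; True
ConsoleWrite(_LineHasZeroOrBlank('7:14:34' & @TAB & @TAB & '250') & @CRLF)          ; True
```
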
pixelsearch Posted March 16, 2023 (edited March 16, 2023)
1 hour ago, SOLVE-SMART said: "Unfortunately I believe, it's still not enough."
If I'm not mistaken, it is enough... for the subject you originally provided in this post:
    7:32:13 7.36 0 0.011% 162.7 7:32:13 7.36 d 0 0.011% 0 0 7.36 d 0 0.011% 0 0 7.36 d 0 0.011% 55 0 51:0 7.36 d 0 0.011% 55 0 7:30:20 0 7.36 d 0 0.011% 55 0
All 14 0's are now correctly retrieved when using the positive lookahead / positive lookbehind pattern:
    (?m)^0\t|(?<=\t)0(?=\t)|\t0$
Now, solving OP's issue certainly requires other patterns (e.g. empty lines, several Tabs in a row) and you provided the solution, bravo!
I'm very happy you started this conversation because it allowed us to discover the power of positive lookahead / positive lookbehind. Please just let me add a few notes from the author, which are interesting too:
"18. Testing The Same Part of a String for More Than One Requirement
Lookaround, which I [Jan Goyvaerts, the author] introduced in detail in the previous topic, is a very powerful concept. Unfortunately, it is often underused by people new to regular expressions, because lookaround is a bit confusing. The confusing part is that the lookaround is zero-width. So if you have a regex in which a lookahead is followed by another piece of regex, or a lookbehind is preceded by another piece of regex, then the regex will traverse part of the string twice."
If I'm not mistaken, that's exactly what we did, "traversing part of the string twice." For example, with the line that we worked on, which ends with these 4 characters: Tab 0 Tab 0
* The lookaround part checks for Tab 0 Tab, without consuming (eating) any Tab, and grabs the penultimate 0.
* Then \t0$ checks the last Tab again ("traversing part of the string twice") and grabs the last 0 preceded by a Tab.
So the last Tab has been checked twice and consumed only once, and now we know why.
To our expert gurus: please be kind enough to correct anything wrong or badly expressed in our last RegEx posts. Thanks!

SOLVE-SMART Posted March 16, 2023
3 minutes ago, pixelsearch said: "If I'm not mistaken, it is enough... for the subject you provided in the 1st place in this post"
You're right about this.
4 minutes ago, pixelsearch said: "Now, to solve OP's issue certainly requires other patterns (e.g empty lines, multiple followed Tabs for example) and you provided the solution, bravo"
Thanks. In case it is the solution which the OP is looking for 😂 ?!
5 minutes ago, pixelsearch said: "I'm very happy you started this conversation because it allowed us to discover the power of positive lookahead / positive lookbehind. Please just let me add a few notes from the author, which are interesting too : [...]"
I am happy about this too. Thanks for the additional notes and explanations, which are very educational 👍 .
7 minutes ago, pixelsearch said: "To our expert gurus: please be kind enough to correct anything wrong or badly expressed in our last RegEx posts"
Exactly, this would be great and possibly necessary 😅😇 ?!
Thanks again for the good insights @pixelsearch 🤝 .
Best regards
Sven

SoftWearInGinEar Posted March 17, 2023 (Author)
Hey everyone,
Sorry to get back so late, and thank you all for the help! I haven't finished reading all the other comments yet, but I feel I should first address the one about the six columns and the first column.
There are six columns, and the first was not put in manually. I am making a script that reads data off a program (let's call it program 'X' to keep it simple) and writes it to an output file. The function below reads the local time off my computer and the data off the GUI of X, and writes them to the file:
    Func WaitForStep($hX, $hFile)
        ;Do
        FileWriteLine($hFile, _
                @HOUR & ':' & @MIN & ':' & @SEC & @TAB & _
                Clock($hX) & @TAB & _
                ControlGetText($hX, '', "[NAME:textBox29]") & @TAB & _
                ControlGetText($hX, '', "[NAME:textBox20]") & @TAB & _
                ControlGetText($hX, '', "[NAME:textBox10]") & @TAB & _
                ControlGetText($hX, '', "[NAME:textBox32]"))
        ;$fSecsLast = $fSecs
        ;$fSecs = ControlGetText($hX, '', "[NAME:textBox83]")
        ; Until $fSecs == $fSecsLast
    EndFunc

SoftWearInGinEar Posted March 17, 2023 (Author)
20 hours ago, pixelsearch said: "[...] The question is : what should be done with the 2nd line below ? Will you delete it because the last column is empty ? I guess you will because you don't want any column to be empty (or 0) but it's always good to have OP's confirmation Edit: and we should ask too, will you delete it if the value in last column was... 0, as in this altered line ?"
Yes, the whole row should be deleted if the last column is 0 or blank. It should be deleted if any of the columns are 0 or blank.

SoftWearInGinEar Posted March 17, 2023 (Author)
15 hours ago, SOLVE-SMART said: "[...] But this is only relevant when this assumption is true: when a line contains a zero as column value, then match and delete this line when not all column values of a line containing values (empty), then match and delete this too [...]"
Yes, that is what I am looking for: delete the line if any of the columns are 0 or blank.