InclusiveExclusion Posted March 18, 2021

Hey guys,

I'm trying to understand repeating regex patterns. I know this example could easily be done with StringSplit, but it's a good one for addressing my long-time neglect of the repeating regex stuff. I've attached the data file so anyone who wants to help has exactly what I'm using.

What I'm trying to achieve is to turn this text file into a 1D array with each line as its own element (except the empty line between blocks). My code looks like this:

    #include <Array.au3>
    $a = FileRead(@ScriptDir & "\Versions.log")
    $regexp = "(.+)\r\n(.+)\r\n(.+)\r\n(.+)\r\n(.+)\r\n" ;this works
    ;~ $regexp = "(?:(.+)\r\n){1}" ;this also works and i have no idea how
    $a_regfind = StringRegExp($a,$regexp,3)
    _ArrayDisplay($a_regfind,"$a_registryfind")

The long regex at the top works fine. The second regex also works fine, but I thought that since there are 5 repeats of the pattern I should put 5 in there. That only returns the last line of each block. I changed it to a 1 and it works. I feel like an ape that just accidentally bumped a light switch on and screeched at the scary lightbulb 🦍

If anyone has time, would you please show me how to do this repeating regex properly?

Attachment: Versions.log

FrancescoDiMuro Posted March 18, 2021

@InclusiveExclusion
Since you are using a global match ($STR_REGEXPARRAYGLOBALMATCH = 3), your pattern is applied repeatedly across the whole file, so it would work even without the {1}. Look at this example:

    #include <Array.au3>
    #include <StringConstants.au3>

    Test()

    Func Test()
        Local $strFileName = @ScriptDir & "\Versions.log", _
                $strFileContent, _
                $arrResult

        $strFileContent = FileRead($strFileName)
        If @error Then Return ConsoleWrite("FileRead ERR: " & @error & @CRLF)

        $arrResult = StringRegExp($strFileContent, '(?m)^\s*([^\r\n]+)\s*$', $STR_REGEXPARRAYGLOBALMATCH)
        If IsArray($arrResult) Then _ArrayDisplay($arrResult)
    EndFunc

Splitting the pattern, you have this:

    ^ asserts position at the start of a line
    \s matches any whitespace character (equivalent to [\r\n\t\f\v ])
        * matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
    1st capturing group ([^\r\n]+)
        [^\r\n] matches a single character not present in the list below
            \r matches a carriage return (ASCII 13)
            \n matches a line-feed (newline) character (ASCII 10)
        + matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
    \s matches any whitespace character (equivalent to [\r\n\t\f\v ])
        * matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
    $ asserts position at the end of a line

    Global pattern flags
        (?m) modifier: multi-line. Causes ^ and $ to match the beginning/end of each line (not only the beginning/end of the string)

And since StringRegExp (SRE) is called with $STR_REGEXPARRAYGLOBALMATCH here too, the pattern is applied globally to the string. No matter how many lines the file has, every place where the pattern matches is returned by the function.
By the way, use this website to check out what's going on with your patterns.

Signature: ALWAYS GOOD TO READ: Forum Rules, Forum Etiquette

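To try the pattern above without the attachment, here is a minimal sketch on an in-memory string. The sample text is made up for illustration (the real Versions.log is not reproduced in this thread) and assumes CRLF line endings.

```autoit
#include <StringConstants.au3>

; Hypothetical stand-in for Versions.log: two blocks separated by an empty line.
Local $sSample = "alpha" & @CRLF & "beta" & @CRLF & @CRLF & "gamma" & @CRLF & "delta" & @CRLF

; No {n} quantifier anywhere: the global-match flag makes StringRegExp
; re-apply the pattern from the end of each match until the input is used up.
Local $aLines = StringRegExp($sSample, '(?m)^\s*([^\r\n]+)\s*$', $STR_REGEXPARRAYGLOBALMATCH)

If IsArray($aLines) Then
    For $i = 0 To UBound($aLines) - 1
        ConsoleWrite($i & ": " & $aLines[$i] & @CRLF)
    Next
EndIf
```

Each non-empty line should come back as its own element, and the empty separator line should not appear, because the capturing group requires at least one non-newline character.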
Nine Posted March 18, 2021

    $regexp = "([^\v]*)\v*"

Maybe?

Signature: "They did not know it was impossible, so they did it" - Mark Twain
Block all input without UAC; Save/Retrieve Images to/from Text; Monitor Management (VCP commands); Tool to search in text (au3) files; Date Range Picker; Virtual Desktop Manager; Sudoku Game 2020; Overlapped Named Pipe IPC; HotString 2.0 - Hot keys with string; x64 Bitwise Operations; Multi-keyboards HotKeySet; Recursive Array Display; Fast and simple WCD IPC; Multiple Folders Selector; Printer Manager; GIF Animation (cached); Screen Scraping; Multi-Threading Made Easy

FrancescoDiMuro Posted March 18, 2021

@Nine
Use + instead of * in the capturing group, otherwise the function also returns an empty element for the last blank line.

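A small sketch of the difference between the two quantifiers, again on made-up sample text rather than the real file. The variable names are only illustrative.

```autoit
#include <StringConstants.au3>

; Hypothetical sample: two short blocks, CRLF endings, trailing line break.
Local $sSample = "alpha" & @CRLF & "beta" & @CRLF & @CRLF & "gamma" & @CRLF & "delta" & @CRLF

; "*" allows the group to match zero characters, so an empty match
; is still possible after all the text has been consumed.
Local $aStar = StringRegExp($sSample, "([^\v]*)\v*", $STR_REGEXPARRAYGLOBALMATCH)

; "+" requires at least one non-newline character, so empty matches cannot occur.
Local $aPlus = StringRegExp($sSample, "([^\v]+)\v*", $STR_REGEXPARRAYGLOBALMATCH)

ConsoleWrite("with *: " & UBound($aStar) & " elements" & @CRLF)
ConsoleWrite("with +: " & UBound($aPlus) & " elements" & @CRLF)
```

The `*` version should report one element more than the `+` version: the extra one is the empty string matched at the very end of the input.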
Nine Posted March 18, 2021

Didn't see that last line...

InclusiveExclusion (Author) Posted March 18, 2021

Nice. Thanks for the detailed info, lads 👍

Musashi Posted March 18, 2021

5 minutes ago, Nine said:
    Didn't see that last line...

I guess @FrancescoDiMuro means this:

Signature: "In the beginning the Universe was created. This has made a lot of people very angry and been widely regarded as a bad move."

Nine Posted March 18, 2021

@Musashi Yes, I know. I just didn't bother to go all the way down...

mikell Posted March 18, 2021

    $regexp = "\N+"

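For readers unfamiliar with it: in PCRE, `\N` matches any single character that is not a newline, so `\N+` is simply "one run of non-newline characters". The sketch below, on hypothetical sample text, also relies on the fact that a pattern without capturing groups returns the full matches when the global flag is used.

```autoit
#include <StringConstants.au3>

; Hypothetical sample standing in for the attached file.
Local $sSample = "alpha" & @CRLF & "beta" & @CRLF & @CRLF & "gamma" & @CRLF & "delta" & @CRLF

; No capturing group in the pattern: each array element is one whole match.
Local $aLines = StringRegExp($sSample, "\N+", $STR_REGEXPARRAYGLOBALMATCH)

If IsArray($aLines) Then
    For $i = 0 To UBound($aLines) - 1
        ConsoleWrite("[" & $aLines[$i] & "]" & @CRLF)
    Next
EndIf
```

The brackets in the output make it easy to check that no line break characters ended up inside an element.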
FrancescoDiMuro Posted March 18, 2021

I knew I should have brought a dog here

mikell Posted March 18, 2021

7 minutes ago, FrancescoDiMuro said:
    I knew I should have brought a dog here

InclusiveExclusion (Author) Posted March 18, 2021

13 minutes ago, mikell said:
    $regexp = "\N+"

*drops mic* 😄

mikell Posted March 18, 2021

BTW, I suppose that when you said 'repeating regex patterns' you meant something like this:

    $regexp = "(.+)(?=\R){5}"

But the example you chose is not the best one for this, because in this case the 'repeating' is automatic (the global flag already re-applies the pattern), so such a syntax is useless.

Nine Posted March 18, 2021

31 minutes ago, FrancescoDiMuro said:
    I knew I should have brought a dog here

Or a mouse, he would have gone playing with it

FrancescoDiMuro Posted March 18, 2021

@Nine
Nice idea, but what do you think about this one?

Here @mikell, *smooch smooch smooch*
(Joke, you know :D)

mikell Posted March 18, 2021

A gaming mouse? Sounds much better

InclusiveExclusion (Author) Posted March 19, 2021

8 hours ago, mikell said:
    BTW I suppose that when saying 'repeating regex patterns' you meant something like this
        $regexp = "(.+)(?=\R){5}"
    But the example you chose is not the best one for this because in this case the 'repeating' feature is automatic so such a syntax is useless

That's what I was looking for. Just trying to understand repeating patterns.

seadoggie01 Posted March 19, 2021

    $regexp = "((?:.+\R){5})"

Something like this makes sense if you want to capture the whole paragraph at a time. The inner non-capturing group matches a whole line and its newline, and the outer group captures 5 of them at once.

Also, if you need help with regular expressions, RegEx101.com has nice explanations and a cool way to share a regex --> https://regex101.com/r/7Ma34B/1

Sorry for being a day late @FrancescoDiMuro, but I'll chase @mikell away!

Signature: All my code provided is Public Domain... but it may not work. Use it, change it, break it, whatever you want.
My Humble Contributions:
Personal Function Documentation - A personal HelpFile for your functions
Acro.au3 UDF - Automating Acrobat Pro
ToDo Finder - Find #ToDo: lines in your scripts
UI-SimpleWrappers UDF - Use UI Automation more Simply-er
KeePass UDF - Automate KeePass, a password manager
InputBoxes - Simple Input boxes for various variable types

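This also explains the behaviour described in the first post: when a capturing group sits inside a quantified group, each repetition overwrites the previous capture, so only the last repetition is returned. The sketch below contrasts the two placements of the parentheses. The sample data (two blocks of five lines) and all variable names are made up for illustration.

```autoit
#include <StringConstants.au3>

; Hypothetical data: two blocks of five lines each, separated by an empty line.
Local $sSample = ""
For $b = 1 To 2
    For $n = 1 To 5
        $sSample &= "block" & $b & "_line" & $n & @CRLF
    Next
    $sSample &= @CRLF
Next

; Capture INSIDE the repeated group: group 1 is overwritten on every
; repetition, so each match yields only the fifth line of its block.
Local $aLast = StringRegExp($sSample, "(?:(.+)\R){5}", $STR_REGEXPARRAYGLOBALMATCH)

; Capture AROUND the repeated group: each match yields the whole block as one string.
Local $aBlocks = StringRegExp($sSample, "((?:.+\R){5})", $STR_REGEXPARRAYGLOBALMATCH)

ConsoleWrite("inner capture, first match: " & $aLast[0] & @CRLF)
ConsoleWrite("outer capture, first match:" & @CRLF & $aBlocks[0])
```

With `{1}` instead of `{5}` every line is its own match, which is why that variant appeared to "work" in the original question.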
FrancescoDiMuro Posted March 19, 2021 (edited)

@seadoggie01
Thanks! Now I know who to call when we need to chase flavoured cats

Edited March 19, 2021 by FrancescoDiMuro

mikell Posted March 20, 2021

14 hours ago, seadoggie01 said:
    if you want to capture the whole paragraph at a time

Yes. But if the purpose is to get, for each paragraph, the lines in a subarray, then a second step with StringSplit is needed. StringRegExp with flag 4 could be used too, but in that case the 'long' syntax (the OP's one in post #1) is unavoidable:

    #include <Array.au3>
    $a = FileRead(@ScriptDir & "\Versions.log")
    $regexp = "(.+)\R(.+)\R(.+)\R(.+)\R(.+)\R"
    ; Flag 4 returns an array of arrays: one inner array per match
    ; (element 0 is the full match, elements 1-5 are the captured lines)
    $a_regfind = StringRegExp($a,$regexp,4)
    ; Loop over however many blocks were found instead of a fixed count
    For $i = 0 To UBound($a_regfind) - 1
        _ArrayDisplay($a_regfind[$i],"$a_registryfind")
    Next

Why so many dogs mentioned around here?

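The two-step alternative mentioned above (capture each paragraph, then split it) could look roughly like this. It is only a sketch: the sample data is hypothetical, and the split assumes CRLF line endings.

```autoit
#include <StringConstants.au3>

; Hypothetical data: two blocks of five lines each, separated by an empty line.
Local $sSample = ""
For $b = 1 To 2
    For $n = 1 To 5
        $sSample &= "block" & $b & "_line" & $n & @CRLF
    Next
    $sSample &= @CRLF
Next

; Step 1: one array element per five-line paragraph.
Local $aBlocks = StringRegExp($sSample, "((?:.+\R){5})", $STR_REGEXPARRAYGLOBALMATCH)

If IsArray($aBlocks) Then
    For $i = 0 To UBound($aBlocks) - 1
        ; Step 2: drop the trailing line break, then split on the complete CRLF
        ; sequence; $STR_NOCOUNT keeps the line count out of element 0.
        Local $sBlock = StringStripWS($aBlocks[$i], $STR_STRIPTRAILING)
        Local $aLines = StringSplit($sBlock, @CRLF, $STR_ENTIRESPLIT + $STR_NOCOUNT)
        ConsoleWrite("block " & $i & ": " & UBound($aLines) & " lines, first = " & $aLines[0] & @CRLF)
    Next
EndIf
```

Compared with flag 4, this keeps the short quantified pattern at the cost of a second pass over each paragraph.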