Guy_ Posted April 17, 2017

I'm trying to dive a bit deeper into back-references, but so far I have only used them successfully in StringRegExpReplace. I'm now wondering whether the following is conceptually possible (or what the best way is to do something similar from just one pattern).

I would have a user-defined regex pattern that neatly selects the ID and the product name, tagged as named groups <id> and <product>. Can the program capture just the group I want into a variable by appending something to the user's RegExp pattern? I was hoping to add a "forget everything, just give me this group", e.g. by appending '\K{id}', '\g{id}', etc. (I tried many variations).

    Local $sText, $sUserRegx, $a_IDs, $a_Products
    $sText = "ID:1000" & @CRLF & "Computer"
    $sUserRegx = "^ID:(?<id>\d{4})\v(?<product>.*)"

    ; pseudo code
    $a_IDs = StringRegExp( $sText, $sUserRegx & [GIVE ME <id> ONLY], 3)
    $a_Products = StringRegExp( $sText, $sUserRegx & [GIVE ME <product> ONLY], 3)

Thanks!
jchd Posted April 17, 2017

Naming the captured patterns doesn't help here.

    Local $sText, $sUserRegx, $aResult, $sID, $sProduct
    $sText = "ID:1000" & @CRLF & "Computer"
    $sUserRegx = "^ID:(\d+)\R(.+)"
    ; Flag 3 returns all captures in one array: [0] = ID, [1] = product
    $aResult = StringRegExp($sText, $sUserRegx, 3)
    $sID = $aResult[0]
    $sProduct = $aResult[1]
    MsgBox(0, "Captured fields", "ID = " & $sID & @CRLF & "Product = " & $sProduct)

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.
SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)
Guy_ Posted April 17, 2017 (Author)

As usual, the question can never be clear enough... Sorry.

The problem (I think) with your proposal is that when the product comes before the ID, the results are reversed. That is why I want the ability to use tags (so I could maybe avoid using two separate regexp patterns, or other shenanigans). I feel it would be an elegant solution.

Are you 99% sure that what I proposed is not possible? Can't you use something like "forget what went on before (like \K, which does keep named groups), just show me this group now" inside a StringRegExp?

Thanks
Guy_ Posted April 17, 2017 (Author)

I might be reasoning wrongly here... I'll try to wrap my head around my own ideas some more first, especially since by the time the program gets the data, it may already be split across two different array rows.

If I could capture the full thing in one row and then use the technique I first proposed, that would work for me. But I'm guessing that the groups I specifically want captured cannot be kept in one row...
iamtheky Posted April 17, 2017

    $sText = "Computer" & @CR & "ID:1000"
    ;~ $sText = "ID:1000" & @CR & "Computer"
    $sUserRegx = "(ID:\d+)"
    ; Both Assign() calls are joined with & so that Execute() receives a single valid expression
    $aResult = Execute('Assign("sID", StringTrimLeft(StringRegExp(StringStripCR($sText), $sUserRegx, 3)[0], 3)) & Assign("Product", StringRegExpReplace(StringStripCR($sText), $sUserRegx, ""))')
    MsgBox(0, "Captured fields", "ID = " & Eval("sID") & @CRLF & "Product = " & Eval("Product"))

If there is going to be a hint like "ID:" in the string, capture it. Then you have the data you need to find your first value AND to find the remainder, which is the other value. If the strings are going to be a mixture of designators, then you will probably be building a bunch of cases to handle them.
Guy_ Posted April 17, 2017 (Author)

Interesting code, iamtheky, but I'll have to brush up on my Chinese for that! I'm not sure it is what I'm asking.

The idea is that there might be hundreds of scenarios that each need their specific pattern. Also, sometimes the ID will not be there, so I cannot depend on an easy selection, and the program has to know which one was provided.

I just got a new idea that might work... Maybe I can manipulate the pattern from the first post so that I capture both bits in one row, and then apply my pattern to that row, maybe using:

    StringRegExpReplace( that whole row, my pattern, named group <ID>)
    StringRegExpReplace( that whole row, my pattern, named group <Product>)
Guy_ Posted April 17, 2017 (Author)

Yes, I think that idea of mine can work.

What I'll try is replacing both '?<id>' and '?<product>' with '?:' and putting parentheses around the whole pattern. I'll then have both bits in one row, to which I'll maybe add some surrounding line breaks. Then I'll use the idea from the post above (with the original, untouched version of the pattern).
jchd Posted April 17, 2017

53 minutes ago, Guy_ said:
"The idea is that there might be hundreds of scenarios that need their specific pattern. Also, sometimes, the ID will not be there, so I cannot depend on an easy selection and the program has to know what was the provided one. I just got a new idea that might work... Maybe I can manipulate the pattern of the first post so I do capture both bits in one row, and then apply my pattern on that row, but maybe using ..."

Hundreds of patterns can't be handled by just one, especially when you mention that some variable and unknown part of what you want to capture isn't there. How can "the program know what was the provided one"?

In the replace part of StringRegExpReplace, you can't refer to named captured patterns, only to capture numbers using $1, $2, $3, ... or \1, \2, \3, ... These are the only constructs available there.

I'm not saying that what you have to do isn't possible, regexp or not, but you ought to make explicit, in plain English pseudocode, what you have to do in every possible case.
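To make the point about numbered references concrete, here is a minimal sketch that pulls one capture out of a match by replacing the entire subject with `$1` or `$2`. The sample text and the wrapping `(?s)^.*? ... .*$` are assumptions chosen for illustration, not something taken from the posts above.

```autoit
; Sketch only: extract a single capture by replacing the whole subject
; with a numbered back-reference. Sample text and pattern are assumptions.
Local $sText = "ID:1000" & @CRLF & "Computer"

; (?s) lets . span line breaks, so the pattern consumes the entire subject
; and the replacement leaves nothing but the chosen group.
Local $sPattern = "(?s)^.*?ID:(\d+)\R(\V+).*$"

; $1 and $2 are numbered references; named ones are not available here.
Local $sID = StringRegExpReplace($sText, $sPattern, "$1")
Local $sProduct = StringRegExpReplace($sText, $sPattern, "$2")

ConsoleWrite("ID = " & $sID & @CRLF & "Product = " & $sProduct & @CRLF)
```

If the pattern does not match, StringRegExpReplace returns the subject unchanged, so a real script would probably want to check @extended (the number of replacements performed) right after each call.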
jguinch Posted April 17, 2017

You can use a look-ahead assertion to "revert" the capture order, for example:

    #include <Array.au3>

    ; Local $sText = "ID:1000" & @CRLF & "Computer"
    Local $sText = "Computer" & @CRLF & "ID:1000"
    Local $aItems = StringRegExp($sText, "(?|ID:(\d+)\R(\V+)|(?=\V+\RID:(\d+))(\V+))", 3)
    _ArrayDisplay($aItems)
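The pattern above relies on a branch reset group, `(?| ... )`, which restarts group numbering in each alternative, so the ID ends up in the first capture and the product in the second regardless of their order in the text. As a rough sketch, it could be wrapped like this; the function name `_GetIdAndProduct` is hypothetical and not part of any UDF.

```autoit
; Hypothetical helper (name invented for this sketch): returns a
; 2-element array [ID, product] whichever line comes first in $sText.
Func _GetIdAndProduct($sText)
    ; Branch reset: group numbers restart in each alternative.
    ; Alternative 1 handles "ID first"; alternative 2 peeks ahead for
    ; the ID with a look-ahead, then captures the product line.
    Local $sPattern = "(?|ID:(\d+)\R(\V+)|(?=\V+\RID:(\d+))(\V+))"
    Local $aItems = StringRegExp($sText, $sPattern, 1)
    If @error Then Return SetError(1, 0, 0) ; no match
    Return $aItems
EndFunc

Local $aA = _GetIdAndProduct("ID:1000" & @CRLF & "Computer")
Local $aB = _GetIdAndProduct("Computer" & @CRLF & "ID:1000")
ConsoleWrite($aA[0] & " / " & $aA[1] & @CRLF)
ConsoleWrite($aB[0] & " / " & $aB[1] & @CRLF)
```

Both calls should put the ID at index 0 and the product at index 1, which is the property that makes named groups unnecessary for this particular case.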
Guy_ Posted April 17, 2017 (Author)

"Hundreds of patterns can't be handled by just one..."
> I mean: imagine there are 1000 scenarios for, say, an ID and a product name on a page. Someone can write 1000 regex patterns describing how to capture one or both of those. I would love it if that could be done with just one pattern per scenario, and the program could easily figure out what is what (by the tags).

"In the replace part of StringRegExpReplace, you can't refer to named captured patterns, "
> Aaargh... So much info on named groups, and then you can't really use them in a replace...

I felt I was almost there with this solution, but couldn't get the back-references in the replacement to work, even though I see, for example (for Perl): "$+{name} inserts the capture in the replacement string." (www.rexegg.com/regex-capture.html)

(For AutoIt I believe we have to look at this one instead: http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html )

    $sText = "ID:1000" & @CRLF & "Computer"
    $sUserRegx = "^ID:(?<id>\d{4})\R(?<product>.*)"

    ; inside the code...
    $sUserRegx_TEMP = '(?:' & StringRegExpReplace( $sUserRegx, '\?<id>|\?<product>', '?:') & ')'
    $a_allBitsInOneRow = StringRegExp($sText, $sUserRegx_TEMP, 3)

    If Not @error Then
        $sSafeRow = @CRLF & $a_allBitsInOneRow[0] & @CRLF ; adding room in case the original pattern depends on it
        $i_ID = StringRegExpReplace($sSafeRow, $sUserRegx_TEMP, '$+{id}')
        $s_Product = StringRegExpReplace($sSafeRow, $sUserRegx_TEMP, '$+{product}')
        MsgBox(0,"", "Product: " & $s_Product & @CRLF & "ID: " & $i_ID)
    EndIf
Guy_ Posted April 17, 2017 (Author)

50 minutes ago, jguinch said:
"You can use a look-ahead assertion to "revert" the capture order, for example..."

Meaning: there is always a trick to turn group 1 into group 2, and vice versa? I kind of expected that, but figured it might get messy and less elegant (or be above my pay grade). I'll study your example, thank you.

Update: OK, what I think you are doing is combining two separate patterns into one, so you can always easily decide the order. That is still pretty good indeed. I may have to use that instead... It might be for the better in general. Thanks!

The drawback might be that patterns are going to get longer, because it seems to me you often need to define both objects to get one of them precisely. With that already in place, adding group names would take less space than sometimes doing more or less the same thing twice, but in reverse. But so be it for now.
jchd Posted April 17, 2017

1 hour ago, Guy_ said:
"So much info on named groups and then you can't really use them in a replace..."

There is no "replace" primitive in PCRE, only a match engine.

You can always use alternation to handle "this then that" or "that then this". You can get an array of 4 captures, with two of them being empty strings. Concatenating results 0 and 3 yields this; doing the same with results 1 and 2 yields that.
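A minimal sketch of that alternation idea follows, assuming the "ID first" alternative is written first so that the ID lands in captures 0 or 3 and the product in captures 1 or 2. The function name `_SplitEitherOrder` is hypothetical. The `ReDim` is a precaution: as far as I know, trailing groups that took no part in the match may be left out of the returned array, so the array is padded to four elements before concatenating.

```autoit
; Hypothetical helper (name invented for this sketch): plain alternation,
; then merge the captures that belong together.
Func _SplitEitherOrder($sText)
    ; Captures: [0] ID and [1] product when the ID comes first,
    ;           [2] product and [3] ID when the product comes first.
    Local $a = StringRegExp($sText, "ID:(\d+)\R(\V+)|(\V+)\RID:(\d+)", 1)
    If @error Then Return SetError(1, 0, 0) ; no match

    ; Pad to 4 elements in case unmatched trailing groups were omitted;
    ; existing values are kept and new elements are empty.
    ReDim $a[4]

    ; In each pair one side is empty, so concatenation picks the filled one.
    Local $aOut[2] = [$a[0] & $a[3], $a[1] & $a[2]]
    Return $aOut
EndFunc

Local $aRes = _SplitEitherOrder("Computer" & @CRLF & "ID:1000")
If Not @error Then ConsoleWrite("ID = " & $aRes[0] & ", Product = " & $aRes[1] & @CRLF)
```

Compared with the branch reset version, this needs the extra merging step, but it uses nothing beyond ordinary alternation.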
Guy_ Posted April 17, 2017 (Author)

1 hour ago, jchd said:
"There is no "replace" primitive in PCRE, only a match engine."

I will look at the basics again. I've been out of it for a while... But it does not seem unreasonable to ask: if you can use $1, $2, ... in StringRegExpReplace, what is really so different about doing it with a named group?

Anyway, I think I will be happy trying The Jguinch Method. But thank you, everyone!
jchd Posted April 17, 2017

Again, the replace machinery isn't part of the legacy PCRE library API. Supporting named groups and many more constructs would require the replacement code to work closely with the match process. PCRE by itself isn't that ambitious.

It is however quite possible to add support for fancy constructs to a "replace" piece of code, but that requires strong (and non-trivial) links between the matching code and the replacing code. Such a thing would be slightly easier to implement using PCRE2 (a more recent release of the PCRE library), but as far as AutoIt goes, don't expect that much any time soon.

Most of the time, named groups and named subroutines are only used within the matching pass.
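As an illustration of a named group being used inside the matching pass itself, the sketch below finds a doubled word by referring back to the group by name with `\k<w>`, which is PCRE's named back-reference syntax. The sample sentence and the group name `w` are made up for this example.

```autoit
; Sketch only: the named group <w> is referenced again inside the same
; pattern, which is where named groups are typically useful.
Local $sText = "this is is a test"

; \k<w> must match exactly the text that (?<w>...) captured.
Local $aDup = StringRegExp($sText, "\b(?<w>\w+)\s+\k<w>\b", 1)

If @error Then
    ConsoleWrite("No repeated word found" & @CRLF)
Else
    ConsoleWrite("Repeated word: " & $aDup[0] & @CRLF)
EndIf
```

The name only matters while the pattern is being matched; the returned array is still indexed by position, which is consistent with the replies above.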