DW1 (posted February 19, 2014)

I must have a misunderstanding on how lazy quantifiers work. My expected return from below would be:

    Categories.aspx?id=8d72025e-a23a-4c81-8174-d3fc4e1eb469

But I'm picking up almost the full string. I'm likely either using incorrect syntax or made a typo somewhere that I keep overlooking. Any help would be great.

    #include <Array.au3>

    Local $sString = '<TD colSpan=4><SPAN style="FONT-VARIANT: small-caps">Sub-Categories: [<A href="Categories.aspx?id=2b249c75-f666-424e-b555-bf9ee8f34152">Nope</A>] [<A href="Categories.aspx?id=584378e4-917a-4fce-a6ff-e75fb966e36f">Nay</A>] [<A href="Categories.aspx?id=042394b4-1c0e-42d9-a9ba-bca9ea1394b1">Nada</A>] [<A href="Categories.aspx?id=8d72025e-a23a-4c81-8174-d3fc4e1eb469">Yarp</A>] '
    Local $aTemp = StringRegExp($sString, 'href="(.*?)">Yarp', 3)
    _ArrayDisplay($aTemp)

Melba23 (Moderator, posted February 19, 2014)

DW1,

This works for me:

    Local $sString = '<TD colSpan=4><SPAN style="FONT-VARIANT: small-caps">Sub-Categories: [<A href="Categories.aspx?id=2b249c75-f666-424e-b555-bf9ee8f34152">Nope</A>][<A href="Categories.aspx?id=584378e4-917a-4fce-a6ff-e75fb966e36f">Nay</A>] [<A href="Categories.aspx?id=042394b4-1c0e-42d9-a9ba-bca9ea1394b1">Nada</A>] [<A href="Categories.aspx?id=8d72025e-a23a-4c81-8174-d3fc4e1eb469">Yarp</A>]'
    Local $sExtract = StringRegExpReplace($sString, '.*href="(.*)">Yarp.*', "$1")
    ConsoleWrite($sExtract & " - Extracted" & @CRLF)
    ConsoleWrite("Categories.aspx?id=8d72025e-a23a-4c81-8174-d3fc4e1eb469 - Required" & @CRLF)

M23

Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind.

DW1 (posted February 19, 2014)

Thank you. Any idea why my lazy quantifier is failing though?

Melba23 (Moderator, posted February 19, 2014)

DW1,

You need a guru for that - and I certainly do not qualify!

M23

DW1 (posted February 19, 2014)

You sell yourself short, sir!

I would have expected either of these to work:

    Local $aTemp = StringRegExp($sString, 'href="(.*?)">Yarp', 3)
    Local $aTemp = StringRegExp($sString, '(?U)href="(.*)">Yarp', 3)

but it seems that I cannot get it to return a lazy result, just the greedy result... I'm super confused about this, and am hoping somebody can teach me to fish here. I have workarounds, but more than anything, I'd like to clear up my own confusion, as I'm likely doing something wrong.

DXRW4E (posted February 19, 2014, edited)

    #include <Array.au3>

    Local $sString = '<TD colSpan=4><SPAN style="FONT-VARIANT: small-caps">Sub-Categories: [<A href="Categories.aspx?id=2b249c75-f666-424e-b555-bf9ee8f34152">Nope</A>] '
    $sString &= '[<A href="Categories.aspx?id=584378e4-917a-4fce-a6ff-e75fb966e36f">Nay</A>] [<A href="Categories.aspx?id=042394b4-1c0e-42d9-a9ba-bca9ea1394b1">Nada</A>] '
    $sString &= '[<A href="Categories.aspx?id=8d72025e-a23a-4c81-8174-d3fc4e1eb469">Yarp</A>] '
    Local $aTemp = StringRegExp($sString, 'href="([^"\r\n]*)">Yarp', 3)
    _ArrayDisplay($aTemp)
    ;or
    $aTemp = StringRegExp($sString, 'href="?([^"\r\n\>\<]*)"?>Yarp', 3)
    _ArrayDisplay($aTemp)

Ciao.

DW1 (posted February 19, 2014)

(Quoting DXRW4E's previous post in full.)

Thank you. More valid workarounds. I'm still trying to get somebody to teach me to fish here though on why the lazy quantifier isn't working the way I expect it to. I am open to it being user error, I just want to know what the error is.

DXRW4E (posted February 19, 2014, edited)

That works OK. It is the pattern that is not OK, not the RegExp engine.

    #include <Array.au3>

    Local $sString = '<TD colSpan=4><SPAN style="FONT-VARIANT: small-caps">Sub-Categories: [<A href="Categories.aspx?id=2b249c75-f666-424e-b555-bf9ee8f34152">Nope</A>] '
    $sString &= '[<A href="Categories.aspx?id=584378e4-917a-4fce-a6ff-e75fb966e36f">Nay</A>] [<A href="Categories.aspx?id=042394b4-1c0e-42d9-a9ba-bca9ea1394b1">Nada</A>] '
    $sString &= '[<A href="Categories.aspx?id=8d72025e-a23a-4c81-8174-d3fc4e1eb469">Yarp</A>] '
    Local $aTemp = StringRegExp($sString, 'href="(.*?)">Yarp', 3)
    _ArrayDisplay($aTemp)

So it finds the first href=", i.e. the one at the end of

    '<TD colSpan=4><SPAN style="FONT-VARIANT: small-caps">Sub-Categories: [<A href="

and does not stop until it reaches ">Yarp.

Ciao.

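Editor's note: the point above can be checked with a much shorter string. The sketch below is only an illustration; `$sSample` is made-up stand-in data, not the HTML from the thread. It shows that with a single ">Yarp in the input, the lazy and greedy patterns capture exactly the same text, because both matches begin at the first href=" and the quantifier only decides how far to the right the match extends.

```autoit
; Hypothetical shortened sample: three links, only the last one is "Yarp".
Local $sSample = '<A href="one">Nope</A> <A href="two">Nay</A> <A href="three">Yarp</A>'

; Lazy: starts at the FIRST href=" and grows to the right, one character
; at a time, until ">Yarp can follow.
Local $aLazy = StringRegExp($sSample, 'href="(.*?)">Yarp', 3)
ConsoleWrite("Lazy:   " & $aLazy[0] & @CRLF)

; Greedy: starts at the same href=", runs to the end of the line, then
; backtracks to the LAST ">Yarp. There is only one, so the result is identical.
Local $aGreedy = StringRegExp($sSample, 'href="(.*)">Yarp', 3)
ConsoleWrite("Greedy: " & $aGreedy[0] & @CRLF)

; Both lines print:  one">Nope</A> <A href="two">Nay</A> <A href="three
```

The quantifier never moves the starting point of the match to the right, which is why neither `.*?` nor `(?U)` returned only the last link.
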
Factfinder (posted February 19, 2014)

Your original script would work with a little change: instead of (.*?) use ([^>]*?), like this:

    $sString = '<TD colSpan=4><SPAN style="FONT-VARIANT: small-caps">Sub-Categories: [<A href="Categories.aspx?id=2b249c75-f666-424e-b555-bf9ee8f34152">Nope</A>] [<A href="Categories.aspx?id=584378e4-917a-4fce-a6ff-e75fb966e36f">Nay</A>] [<A href="Categories.aspx?id=042394b4-1c0e-42d9-a9ba-bca9ea1394b1">Nada</A>] [<A href="Categories.aspx?id=8d72025e-a23a-4c81-8174-d3fc4e1eb469">Yarp</A>] '
    $aTemp = StringRegExp($sString, 'href="([^>]*?)">Yarp', 1)
    If IsArray($aTemp) Then MsgBox(0, "", $aTemp[0])

DW1 (posted February 19, 2014)

(Quoting DXRW4E's previous post in full.)

I understand that, and I have workarounds, however this doesn't address my question, as a lazy quantifier should be returning as little as possible while still matching, yet I'm still seeing the same result as a greedy quantifier. That's what I'm hoping somebody can correct me on.

DW1 (posted February 19, 2014)

To clarify for anybody wondering what I'm on about... I have plenty of workarounds to accomplish my task. What I am asking is why the lazy quantifier is not working as I thought it did in the following:

    #include <Array.au3>

    Local $sString = '<TD colSpan=4><SPAN style="FONT-VARIANT: small-caps">Sub-Categories: [<A href="Categories.aspx?id=2b249c75-f666-424e-b555-bf9ee8f34152">Nope</A>] [<A href="Categories.aspx?id=584378e4-917a-4fce-a6ff-e75fb966e36f">Nay</A>] [<A href="Categories.aspx?id=042394b4-1c0e-42d9-a9ba-bca9ea1394b1">Nada</A>] [<A href="Categories.aspx?id=8d72025e-a23a-4c81-8174-d3fc4e1eb469">Yarp</A>] '
    Local $aTemp = StringRegExp($sString, 'href="(.*?)">Yarp', 3)
    _ArrayDisplay($aTemp)

Yes, this should match the entire string, however, I thought that adding the "?" after the quantifier "*" would make the match lazy, and grab as little as possible to match the expression. My question is, where is my syntax error or my misunderstanding. I understand how all of the workarounds are working. What I don't understand is why the lazy quantifier isn't working the way I thought it did. As I said previously, this is likely just a misunderstanding of mine, or a syntax error, but if somebody could answer how to get the lazy quantifier to work in this scenario, I'd appreciate it.

I would expect a greedy quantifier (as much as possible while still matching) to return as it is in my above script:

    Categories.aspx?id=2b249c75-f666-424e-b555-bf9ee8f34152">Nope</A>] [<A href="Categories.aspx?id=584378e4-917a-4fce-a6ff-e75fb966e36f">Nay</A>] [<A href="Categories.aspx?id=042394b4-1c0e-42d9-a9ba-bca9ea1394b1">Nada</A>] [<A href="Categories.aspx?id=8d72025e-a23a-4c81-8174-d3fc4e1eb469

I would expect a lazy quantifier (as little as possible while still matching) to return the following:

    Categories.aspx?id=8d72025e-a23a-4c81-8174-d3fc4e1eb469

DXRW4E (posted February 19, 2014, edited)

DW1 wrote: "I understand that, and I have workarounds, however this doesn't address my question, as a lazy quantifier should be returning as little as possible while still matching, yet I'm still seeing the same result as a greedy quantifier. That's what I'm hoping somebody can correct me on."

Yes, right, but the '">Yarp' it finds is already the first match, so everything is OK. Try this, where the first link is also "Yarp":

    #include <Array.au3>

    Local $sString = '<TD colSpan=4><SPAN style="FONT-VARIANT: small-caps">Sub-Categories: [<A href="Categories.aspx?id=2b249c75-f666-424e-b555-bf9ee8f34152">Yarp</A>] '
    $sString &= '[<A href="Categories.aspx?id=584378e4-917a-4fce-a6ff-e75fb966e36f">Nay</A>] [<A href="Categories.aspx?id=042394b4-1c0e-42d9-a9ba-bca9ea1394b1">Nada</A>] '
    $sString &= '[<A href="Categories.aspx?id=8d72025e-a23a-4c81-8174-d3fc4e1eb469">Yarp</A>] '
    Local $aTemp = StringRegExp($sString, 'href="(.*?)">Yarp', 3)
    _ArrayDisplay($aTemp)

Or tell RegExp to find the last 'href=':

    #include <Array.au3>

    Local $sString = '<TD colSpan=4><SPAN style="FONT-VARIANT: small-caps">Sub-Categories: [<A href="Categories.aspx?id=2b249c75-f666-424e-b555-bf9ee8f34152">Nope</A>] '
    $sString &= '[<A href="Categories.aspx?id=584378e4-917a-4fce-a6ff-e75fb966e36f">Nay</A>] [<A href="Categories.aspx?id=042394b4-1c0e-42d9-a9ba-bca9ea1394b1">Nada</A>] '
    $sString &= '[<A href="Categories.aspx?id=8d72025e-a23a-4c81-8174-d3fc4e1eb469">Yarp</A>] '
    Local $aTemp = StringRegExp($sString, '.*href="(.*?)">Yarp', 3)
    _ArrayDisplay($aTemp)

Ciao.

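Editor's note: the "find the last href=" idea relies on a leading greedy `.*`, which first swallows the whole line and then gives characters back until the rest of the pattern can match. A minimal sketch on made-up data (`$sSample` is a hypothetical stand-in for the thread's HTML) might look like this:

```autoit
; Hypothetical shortened sample: three links, only the last one is "Yarp".
Local $sSample = '<A href="one">Nope</A> <A href="two">Nay</A> <A href="three">Yarp</A>'

; The leading greedy .* consumes the line, then backtracks to the LAST
; position where href="(.*?)">Yarp can still match.
Local $aLast = StringRegExp($sSample, '.*href="(.*?)">Yarp', 3)
ConsoleWrite($aLast[0] & @CRLF) ; prints: three
```

One caveat worth keeping in mind: by default `.` does not match newline characters, so this only looks back within a single line unless the `(?s)` option is added.
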
Factfinder (posted February 19, 2014, edited; marked as Solution)

The script I suggested is not a workaround. It is the correct script. Your script doesn't work because ? applies forward, to the string coming after href=" up to the first ">Yarp. So if you had a second ">Yarp in the string, the ? would make it match only up to the first ">Yarp. As DXRW4E mentioned, your script starts at the first href=" and ends at the first ">Yarp because ? doesn't work backwards. To eliminate all the href=" in the string except the one preceding ">Yarp, you should use the script I suggested.

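Editor's note: the three behaviours described above can be compared side by side when the input contains two ">Yarp endings. This is a sketch on hypothetical data; `$sTwo` and the helper `ShowMatches` are made up for illustration and do not appear in the thread.

```autoit
; Hypothetical sample with TWO "Yarp" links (first and last).
Local $sTwo = '<A href="one">Yarp</A> <A href="two">Nay</A> <A href="three">Yarp</A>'

; Greedy: from the first href=" to the LAST ">Yarp, a single match.
ShowMatches("greedy", StringRegExp($sTwo, 'href="(.*)">Yarp', 3))
; greedy[0] = one">Yarp</A> <A href="two">Nay</A> <A href="three

; Lazy: each match ends at the FIRST ">Yarp to the right of its start,
; but the second match still starts at href="two and runs forward.
ShowMatches("lazy", StringRegExp($sTwo, 'href="(.*?)">Yarp', 3))
; lazy[0] = one
; lazy[1] = two">Nay</A> <A href="three

; Negated class: the capture cannot cross a ">", so the attempt starting
; at href="two fails and the engine moves on to href="three.
ShowMatches("class", StringRegExp($sTwo, 'href="([^>]*?)">Yarp', 3))
; class[0] = one
; class[1] = three

; Helper (hypothetical): print every element of a match array.
; If there was no match, UBound returns 0 and nothing is printed.
Func ShowMatches($sLabel, $aMatches)
    For $i = 0 To UBound($aMatches) - 1
        ConsoleWrite($sLabel & "[" & $i & "] = " & $aMatches[$i] & @CRLF)
    Next
EndFunc
```

With the negated class the lazy `?` probably makes no difference to the result, since the capture is already bounded by the first `>`; it is kept here only to mirror the suggested pattern.
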
DW1 (posted February 19, 2014, edited)

Factfinder wrote: "because ? doesn't work backwards"

This is what you and DXRW4E have both been pointing out to me, which is now clear. I wasn't understanding why the capturing group was not lazy, because I wasn't putting together the fact that the "?" doesn't work backwards. This makes perfect sense to me now, thank you both!

EDIT: Marking Factfinder's post as the solution, however DXRW4E, I understand you were pointing out the same thing to me, I just didn't get it until his post spelled it out for me. Thank you both!