SmOke_N (Moderators), posted October 2, 2006 (edited)

OK, I had to download the exe and look at PCRE's regexps here: http://perldoc.perl.org/perlre.html#Regular-Expressions

But this worked:

    $a = StringRegExp('blah ${1} blah', "\$\{{0,1}[0-9]+\}{0,1}", 1, 1)
    If IsArray($a) Then MsgBox(0, 'info', $a[0])

Edit: Oops, thomasl was a bit fast for me... and he used \d (I was just happy to see the above work).

Edited October 2, 2006 by SmOke_N

-- Common sense plays a role in the basics of understanding AutoIt... If you're lacking in that, do us all a favor, and step away from the computer.
Jon (Administrators), posted October 2, 2006

Thanks, the pattern worked great. But I may have to parse the replacement string manually, as it won't let me escape it so that the replacement text is the literal text "$1" rather than a group reference. Hard to explain. I think I need to support the \1 \2 convention as well?

-- Deployment Blog: https://www.autoitconsulting.com/site/blog/ | SCCM SDK Programming: https://www.autoitconsulting.com/site/sccm-sdk/
thomasl, posted October 2, 2006

> But I may have to manually parse the replacement string as it won't let me escape it so that the replace text is the literal text "$1" rather than a reference. Hard to explain.

Well, convention says that $1, $2, ... are each replaced by their respective group. If there is no valid group, the replacement is empty. So if you want a literal $1 in the replacement, you'd write something like \$1: the \ escapes the $. So if you initially search for something like

    "(\\{0,1}\$\{{0,1}(\d+)\}{0,1})"

you'd get a group starting either with \ (-> literal ${...}) or with $ (-> replacement ${...}).

Hm... perhaps it's better to parse that manually.

> I think I need to support \1 \2 convention as well?

Depends on whom you ask. I am much more used to the $1 syntax (and found the StringRegExpReplace() syntax a bit strange), but there's a sizeable minority out there who use \1.

Given that the current StringRegExpReplace() uses \1, why not stick with it?
Jon (Administrators), posted October 2, 2006

New version: http://www.autoitscript.com/autoit3/files/...utoIt3-pcre.exe

I've done StringRegExpReplace, so test it out. I've also removed the full-match return value from the array, as requested.

The regexp replacement text can use $0 or ${1}; \1 and \2 also work. A \ must be escaped as \\. To get a real $ you must use \$.

Give it a test and let me know how it works out.
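For readers who want to see the replacement rules above in action, here is a minimal sketch in Python. It is not the AutoIt implementation: `expand_replacement` and `_TOKEN` are hypothetical names, Python's `re` module stands in for PCRE, and the choice to read only a single digit after `$` or `\` is an assumption made for the sketch.

```python
import re

# Hypothetical tokeniser for a replacement template. The alternatives are
# tried left to right, so the escapes win over the group references.
_TOKEN = re.compile(
    r"\\(?P<esc>[\\$])"        # \\ or \$  -> literal backslash or dollar sign
    r"|\\(?P<bs>\d)"           # \1        -> group reference
    r"|\$\{(?P<braced>\d+)\}"  # ${1}      -> group reference
    r"|\$(?P<plain>\d)"        # $1 or $0  -> group reference (one digit, assumed)
)


def expand_replacement(template, match):
    """Expand group references in `template` using a finished `match`."""

    def expand_token(tok):
        if tok.group("esc") is not None:
            return tok.group("esc")
        number = int(tok.group("bs") or tok.group("braced") or tok.group("plain"))
        if number > match.re.groups:
            # No such group: the reference expands to nothing.
            return ""
        # A group that took no part in the match also expands to nothing.
        return match.group(number) or ""

    return _TOKEN.sub(expand_token, template)


m = re.search(r"(\w+)@(\w+)", "mail bob@example now")
print(expand_replacement(r"$2/${1}/\1/\$1/\\", m))
print(expand_replacement("[$0]", m))
```

Handling the escapes and the references in a single left-to-right pass avoids the problem Jon describes, where an escaped `\$1` would otherwise be picked up again as a group reference.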
thomasl, posted October 2, 2006 (edited)

> New version: [...] Give it a test and let me know how it works out.

It all works out very well. The bugs I reported against the "old" RE version (and a few I didn't report) are gone.

I have also done some very preliminary timing comparisons with some REALLY long strings (up to 2048 KB), as I did for the old version, and the PCRE library plus your replace code looks pretty good in this respect as well. Sometimes AU3 is a bit faster than Perl, sometimes a bit slower... but it's now very much in the same league, not a factor of 20, 40, or even 100 slower, as it used to be.

Very nice. Now I can scrap my Perl RE library. Well, it was a pre-release anyway.

EDIT: PCRE uses slightly different definitions for its character classes and assertions (\b, \d, \w, etc.). Anyone who is translating "old style" REs to the new engine should check whether the classes are the same. I have run into some small differences that can wreck an otherwise working pattern. For instance, \w in PCRE includes 0..9.

Edited October 2, 2006 by thomasl
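The point about \w is easy to check. The short sketch below uses Python's `re` module as a stand-in for PCRE (an assumption; the two engines agree on this detail for plain ASCII input), and the sample string is made up for illustration.

```python
import re

sample = "abc_123 def"

# \w covers letters, digits and the underscore, so "abc_123" is one run.
word_runs = re.findall(r"\w+", sample)

# An explicit class keeps digits and "_" out when only letters are wanted.
letter_runs = re.findall(r"[A-Za-z]+", sample)

print(word_runs)
print(letter_runs)
```

When porting an old pattern, spelling out the class explicitly is the safer choice wherever the old engine's definition of \w is in doubt.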
Jon (Administrators), posted October 2, 2006

I've done flag 3 (global) in StringRegExp.

http://www.autoitscript.com/autoit3/files/...utoIt3-pcre.exe

Edit: That should be all the existing functionality. If it tests OK I'll switch over to this code for the next beta and then delete all the bug reports.
jftuga, posted October 2, 2006

I just want to say that having PCRE support built into AU3 will be fantastic. I bet a lot of users will enjoy having this functionality.

-John

-- Admin_Popup, show computer info or launch shell | Remote Manager, facilitates connecting to RDP / VNC | Proc_Watch, reprioritize cpu intensive processes | UDF: _ini_to_dict, transforms ini file entries into variables | UDF: monitor_resolutions, returns resolutions of multiple monitors | Report Computer Problem, for your IT help desk | Profile Fixer, fixes a 'missing' AD user profile
steve8tch, posted October 2, 2006

Do you want to try this under new vs. old?

    $str = "abcd"
    $ptn = "(.*)"
    $msg = ""
    $aResult = StringRegExp($str, $ptn, 3)
    For $i = 0 To UBound($aResult) - 1
        $msg &= $aResult[$i] & @CRLF
    Next
    MsgBox(0, "Result", $msg)

On my PC (XP SP2) the new version never completes. It just eats up memory until the application fails due to lack of memory.

I have left out the check for @extended, because at the moment it always goes to "0".
Jon (Administrators), posted October 2, 2006

Fixed: http://www.autoitscript.com/autoit3/files/...utoIt3-pcre.exe
steve8tch, posted October 2, 2006

That was quick work, but...

    $str = "abcd"
    $ptn = "(.*?)"
    $msg = ""
    $aResult = StringRegExp($str, $ptn, 3)
    For $i = 0 To UBound($aResult) - 1
        $msg &= $aResult[$i] & @CRLF
    Next
    MsgBox(0, "Result", $msg)

This pattern has a similar problem. Sorry...
Jon (Administrators), posted October 2, 2006

> That was quick work, but... [...] $ptn = "(.*?)" [...] This pattern has similar problem. Sorry...

What should this return? The PCRE library is giving a really odd result back: it seems to be saying that there was a match of zero length, and then it gets stuck in a loop because the position never advances.
Valik, posted October 2, 2006

I think that it should return the entire string, but my expectation could be wrong.
Jon (Administrators), posted October 2, 2006

If I run the expression in a test exe that comes with the PCRE library in global mode, it gives:

    (blank string)
    a
    (blank string)
    b
    (blank string)
    c
    (blank string)
    d
    (blank string)

I looked at the source of the test exe and there is this note:

    /* If we have matched an empty string, first check to see if we are at the end of the subject. If so, the /g loop is over. Otherwise, mimic what Perl's /g options does. This turns out to be rather cunning. First we set PCRE_NOTEMPTY and PCRE_ANCHORED and try the match again at the same point. If this fails (picked up above) we advance to the next character. */

I think this might be related, as the match is indeed coming back as totally empty.

Edit: Updated the test exe results; there is actually a blank in between each match.
Jon (Administrators), posted October 2, 2006

More:

PCRE_NOTEMPTY

An empty string is not considered to be a valid match if this option is set. If there are alternatives in the pattern, they are tried. If all the alternatives match the empty string, the entire match fails. For example, if the pattern

    a?b?

is applied to a string not beginning with "a" or "b", it matches the empty string at the start of the subject. With PCRE_NOTEMPTY set, this match is not valid, so PCRE searches further into the string for occurrences of "a" or "b".

Perl has no direct equivalent of PCRE_NOTEMPTY, but it does make a special case of a pattern match of the empty string within its split() function, and when using the /g modifier. It is possible to emulate Perl's behaviour after matching a null string by first trying the match again at the same offset with PCRE_NOTEMPTY set, and then if that fails by advancing the starting offset (see below) and trying an ordinary match again.
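The hang reported earlier in the thread comes down to a global loop that restarts at the same offset after a zero-length match. Below is a minimal Python sketch of a loop that cannot stall. It deliberately uses the simpler rule of stepping one character past every empty match, not the PCRE_NOTEMPTY plus PCRE_ANCHORED retry quoted above (Python's `re` has no equivalent of those flags), so for a lazy pattern such as (.*?) it reports only empty strings rather than the interleaved result pcretest shows. `global_matches` is a hypothetical name.

```python
import re


def global_matches(pattern, subject):
    """Collect all matches left to right without stalling on an empty match."""
    regex = re.compile(pattern)
    results = []
    pos = 0
    while pos <= len(subject):
        m = regex.search(subject, pos)
        if m is None:
            break
        results.append(m.group(0))
        if m.end() == m.start():
            # Empty match: step one character past it so the next search
            # cannot return the same empty match again.
            pos = m.end() + 1
        else:
            pos = m.end()
    return results


print(global_matches(r"(.*)", "abcd"))
print(global_matches(r"a?b?", "xab"))
```

For (.*) on "abcd" this yields the full string followed by one empty match at the end of the subject, the same shape of result Jon reports from the PCRE test program.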
Valik, posted October 2, 2006

For what it's worth, this is the equivalent in Lua:

    from, to, data = string.find("abc", "(.+)")
    print(data)

And it produces "abc".
Jon (Administrators), posted October 2, 2006

Updated: http://www.autoitscript.com/autoit3/files/...utoIt3-pcre.exe

I've made it work like the PCRE test exe, in that when a global operation is done, blank strings are matched, so it gives the odd result in the post above. Also, odd things like doing a _global_ match on (.*) for "abcd" give a match of:

    abcd
    (blank string)

This is also the same result as I'm getting from the PCRE test exe, which I assume is the same as Perl.

I believe that if I just turn on the option to ignore blank string matches then the results will be more like the predictions, but I'm not sure if that messes up some other elements of compatibility with Perl. (Nutster's implementation of the (.*?) pattern actually returned nothing at all, which I guess meant it came across the blank string at the start and barfed.)
Jon (Administrators), posted October 2, 2006

For reference, the pcretest file is at http://www.autoitscript.com/autoit3/files/...it/pcretest.exe

    re> /(.*?)/g
    data> abcd
Valik, posted October 3, 2006 (edited)

I found what I think might be an issue. Here is the code:

    Main()

    Func Main()
        Local $s = "Unique" & @CRLF & "Foo" & @CRLF & "Foo"
        Local $p = "Unique\s*(?:(Foo)\s*)*"
        Local $a = StringRegExp($s, $p, 3)
        ConsoleWrite("Matches: " & UBound($a) & @CRLF)
        For $i = 0 To UBound($a) - 1
            ConsoleWrite($a[$i] & @CRLF)
        Next
    EndFunc ; Main()

Here is the output:

    Matches: 1
    Foo

That's not what I expected. I expected:

    Matches: 2
    Foo
    Foo

The old StringRegExp() returns what I expected; the new one doesn't. The example is supposed to look for a string starting with the text "Unique", optionally followed by whitespace (CRLF). If it finds that, then it's supposed to look for the string "Foo", optionally followed by whitespace (CRLF in the example). If it finds that, then it captures the text "Foo" (the non-capturing group is used to be able to test for the trailing whitespace but not capture it). With the sample string, it should find both instances of "Foo", since it's supposed to keep repeating the "(?:(Foo)\s*)*" part of the pattern.

Edit: I see in the Perl documentation something about //s and //m, and how //s means treat things as a single text block and //m means it's multiple lines, and that //m is the default. I don't know if there is a way for me to change to the //s mode, but that's what I need to be in for that pattern to match correctly.

Edit2: I found the options (?s) and (?m) in the PCRE documentation, which allow me to set those two flags I mentioned in my last edit. I still can't seem to get the pattern working how I want, though.

Edited October 3, 2006 by Valik
Valik, posted October 3, 2006

I don't understand these Perl expressions. I don't understand why the following pattern doesn't work like I expect:

    Pattern: "(Foo)*"
    String: "FooFoo"

That only matches one "Foo" and I expect 2. I even tried the test application and it didn't work like I thought, either:

    re> /(Foo)*/
    data> FooFoo
     0: FooFoo
     1: Foo
    data>
    re> /(Foo)*/g
    data> FooFoo
     0: FooFoo
     1: Foo
     0:

As buggy as David's implementation was, at least the simple patterns I expect to work... do. Is PCRE really just broken, or am I missing something completely obvious?
sohfeyr (Author), posted October 3, 2006

> I don't understand these Perl expressions. I don't understand why the following pattern doesn't work like I expect: ... As buggy as David's implementation was, at least simple patterns I expect to work... do.

If you want it to return two captures of "Foo", try just /(Foo)/, or even /Foo/ or Foo if the syntax will support it.

You know what I'd really, really like to see support for in this implementation? Named groups. Those are SO much easier to remember and manage than $1 or \1 style backreferences. I haven't tried your code yet (you wouldn't believe how busy I am these days), but I've noticed that all the grouping posted here uses numbered backreferences instead of named ones.

Another nice resource: Online .Net RegEx Tester

I know you aren't really trying to approximate .Net, but it's a handy way to do a quick, free regexp logic test. Someone else may have one that's specific to Perl.

-- Mine: Time Functions - Manipulate the system clock! | WinControlList (WinGetClassList++) | .Net Setup Wrapper, Detect or install .Net | Writing and using a VB .NET COM object in AutoIt
-- Not mine, but highly recommended: AutoItTreeViewExtension plugin | Menu code | Callback helper dll | Auto3Lib - Control the uncontrollable | Creating COM objects in AutoIt | Using .Net framework classes in AutoIt
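The behaviour Valik ran into is standard for Perl-style engines: a capture group under a quantifier keeps only the text of its last iteration, so (Foo)* on "FooFoo" is one match with one captured "Foo". The sketch below shows this, together with the unstarred alternative and a two-step workaround for the "Unique" example. It uses Python's `re` module as a stand-in for PCRE, which is an assumption; the variable names and the two-step approach are illustrative and not something proposed in the thread. The last lines show a named group in Python's (?P<name>...) spelling, which PCRE, as far as I know, also accepts.

```python
import re

subject = "FooFoo"

# One match attempt: the starred group runs twice, but group 1 keeps only
# the text from its final iteration.
single = re.match(r"(Foo)*", subject)
whole_match = single.group(0)
last_iteration = single.group(1)

# Without the star, each "Foo" is a separate match found by a global scan.
each_foo = re.findall(r"(Foo)", subject)

# Two-step workaround for the "Unique" example: match the whole block
# first, then scan inside it for every "Foo".
text = "Unique\r\nFoo\r\nFoo"
block = re.match(r"Unique\s*(?:Foo\s*)*", text).group(0)
foos = re.findall(r"Foo", block)

# Named group: referenced by name instead of by number.
named = re.search(r"(?P<word>Foo)", subject)

print(whole_match, last_iteration)
print(each_foo)
print(foos)
print(named.group("word"))
```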