martijn, posted September 27, 2006

Quote: I really don't recommend that until we actually decide we want to go this route. Testing is okay but don't write mission-critical applications with any test executables because nothing is final yet.

No problemo.
sohfeyr (thread author), posted September 28, 2006

Quote: It would be very difficult but it basically requires dropping some supported operating systems or writing a ton of code that Windows already implements for us if we do want to support them... Then there is porting the existing code to use WCHAR instead of CHAR. That is probably about as much effort as writing all the wrappers.

As I said, not something I expect to see any time soon. Nice to have the description of the process involved, though.

Mine: Time Functions - Manipulate the system clock! | WinControlList (WinGetClassList++) | .Net Setup Wrapper, Detect or install .Net | Writing and using a VB .NET COM object in AutoIt
Not mine, but highly recommended: AutoItTreeViewExtension plugin | Menu code | Callback helper dll | Auto3Lib - Control the uncontrollable | Creating COM objects in AutoIt | Using .Net framework classes in AutoIt
trids, posted September 28, 2006

Just stumbled on this thread and wanted to add my vote of support.

Also, following some links that thomasl included with his PCRE wrapper (in another thread), I came across the following pages, which offer an excellent introduction to regexps. For those who need a quick introduction:

http://perldoc.perl.org/perlrequick.html
http://perldoc.perl.org/perlretut.html

They also include some examples that might prove useful for testing the AU3 implementation, as they spell out the results and subtleties for various expressions and features.

HTH
Jon (Administrator), posted October 1, 2006

Test AutoIt Exe: http://www.autoitscript.com/autoit3/files/...utoIt3-pcre.exe

    ///////////////////////////////////////////////////////////////////////////////
    //
    // $val = StringRegExp("string", "pattern", [flag, [offset]])
    //
    // Perform regular expression matching on the given string.
    //
    // flags:
    // 0(default) - returns 1 (matched) or 0 (no match)
    // 1 - return array of matches
    //
    // When flag = 1:
    // Returns an array.
    // @Error = 0. Array is valid. Check @Extended for next offset
    // @Error = 1. Array is invalid. No matches.
    // @Error = 2. Bad pattern, array is invalid. @Extended = offset of error in pattern.
    //
    ///////////////////////////////////////////////////////////////////////////////

Based on the php preg_match function (which seems to return the entire match followed by the matching substrings). I haven't done a global version yet because I don't know if this is working correctly (half the patterns I try don't work, but I don't know if they should work or if it is broken...) and I also have no idea how to return a global selection of data that would be meaningful. It's very hard to implement regexp code when you barely understand them, so help testing would be great.

Here is code using the offset parameter to perform a manual global match.

    $nOffset = 1
    While 1
        $array = StringRegExp('<test>a</test> <test>b</test> <test>c</Test>', '<(?i)test>(.*?)</(?i)test>', 1, $nOffset)
        If @error = 0 Then
            $nOffset = @extended
        Else
            ExitLoop
        EndIf
        For $i = 0 To UBound($array) - 1
            MsgBox(0, $i, $array[$i])
        Next
    WEnd

Deployment Blog: https://www.autoitconsulting.com/site/blog/
SCCM SDK Programming: https://www.autoitconsulting.com/site/sccm-sdk/
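
Editor's example: a minimal sketch of how a script might tell the three outcomes apart, assuming the @error / @extended contract quoted in the comment block above. `_DescribeMatch` is a hypothetical helper name made up for this illustration, not an AutoIt built-in, and the sample strings are invented.

```autoit
; Sketch only: _DescribeMatch is a hypothetical helper, not an AutoIt built-in.
; It assumes the @error / @extended contract described in the post above.
Func _DescribeMatch($sText, $sPattern)
    Local $aMatch = StringRegExp($sText, $sPattern, 1)
    ; Save both macros straight away; the next function call resets them.
    Local $iError = @error, $iExtended = @extended
    If $iError = 2 Then
        Return "Bad pattern near offset " & $iExtended
    ElseIf $iError = 1 Then
        Return "No match"
    EndIf
    Return "Matched, " & UBound($aMatch) & " element(s), next offset " & $iExtended
EndFunc

ConsoleWrite(_DescribeMatch("abcdef", "(ab)(cd)") & @CRLF) ; valid pattern that matches
ConsoleWrite(_DescribeMatch("abcdef", "xyz") & @CRLF)      ; valid pattern, no match
ConsoleWrite(_DescribeMatch("abcdef", "(ab") & @CRLF)      ; unbalanced parenthesis
```

Copying @error and @extended into variables first matters because calls such as UBound() or ConsoleWrite() overwrite them.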
steve8tch, posted October 1, 2006

Checked out a few of my regexes (including some that I used to have issues with). Most of them are quite simple, but it seems to be behaving fine.
Valik, posted October 1, 2006

I just tested one of my patterns and didn't even have to change it (that was unexpected). It mostly worked, but the returned array contained data I didn't expect.

Take this simple script:

    Main()

    Func Main()
        Local $s = "abcdef"
        Local $p = "(ab)(cd)"
        Local $a = StringRegExp($s, $p, 1)
        ConsoleWrite('@@ (48) :(' & @min & ':' & @sec & ') UBound($a) = ' & UBound($a) & @CR) ;### Debug Console
        For $i = 0 To UBound($a) - 1
            ConsoleWrite($a[$i] & @CRLF)
        Next
    EndFunc ; Main()

The output is:

    @@ (48) :(59:13) UBound($a) = 3
    abcd
    ab
    cd

I expected:

    @@ (48) :(59:13) UBound($a) = 2
    ab
    cd

Edit: Fixed the post up a bit.
Valik, posted October 1, 2006

Just tried another expression. It looks like you have to escape $ when it's not being used as an anchor. For example, I had the pattern "$\((.*?)\)" which would match things like "Foo" in the string "$(Foo)". In order to make that pattern compatible with PCRE, I had to make it "\$\((.*?)\)".

So far I'm optimistic that our patterns won't be too broken by using PCRE. Just need to get the damn "too-much-data" problem fixed. StringRegExp() would return this using the pattern and string mentioned above:

    $(Foo)
    Foo

Again, the first line should not be there. The group only specified that "Foo" should be captured.
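
Editor's example: a short sketch of the escaping difference described above, using the default flag 0 so that only match / no match is reported. It assumes a build whose StringRegExp() is backed by PCRE. The unescaped $ is an end-of-subject anchor there, so nothing can follow it and the first call should print 0. The escaped form should print 1.

```autoit
Local $sText = "$(Foo)"

; Unescaped: $ is treated as an anchor, so a literal "(" can never follow it.
ConsoleWrite(StringRegExp($sText, "$\((.*?)\)") & @CRLF)

; Escaped: \$ matches a literal dollar sign.
ConsoleWrite(StringRegExp($sText, "\$\((.*?)\)") & @CRLF)
```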
Jon (Administrator), posted October 1, 2006

The first array entry seems to be something to do with a full match. The one in php does the same (and the implementation that tylo did a while ago has this too), so I thought I'd keep it the same.

Edit: The comment from php's preg_match: $matches[0] will contain the text that matched the full pattern, $matches[1] will have the text that matched the first captured parenthesized subpattern, and so on.

Whether it's useful or not I have no idea whatsoever.

I had a go at the replace stuff and decided I'd had enough for one day.
Valik, posted October 1, 2006

The problem is, the existing implementation did not use it and that will break scripts. So far I'm surprised at how compatible the expressions are. I guess David used Perl as a guide, so a lot of patterns are going to work with PCRE out of the box. However, if the returned data is different than the "native" implementation, things are just as broken. I'll have to go through and adjust all my loops to start indexing at 1 instead of 0 even if the pattern itself works perfectly. That seems a shame to me, since the patterns are what I thought would make the implementation incompatible.
Valik, posted October 1, 2006

Jon, here is my proposal. It's a combination of maintaining backwards compatibility and supporting what PCRE does by default. Here are the flags I propose:

0 - Current behavior, returns True or False if the pattern matches.
1 - Old behavior. Only return data that matches a group and only return the first matches. Example:

    Main()

    Func Main()
        Local $s = "abcdefabcdef"
        Local $p = "(ab)(cd)"
        Local $a = StringRegExp($s, $p, 1)
        ConsoleWrite("Matches: " & UBound($a) & @CRLF)
        For $i = 0 To UBound($a) - 1
            ConsoleWrite($a[$i] & @CRLF)
        Next
    EndFunc

[...]

Output:

    Matches: 6
    abcd
    ab
    cd
    abcd
    ab
    cd

This will work because the flags in the old StringRegExp() were not bit-flags. This provides maximum compatibility, so that any breakages will require very minor tweaks to the pattern. It also adds in the new functionality, which I admit could be useful.

Edit: For flag 4, I'm assuming that PCRE behaves the same with a global match that it does with a single match. If PCRE behaves exactly like flag 3, then flag 4 can be skipped. If the behavior of PCRE does not match flag 3 but does seem useful, then it can be put onto flag 4.
Jon (Administrator), posted October 1, 2006

It has no concept of a global match, which is what I'm struggling with atm. You basically have to manually re-call it (like the AutoIt example above), but we would implement it internally. If the interface doesn't output something mentioned above, then I don't understand enough about it to make it do so.

If the new return value is of no use then we can ditch it. It just seemed odd that other implementations seemed to think it was something important to return, which is why I left it in there. Adding more flags to support something 99% of users won't even have heard of seems a bit extreme. It's never been a release function after all.

PS. I've got the simple version of StringRegExp replace working (no dollar substitutions etc.) so I'll post that in a while.

Edit: At least your post gives me some examples to play with. I was really struggling to find some.
spyrorocks, posted October 1, 2006

If there was some way to make this exactly like the php function, I could really use it.

My Projects: Online AutoIt Compiler - AutoForum - AutoGuestbook - AutoIt Web-based Auto Installer - Pure AutoIt Zipping Functions - ConfuseGen - MindReader - P2PChat
Valik, posted October 1, 2006

Quote: It has no concept of a global match, which is what I'm struggling with atm. You basically have to manually re-call it (like the AutoIt example above), but we would implement it internally. If the interface doesn't output something mentioned above, then I don't understand enough about it to make it do so. If the new return value is of no use then we can ditch it. It just seemed odd that other implementations seemed to think it was something important to return, which is why I left it in there. Adding more flags to support something 99% of users won't even have heard of seems a bit extreme. It's never been a release function after all. PS. I've got the simple version of StringRegExp replace working (no dollar substitutions etc.) so I'll post that in a while. Edit: At least your post gives me some examples to play with. I was really struggling to find some.

I think it may be useful, but I'm trying to keep as much backwards compatibility as possible. Like I said before, the patterns are pretty close and a lot of them are going to work out of the box with PCRE, so it's a shame the output is not the same. Otherwise this transition would be very smooth, requiring only minor changes to patterns.

From what you posted earlier (in private maybe), it sounded like the function with "all" in the name did a global search. I don't know what its output would be, though I suspect it should be similar to flag 3 of David's implementation.
Jon (Administrator), posted October 1, 2006

Quote: From what you posted earlier (in private maybe), it sounded like the function with "all" in the name did a global search. I don't know what its output would be, though I suspect it should be similar to flag 3 of David's implementation.

Yeah, preg_match_all is the php function. But the underlying pcre api doesn't have a global option, so it seems we have to do the global cleverness manually. There's no way to predict how many matches will be found, so it seems like we'll have to keep calling the single match function and adding the matches to some sort of linked list, and then, when there are no more matches, decide how to turn that into something useful for AutoIt.

I'm leaving global until last. I think doing StringRegExpReplace looks easier.
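
Editor's example: the approach described above (keep calling the single-match function and accumulate the results) can be sketched at script level with the offset parameter. This is an illustration in AutoIt, not the internal implementation being discussed. `_RegExpCollect` is a hypothetical helper name, and the flat result array with a count in element 0 is just one possible layout. Whether each call contributes the full match as well as the groups depends on the build.

```autoit
; Sketch only: _RegExpCollect is a hypothetical helper, not an AutoIt built-in.
; Element [0] of the returned array holds the number of strings collected.
Func _RegExpCollect($sText, $sPattern)
    Local $aAll[1], $aMatch, $iError, $iNext, $iOffset = 1
    $aAll[0] = 0
    While 1
        $aMatch = StringRegExp($sText, $sPattern, 1, $iOffset)
        $iError = @error
        $iNext = @extended
        If $iError <> 0 Then ExitLoop ; 1 = no more matches, 2 = bad pattern
        ; Append everything this call returned to the end of the result array.
        For $i = 0 To UBound($aMatch) - 1
            ReDim $aAll[UBound($aAll) + 1]
            $aAll[UBound($aAll) - 1] = $aMatch[$i]
            $aAll[0] += 1
        Next
        If $iNext <= $iOffset Then ExitLoop ; stop if the offset did not advance
        $iOffset = $iNext
    WEnd
    Return $aAll
EndFunc

Local $aTags = _RegExpCollect('<test>a</test> <test>b</test>', '<test>(.*?)</test>')
For $i = 1 To $aTags[0]
    ConsoleWrite($aTags[$i] & @CRLF)
Next
```

Growing the array with ReDim on every element is simple rather than fast. The guard on the offset is there so that a pattern able to match an empty string cannot loop forever.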
Valik, posted October 1, 2006

It'd be nice to use std::vector for that. Wonder how much STL would increase size by? I wonder if we've gotten to the point where we can use STL without too much size bloat? We could port a lot of stuff to STL...
sohfeyr (thread author), posted October 1, 2006

Quote: If the new return value is of no use then we can ditch it. It just seemed odd that other implementations seemed to think it was something important to return, which is why I left it in there. Adding more flags to support something 99% of users won't even have heard of seems a bit extreme. It's never been a release function after all.

I think the value in position 0 is very useful when parsing long documents. You can examine both your capturing groups and their context and relation to each other. (.Net's implementation is similar: RegEx.Matches(n).Groups(0) returns the text that matched the whole expression.)

If backwards compatibility is really an issue though, people like me could always just enclose the whole expression as a group. As long as nested groups are supported, that shouldn't be too big a problem.

Personally, I like the flags idea. It would be easier for people to add a flag to their regexp calls than to go through and be sure of every 0-based loop that needs to become 1-based.
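
Editor's example: a small sketch of the "enclose the whole expression as a group" workaround, using the string and pattern from Valik's earlier test with one extra pair of parentheses. Groups are numbered by the position of their opening parenthesis, so the outer group comes before (ab) and (cd). On a build that returns only captured groups, the array should therefore start with the whole match. On the test build discussed here, the full match would simply appear twice.

```autoit
; The outer parentheses capture the whole match as the first group.
Local $aMatch = StringRegExp("abcdef", "((ab)(cd))", 1)
If @error = 0 Then
    For $i = 0 To UBound($aMatch) - 1
        ConsoleWrite($i & ": " & $aMatch[$i] & @CRLF)
    Next
EndIf
```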
Jon (Administrator), posted October 2, 2006

I need a regexp that will match the $n or ${n} parts of a string.

I currently have "\\$([0-9]+)" which matches $1 $2 ok, but I also need to cope with situations that have {} like ${1}.

It's for the replacement parameter code in StringRegExpReplace. I was going to use a regexp to parse itself.

This almost works: "\\${*([0-9]+)}*" but it allows for ${{{1}}} which is wrong. Is there some way to say a match for 0 or 1 lots of { but no more?
SmOke_N (Moderator), posted October 2, 2006

Quote: I need a regexp that will match the $n or ${n} parts of a string. I currently have "\\$([0-9]+)" which matches $1 $2 ok, but I also need to cope with situations that have {} like ${1}. It's for the replacement parameter code in StringRegExpReplace. I was going to use a regexp to parse itself. This almost works: "\\${*([0-9]+)}*" but it allows for ${{{1}}} which is wrong. Is there some way to say a match for 0 or 1 lots of { but no more?

I'm going to assume you're speaking of the current project you're working on and not the current release version?

Common sense plays a role in the basics of understanding AutoIt... If you're lacking in that, do us all a favor, and step away from the computer.
Jon (Administrator), posted October 2, 2006

Quote: I'm going to assume you're speaking of the current project you're working on and not the current release version?

Yes.
thomasl, posted October 2, 2006

Quote: Test AutoIt Exe: http://www.autoitscript.com/autoit3/files/...utoIt3-pcre.exe

This looks pretty good, Jon. I have thrown some simple and quite a few of my more convoluted patterns at it and they work out okay. I did compare the output of AU3 to what the same pattern produces in Perl, and with the exception of element[0] (whole match) they agree. Good job.

FWIW, I agree about keeping backwards compatibility if at all possible. If someone really wants the whole match, another pair of parentheses does the trick, as sohfeyr pointed out.

As to ${...}, try this:

    \$\{{0,1}\d+\}{0,1}

EDIT: sorry, forgot the () around \d+:

    \$\{{0,1}(\d+)\}{0,1}
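
Editor's example: a quick sketch that exercises the suggested pattern against the cases Jon listed. It uses ? in place of {0,1}, which is the same "zero or one" quantifier in shorter form. The test strings are invented for illustration, and flag 0 is used so that each line reports only match (1) or no match (0).

```autoit
; "\{?" means zero or one opening brace, same as "\{{0,1}".
Local $sPattern = "\$\{?(\d+)\}?"
Local $aTests[4] = ["$1", "${1}", "${{{1}}}", "${1"]

For $i = 0 To UBound($aTests) - 1
    ConsoleWrite($aTests[$i] & " -> " & StringRegExp($aTests[$i], $sPattern) & @CRLF)
Next
```

The first two inputs should match and ${{{1}}} should not. The last input, ${1 with no closing brace, also matches, because the two optional braces are independent of each other. If that matters for the replacement parser, the braces would need to be checked as a pair.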