genius257, posted August 25, 2017 (edited):

I'm having an issue with StringRegExp when using the offset parameter together with the start-of-string anchor: if the offset is greater than 1, the match fails. Is this a bug, or is it supposed to work like that? See the example below.

    StringRegExp("abc", "^[a-z]", 1, 1)
    ConsoleWrite(@error&@CRLF);success
    StringRegExp("abc", "^[a-z]", 1, 2)
    ConsoleWrite(@error&@CRLF);failure

Thanks in advance.

iamtheky, posted August 25, 2017:

They should both error; the caret goes on the inside.

    StringRegExp("abc", "[^a-z]", 1, 1)
    ConsoleWrite(@error&@CRLF);failure
    StringRegExp("abc", "[^a-z]", 1, 2)
    ConsoleWrite(@error&@CRLF);failure

    StringRegExp("abc", "abc", 1, 1)
    ConsoleWrite(@error&@CRLF)
    StringRegExp("abc", "bc", 1, 2)
    ConsoleWrite(@error&@CRLF)
    StringRegExp("abc", "c", 1, 3)
    ConsoleWrite(@error&@CRLF)

    ;errors
    StringRegExp("abc", "ab", 1, 3)
    ConsoleWrite(@error&@CRLF)
    StringRegExp("abc", "a", 1, 2)
    ConsoleWrite(@error&@CRLF)
    StringRegExp("abc", "[^abc]", 1, 1)
    ConsoleWrite(@error&@CRLF)
    StringRegExp("abc", "[^bc]", 1, 2)
    ConsoleWrite(@error&@CRLF)
    StringRegExp("abc", "[^c]", 1, 3)
    ConsoleWrite(@error&@CRLF)

genius257, posted August 25, 2017:

No. The first RegEx is meant to get the "a", the second RegEx is meant to get the "b".

From the documentation:

    "Outside a character class, the caret matches at the start of the subject text, and also just after a non-final newline sequence if option (?m) is active. By default the newline sequence is @CRLF. Inside a character class, a leading ^ complements the class (excludes the characters listed there)."

iamtheky, posted August 25, 2017 (edited):

Ahh, I reversed it. Context-free is tough, but that's a start point, so it gets 'abc' and then 'bc'.

Edit: tested real quick, and with dashes I'm getting the largest subset, not the smallest subset of the group. Running more tests.

genius257, posted August 25, 2017:

Yeah, but it gets "[a-z]" anywhere, not only if it's the first character in the string. "^[a-z]" will return "b" when used on "a0bc".

iamtheky, posted August 25, 2017 (edited):

Where'd you get the quote from? If you put that caret there, you are only getting the first character, and only if it's a letter, and only if it's lowercase. What is the desired end goal?

    #include<array.au3>

    $aMatch = StringRegExp("a0bc", "^[a-z]", 3, 1)
    _ArrayDisplay($aMatch)

    $aMatch = StringRegExp("a0bc", "^[a-z]", 3, 2)
    _ArrayDisplay($aMatch)
    ConsoleWrite(@error&@CRLF)

    $aMatch = StringRegExp("a0bc", "^[a-z]", 3, 3)
    _ArrayDisplay($aMatch)
    ConsoleWrite(@error&@CRLF)

    $aMatch = StringRegExp("a0bc", "^[a-z]", 3, 4)
    _ArrayDisplay($aMatch)
    ConsoleWrite(@error&@CRLF)

    $aMatch = StringRegExp("A0bc", "^[a-z]", 3, 1)
    _ArrayDisplay($aMatch)
    ConsoleWrite(@error&@CRLF)

    $aMatch = StringRegExp("A0bc", "^[a-z]", 3, 4)
    _ArrayDisplay($aMatch)
    ConsoleWrite(@error&@CRLF)

genius257, posted August 25, 2017 (edited):

The quote is from the StringRegExp documentation, in the Anchors table under Remarks.

I'm iterating through a string, looking for exact matches:

    Global $Types[][] = [ _
        ['^("[^"]*"|''''[^'''']*'''')',"String"], _
        ["^\$[_a-zA-Z0-9]+","Variable"] _
    ]
    $sOutput = ""
    $sInput = '$var = "this is a test"'
    $iOffset = 1
    #include <Array.au3>
    While 1
        StringRegExp($sInput, "^\s*(\S)", 1, $iOffset)
        If @error<>0 Then ExitLoop
        $iOffset = @extended
        For $i=0 To UBound($Types, 1)-1
            $a = StringRegExp($sInput, $Types[$i][0], 1, $iOffset-1)
            If @error=0 Then
                $iOffset=@extended
                $sOutput&=$Types[$i][1]&";"
                ExitLoop
            EndIf
        Next
    WEnd

I do know there are better ways of doing this; I'm just wondering if it's supposed to fail when using "^" and an offset greater than 1.

mikell, posted August 25, 2017:

    genius257 said: "I'm just wondering if it's supposed to fail when using "^" and offset greater than 1"

Obviously yes! ^ "matches at the start of the subject text", while offset is "the string position to start the match". The first position (just after ^) is offset 1, so any other offset (offset > 1) won't match if the ^ anchor is used, unless you use a workaround.

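To make the distinction above concrete, here is a minimal sketch that reuses the "abc" example from the first post. It assumes only the behaviour already described in this thread: ^ is tied to the start of the subject string, so moving the offset does not move the anchor, whereas cutting the string with StringMid gives the regex a new subject whose start is the cut position.

```autoit
; "^" refers to the start of the subject string, not to the offset.
StringRegExp("abc", "^[a-z]", 1, 1)
ConsoleWrite("offset 1 with ^: @error = " & @error & @CRLF) ; reported as success in this thread

StringRegExp("abc", "^[a-z]", 1, 2)
ConsoleWrite("offset 2 with ^: @error = " & @error & @CRLF) ; reported as failure in this thread

; Workaround: pass a substring, so "^" sits at the position you care about.
Local $aRes = StringRegExp(StringMid("abc", 2), "^[a-z]", 1)
If Not @error Then ConsoleWrite("substring match: " & $aRes[0] & @CRLF)
```

The substring call matches because its subject is "bc", which does begin with a lowercase letter.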
genius257, posted August 25, 2017 (edited):

Thanks @mikell. It seems silly to me; as I see it, the offset would define where the string is trimmed and matched, but I guess not. Guess I'll have to substring it myself and just add the @extended to the offset... >.>

mikell, posted August 25, 2017 (edited):

    genius257 said: "guess I'll have to substring myself and just add the return to the offset..."

This is the workaround indeed. With offset you force the position where the match starts, so you'll run into trouble if you combine it with the ^ anchor in the pattern.

    $offset = 1
    $res = StringRegExp("a123b456c", "[a-z]", 1, $offset)
    $offset = @extended
    ConsoleWrite($res[0]&@CRLF)

    $res = StringRegExp("a123b456c", "[a-z]", 1, $offset)
    $offset = @extended
    ConsoleWrite($res[0]&@CRLF)

    $res = StringRegExp("a123b456c", "[a-z]", 1, $offset)
    ConsoleWrite($res[0]&@CRLF)

genius257, posted August 25, 2017 (edited: forgot the anchor in the pattern):

    mikell said: "This is the workaround indeed. Using offset you force the position where to start the match, so you'll jump into troubles if you do this with the ^ anchor in the pattern"

        $offset = 1
        $res = StringRegExp("abc", "[a-z]", 1, $offset)
        ConsoleWrite($res[0]&@CRLF)
        $offset += StringLen($res[0])
        $res = StringRegExp("abc", "[a-z]", 1, $offset)
        ConsoleWrite($res[0]&@CRLF)
        $offset += StringLen($res[0])
        $res = StringRegExp("abc", "[a-z]", 1, $offset)
        ConsoleWrite($res[0]&@CRLF)

Kind of. More like:

    StringRegExp(StringMid($sInput, $offset), "^[a-z]", 1)

But it works now, I guess.

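One detail of the StringMid approach deserves a short illustration: @extended is then relative to the substring, not to the original string, so it has to be shifted back before it is used as the next offset. The following is only a sketch with a made-up input ("a1b2") and a made-up starting position; it assumes, as the posts above do, that @extended holds the 1-based position just after the match.

```autoit
Local $sInput = "a1b2"   ; hypothetical input
Local $iOffset = 3       ; absolute, 1-based position of "b"

; The substring passed to the regex is "b2", so "^" is at absolute position 3.
Local $aRes = StringRegExp(StringMid($sInput, $iOffset), "^[a-z]", 1)
If Not @error Then
    ; @extended counts from the start of the substring:
    ; absolute next position = substring start + @extended - 1
    $iOffset = $iOffset + @extended - 1
    ConsoleWrite($aRes[0] & " matched, next offset = " & $iOffset & @CRLF)
EndIf
```

Forgetting the "- 1" skips one character after every match, which is an easy mistake to make when mixing 1-based offsets.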
mikell, posted August 25, 2017 (edited):

Sorry, I edited my previous example; not sure you saw it. It's much better anyway.

Edit: ...because it's easy to use in a loop.

genius257, posted August 25, 2017:

The main problem with your solution is that without the anchor it will match anywhere in the string. That makes it useless if the purpose is to iterate through the string and process every character, or do something else, should the match(es) fail.

I appreciate all the help.

Anyway, this is my result (I think my offset calculation will be wrong in some cases and should be adjusted at a later time, but it works for now):

    Global $Types[][] = [ _
        ['^("[^"]*"|''''[^'''']*'''')',"String"], _
        ["^\$[_a-zA-Z0-9]+","Variable"] _
    ]
    $sOutput = ""
    $sInput = FileRead(@ScriptFullPath)
    $sInput = '$var="this is a test" &"test"'
    $iOffset = 1
    While 1
        StringRegExp(StringMid($sInput, $iOffset), "^\s*(\S)", 1)
        If @error<>0 Then ExitLoop
        $iOffset += @extended-1
        ConsoleWrite(StringMid($sInput, $iOffset-1)&@CRLF)
        $bMatch=False
        For $i=0 To UBound($Types, 1)-1
            $a = StringRegExp(StringMid($sInput, $iOffset-1), $Types[$i][0], 1)
            If @error=0 Then
                $iOffset+=@extended-2
                $sOutput&=$Types[$i][1]&";"
                $bMatch=True
                ExitLoop
            EndIf
        Next
        If Not $bMatch Then $sOutput&="Unknown"&";"
    WEnd
    MsgBox(0, "", $sOutput)

AspirinJunkie, posted August 25, 2017:

Maybe I misunderstood something, but if you use an offset in StringRegExp and want to match from the beginning of the current position, then you have to use \G instead of ^:

    StringRegExp("abc", "\G[a-z]", 1, 1)
    ConsoleWrite(@error&@CRLF)
    StringRegExp("abc", "\G[a-z]", 1, 2)
    ConsoleWrite(@error&@CRLF)

genius257, posted August 25, 2017:

    AspirinJunkie said: "Maybe i misunderstood something but if you use an offset in StringRegExp and want to match from the beginning of the current position then you have to use \G instead of ^"

Ah! You are right! Thank you ^^' I totally missed that.

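For readers who land here with the same tokenizer problem, the \G suggestion can be applied to the loop from earlier in the thread roughly as follows. This is a sketch, not the original poster's final code: the table name $aTypes, the simplified patterns, and the $iStart helper variable are assumptions made for illustration. Each pattern starts with \G so it can only match exactly at the offset passed to StringRegExp, which removes the need for StringMid and the offset arithmetic that goes with it.

```autoit
; Hypothetical token table: each pattern is anchored with \G, followed by a label.
Global $aTypes[2][2] = [ _
    ['\G"[^"]*"', "String"], _
    ['\G\$\w+', "Variable"] _
]

Local $sInput = '$var = "this is a test"'
Local $sOutput = ""
Local $iOffset = 1
Local $iStart, $bMatch

While $iOffset <= StringLen($sInput)
    ; Skip whitespace and consume the next non-whitespace character.
    StringRegExp($sInput, "\G\s*\S", 1, $iOffset)
    If @error Then ExitLoop         ; nothing but whitespace is left
    $iStart = @extended - 1         ; position of that non-whitespace character

    $bMatch = False
    For $i = 0 To UBound($aTypes) - 1
        ; \G forces the match to begin exactly at $iStart.
        StringRegExp($sInput, $aTypes[$i][0], 1, $iStart)
        If Not @error Then
            $iOffset = @extended    ; continue right after the token
            $sOutput &= $aTypes[$i][1] & ";"
            $bMatch = True
            ExitLoop
        EndIf
    Next

    If Not $bMatch Then
        $sOutput &= "Unknown;"
        $iOffset = $iStart + 1      ; step over the unrecognised character
    EndIf
WEnd

ConsoleWrite($sOutput & @CRLF)
```

Because every offset here is absolute, @extended can be assigned to $iOffset directly instead of being translated from substring coordinates.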