StringRegExp - Pattern fails so is there a specific language I should be using?


Without knowing the final expected result, any idea can be a misfit...

BTW the most reliable (and easy) way is not necessarily the shortest

Edit: A link to the whole HTML code would be nice too.

Edited by mikell

I fixed it... it wasn't easy... lots of reading LOL.

With a little troubleshooting I found that whenever I put (\n|\n\N*\n){1,2} in my regex, it would produce blank returns from my AutoIt loop.

So, <tr><td>\N*(\n|\n\N*\n){1,2}<td>\N+\n<td>\N*June\s2014\N*(\n|\n\N*\n){1,2}</tr> would actually produce 1 result, then 2 blank results.

(That's 1 full-pattern result, then 2 empty / blank / LF results, one for group 1 and one for group 2, because I was matching the entire string and also capturing one group of LF "\n", then another group of LF "\n".)

Basically, I was looking for single and double line feeds, and I was accidentally getting results from outside the pattern I wanted. I didn't understand the engine at the time, and the way I was writing the regex was wrong. Each pair of parentheses was a capturing group, so besides my entire match I was also getting the captured line feeds returned, and those were the blank entries in my AutoIt array. I found my answer in a regex tutorial: "?:"

Help file online: "Parentheses Create Numbered Capturing Groups"
Example: color=(?:red|green|blue) is another regex with a non-capturing group. This regex has no quantifiers.
http://www.regular-expressions.info/brackets.html

<tr><td>.*(?:\n|\n.*\n)<td>.*\n.*June\s2014.*(?:\n|\n.*\n)</tr>
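The difference between the two kinds of group can be seen on a much smaller input. This is only a sketch: the sample text and the variable names are made up for illustration, and it uses flag 4 like the loop in this post so that each match comes back as the full match followed by any captured groups.

```autoit
; Hypothetical sample text: two letters separated by a single line feed.
Local $sText = "a" & @LF & "b"

; Capturing group: the line feed is returned as an extra array element.
Local $aAll = StringRegExp($sText, "a(\n)b", 4)
Local $aFirst = $aAll[0] ; inner array for the first match
ConsoleWrite("Capturing:     " & UBound($aFirst) & " element(s)" & @CRLF)

; Non-capturing group: same match, but only the full match is returned.
$aAll = StringRegExp($sText, "a(?:\n)b", 4)
$aFirst = $aAll[0]
ConsoleWrite("Non-capturing: " & UBound($aFirst) & " element(s)" & @CRLF)
```

In the capturing case the second element holds nothing but the line feed, which is why it shows up as a blank MsgBox.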

Before I go too far... I'm going to verify everything... yep, it's good - just verified.

#include <MsgBoxConstants.au3> ; defines $MB_SYSTEMMODAL

; $OpenSSL01 is assumed to already hold the HTML text being searched.
; Flag 4 returns an array of arrays: one inner array per match,
; holding the full match followed by any captured groups.
Local $aArray = StringRegExp($OpenSSL01, '<tr><td>.*(?:\n|\n.*\n)<td>.*\n.*June\s2014.*(?:\n|\n.*\n)</tr>', 4)

Local $aMatch = 0
For $i = 0 To UBound($aArray) - 1
    $aMatch = $aArray[$i]
    For $j = 0 To UBound($aMatch) - 1
        MsgBox($MB_SYSTEMMODAL, "RegExp Test with Option 4 - " & $i & ',' & $j, $aMatch[$j])
    Next
Next

; Same pattern for May 2014; the variables are reused rather than declared again.
$aArray = StringRegExp($OpenSSL01, '<tr><td>.*(?:\n|\n.*\n)<td>.*\n.*May\s2014.*(?:\n|\n.*\n)</tr>', 4)

For $i = 0 To UBound($aArray) - 1
    $aMatch = $aArray[$i]
    For $j = 0 To UBound($aMatch) - 1
        MsgBox($MB_SYSTEMMODAL, "RegExp Test with Option 4 - " & $i & ',' & $j, $aMatch[$j])
    Next
Next
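Now that the pattern has no capturing groups left, the nested loop may not be needed at all. A possible alternative, assuming only the full matched rows are wanted, is flag 3, which returns a flat array of all matches. The $sHtml string below is a made-up stand-in, since the real page source is not posted here.

```autoit
; Hypothetical stand-in for the real HTML.
Local $sHtml = "<tr><td>a</td>" & @LF & "<td>b</td>" & @LF & "<td>June 2014</td>" & @LF & "</tr>"

; Flag 3: global match, flat array. With no capturing groups in the
; pattern, each element is one full match.
Local $aRows = StringRegExp($sHtml, '<tr><td>.*(?:\n|\n.*\n)<td>.*\n.*June\s2014.*(?:\n|\n.*\n)</tr>', 3)

; UBound returns 0 when StringRegExp found nothing, so the loop is simply skipped.
For $i = 0 To UBound($aRows) - 1
    ConsoleWrite("Row " & $i & ":" & @CRLF & $aRows[$i] & @CRLF)
Next
```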

 

Edited by souldjer777

"Maybe I'm on a road that ain't been paved yet. And maybe I see a sign that ain't been made yet"
Song Title: I guess you could say
Artist: Middle Class Rut


Information on capturing, non-capturing and other styles of groups in PCRE can also be found in the AutoIt help file under StringRegExp.

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)


TADA...

We have all the answers, we just need questions.


Thank you both Mikell and Jchd.

:sorcerer:

Yeah, that was a cluster... sorry Mikell... believe me, I've gone back and forth and just never truly understood the issue. I simply didn't understand regex... I didn't even understand your answer. I was just frustrated - forest / trees.  ...and I have a LONG way to go. Thank you both again. :idiot:

:cheer:

So... uh, how do I mark this as "SOLVED" ?

Edited by souldjer777


The best comment ever about matching/parsing HTML with regex is still this :)

There's no marking solved any more on this version of the forum software, but you can go back and edit the opening post to change the title.

Roses are FF0000, violets are 0000FF... All my base are belong to you.
