
How to search only the second StringRegExp Or Second Random String?


Solved by mikell


26 minutes ago, HezzelQuartz said:

Could you recommend any books or website to learn about it?

1/ open the AutoIt help page for StringRegExp: https://www.autoitscript.com/autoit3/docs/functions/StringRegExp.htm
2/ read it
3/ read it really
4/ read it again
5/ try your regexes on RegExp
6/ rinse, sleep and goto 1/

When proficient with all this, go read https://www.pcre.org/original/doc/html/pcrepattern.html in full detail.
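
For step 5/, a small helper can make local experiments quicker. This is only a sketch: the function name _TryRegex and the sample subject are made up for illustration, and flag 3 of StringRegExp is used to return all matches as an array.

```autoit
; Hypothetical helper: run a pattern against a subject and list what it returns.
Func _TryRegex($sSubject, $sPattern)
    ; Flag 3 = return an array of global matches (captured groups if the pattern has any)
    Local $aMatches = StringRegExp($sSubject, $sPattern, 3)
    Local $iErr = @error ; save it before another function call resets it
    If $iErr Then
        ; 1 = no match, 2 = bad pattern
        ConsoleWrite("No result, @error = " & $iErr & @CRLF)
        Return
    EndIf
    For $i = 0 To UBound($aMatches) - 1
        ConsoleWrite("[" & $i & "] " & $aMatches[$i] & @CRLF)
    Next
EndFunc

; Example call with an invented subject: lists the two captured numbers
_TryRegex('x = [1, 2]', '\[(\d+), (\d+)\]')
```

Run it from SciTE to see the results in the output pane, then change the pattern and run it again.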

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)


1 hour ago, HezzelQuartz said:

Should I use $ before bracket?
any difference with (?=\R\h+\])?

Try it! The changes you see in the results are the best answer.
 

1 hour ago, HezzelQuartz said:

What is the function of the \ before the square bracket?

A bracket is a special character. For the regex to treat it as a literal, it must be escaped with a backslash.
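
A quick way to see the difference is to test the same text with and without the backslashes. The subjects below are invented for illustration; StringRegExp with its default flag returns 1 on a match and 0 otherwise.

```autoit
; Escaped: \[ and \] stand for the literal characters "[" and "]"
ConsoleWrite(StringRegExp("[abc]", "^\[abc\]$") & @CRLF) ; 1

; Unescaped: [abc] is a character class meaning "one character: a, b or c"
ConsoleWrite(StringRegExp("b", "^[abc]$") & @CRLF)       ; 1
ConsoleWrite(StringRegExp("[abc]", "^[abc]$") & @CRLF)   ; 0, the subject is five characters long
```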

For the rest, jchd said it all, especially 3/ and 4/.
And there are a lot of sites about regex; I personally recommend this one


@mikell

very interesting postings, thx!

Who is maintaining the docs pages of AutoIt? Looking up the PCRE man pages, I noticed that at ...

https://www.autoitscript.com/autoit3/docs/pcrepattern.html

... the topmost URL "Return to the PCRE index page." is a dead link, pointing to https://www.autoitscript.com/autoit3/docs/index.html

Not Found
The requested URL /autoit3/docs/index.html was not found on this server.

 

Earth is flat, pigs can fly, and Nuclear Power is SAFE!


Please post a ticket about that dead link.


@mikell
Sorry, I have to continue the topic with a follow-up to my last question.
 

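$text = FileRead(@ScriptDir & "\" & "database2.txt")

; First code: keep only group 2 (the entries between the header line and the closing bracket),
; then replace that extracted block in the original text
$extracted = StringRegExpReplace($text, '(?s)(^\h*"database":\N+\R\h+)(.*?)(\R\h*\]$)', "$2")
MsgBox(0,"", $extracted)
$new = StringReplace($text, $extracted, "****")
MsgBox(0,"1", $new)

; Second code: \K drops the header from the match and the lookahead leaves the closing
; bracket untouched, so only the entries are replaced
$new = StringRegExpReplace($text, '(?s)database":\N+\R\h+\K(.*?)(?=\R\h+\]$)', "****")
MsgBox(0,"2", $new)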

I tried the second code above. It works great with my last sample.

But why does it stop working when I increase the database from 6 entries to 144144 entries?
Sorry, I cannot attach the file here; it is too big.

I also tried it at https://regex101.com/

It works for a small database, but it shows the error below when I increase the database to 144144 entries (1081082 lines):
"Catastrophic backtracking has been detected and the execution of your expression has been halted."

Thank you


On 9/4/2023 at 2:56 PM, HezzelQuartz said:

Catastrophic backtracking has been detected

The regex engine goes mad with such a huge workload  :D

You could try this slightly different version of the first code

$text = FileRead(@ScriptDir & "\" & "database2.txt")

$extracted = StringRegExpReplace($text, '(^\h*"database":\N+\R\h+)|(\R\h*\]$)', "")
;Msgbox(0,"", $extracted)
$new = StringReplace($text, $extracted, "****")
Msgbox(0,"1", $new)

Edit
or, thanks to @pixelsearch, this modified version of the second code (the lazy "?" was the culprit, forcing too much backtracking)

$new = StringRegExpReplace($text, '(?s)database":\N+\R\h+\K(.*)(?=\r\n\h+\]$)', "****")
Msgbox(0,"2", $new)
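
To compare the lazy and greedy versions side by side, here is a minimal sketch on a tiny sample. The sample text is an invented miniature, not the poster's real file. The lazy (.*?) grows the match one character at a time and retries the lookahead at each step, while the greedy (.*) takes everything up to the end and steps back only until the lookahead fits, which is a short trip when the closing bracket sits at the end of the text.

```autoit
; Hypothetical miniature of the file; the real layout may differ
Local $sText = '"database": [' & @CRLF & _
        '    "a",' & @CRLF & _
        '    "b"' & @CRLF & _
        '  ]'

; The two patterns from the thread
Local $sLazy   = '(?s)database":\N+\R\h+\K(.*?)(?=\R\h+\]$)'
Local $sGreedy = '(?s)database":\N+\R\h+\K(.*)(?=\r\n\h+\]$)'

Local $sNewLazy   = StringRegExpReplace($sText, $sLazy, "****")
Local $sNewGreedy = StringRegExpReplace($sText, $sGreedy, "****")

; Show the greedy result and whether both versions agree on this sample
ConsoleWrite($sNewGreedy & @CRLF)
ConsoleWrite("Same result: " & ($sNewLazy == $sNewGreedy) & @CRLF)
```

On a sample this small the two patterns should replace the same block; the difference only shows up in how much work the engine does on very large input.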

 

Edited by mikell

I made some tests at Regex101; it looks like the site is simply unable to handle a very long subject string.
Please read its mechanical answer as: "it's not my fault, it's obviously yours. Catastrophic backtracking!"

Using a 400000-line string against the expression  ^(.*)$  which needs no backtracking at all, I first got a "catastrophic backtracking" error and then this one:
Unable to initialize the regex engine! Please try to reload the web page.
If the issue persists, please open an issue here: https://github.com/firasdib/Regex101/issues

And in the debugger page :
The regex debugger was unable to debug your pattern due to an error.
FAILED_TO_INITIALIZE_ENGINE

Edit
BTW I tested our last code in AutoIt on a 1500000-line txt file and it worked nicely in about 1 second  :P
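
A test like this can be reproduced without the original file. The sketch below builds a synthetic subject and times the greedy pattern with TimerInit/TimerDiff. The entry format and the entry count are assumptions chosen for illustration, so the timing will not match the figures above.

```autoit
; Build a synthetic subject (hypothetical format, not the poster's real file)
Local $iEntries = 20000
Local $sText = '"database": [' & @CRLF
For $i = 1 To $iEntries
    $sText &= '    "entry ' & $i & '",' & @CRLF
Next
$sText &= '    "last entry"' & @CRLF & '  ]'

; Greedy pattern from the thread
Local $sPattern = '(?s)database":\N+\R\h+\K(.*)(?=\r\n\h+\]$)'

Local $hTimer = TimerInit()
Local $sNew = StringRegExpReplace($sText, $sPattern, "****")
; Save both macros before any other function call resets them
Local $iErr = @error, $iCount = @extended
Local $fMs = TimerDiff($hTimer)

If $iErr Then
    ConsoleWrite("Pattern error, @error = " & $iErr & @CRLF)
ElseIf $iCount = 0 Then
    ConsoleWrite("No replacement was made" & @CRLF)
Else
    ConsoleWrite($iCount & " replacement(s) in " & Round($fMs) & " ms" & @CRLF)
EndIf
```

Raising $iEntries, or swapping in the lazy pattern, gives a rough local comparison without depending on an online tester.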

Edited by mikell