myspacee Posted March 5, 2019

Hello,

I'm working on a command-line tool that manipulates text to extract only the information I need. I use it for HTML scraping, and I've hit a problem I don't know how to solve.

I've reduced a large text to strings like this:

[Lunedì 04/03]<30466>[16:50]<30467>[19:15]<R4nd0m>[21:20]

I need to remove everything enclosed in <> and obtain this:

[Lunedì 04/03][16:50][19:15][21:20]

But with StringRegExpReplace I get this instead:

[Lunedì 04/03][21:20]

Can you suggest some code or syntax to solve this?

Thank you for your time,
m.

Dionysis Posted March 5, 2019

What regular expression are you using?

myspacee Posted March 5, 2019

StringRegExpReplace($file_input_line, "\<(.*)\>", "")

That is the general form; < and > can be replaced by whatever delimiter symbols I want.

m.

mikell Posted March 5, 2019

Try this: "(<.*?>)"

Dionysis Posted March 5, 2019

Your expression is greedy: .* matches as much as it can, so it runs from the first < to the last >. The lazy version should work:

StringRegExpReplace($file_input_line, "\<(.*?)\>", "")

Also, I don't think you need the backslashes, as "<" and ">" aren't regexp metacharacters.
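
To make the greedy/lazy difference concrete, here is a minimal sketch using the sample string from the first post. The variable name $sLine is only illustrative, and the third pattern (a negated character class) is an alternative added for comparison, not something the posts above require.

```autoit
; Sample string from the first post ($sLine is an illustrative name)
Global $sLine = '[Lunedì 04/03]<30466>[16:50]<30467>[19:15]<R4nd0m>[21:20]'

; Greedy: .* runs from the first "<" to the last ">", so one big match is removed
ConsoleWrite("Greedy: " & StringRegExpReplace($sLine, "<.*>", "") & @CRLF)

; Lazy: .*? stops at the nearest ">", so each <...> group is removed separately
ConsoleWrite("Lazy  : " & StringRegExpReplace($sLine, "<.*?>", "") & @CRLF)

; Negated class: [^>]* cannot cross a ">", so greediness no longer matters
ConsoleWrite("Class : " & StringRegExpReplace($sLine, "<[^>]*>", "") & @CRLF)
```

The greedy line should reproduce the unwanted [Lunedì 04/03][21:20] result from the first post, while the other two should keep all four bracketed groups.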

FrancescoDiMuro Posted March 5, 2019

@myspacee

Global $strString = '[Lunedì 04/03]<30466>[16:50]<30467>[19:15]<R4nd0m>[21:20]'

ConsoleWrite("Before: " & $strString & @CRLF & _
             "After : " & StringRegExpReplace($strString, '(<[^>]+>)', '') & @CRLF)

myspacee Posted March 5, 2019

25 minutes ago, mikell said:
  Try this: "(<.*?>)"

This works. Thank you!

Another question: is it possible to use StringRegExpReplace with words as delimiters instead of single symbols? For example:

StringRegExpReplace($file_input_line, "(div.*?/div)", "")

Thank you,
m.

FrancescoDiMuro Posted March 5, 2019

@myspacee Post a sample string.

myspacee Posted March 5, 2019

4 minutes ago, FrancescoDiMuro said:
  @myspacee Post a sample string

StringRegExpReplace($file_input_line, "(div.*?/div)", "")

Here div and /div are the common HTML tags.

FrancescoDiMuro Posted March 5, 2019

@myspacee You mean something like this?

#include <StringConstants.au3>

Global $strString = '<a href = "someurl">Someurl</a>' & @CRLF & _
                    '<div name = "somediv">Div Content </div>'

ConsoleWrite("Before: " & $strString & @CRLF & _
             "After : " & StringRegExpReplace($strString, '<div[^>]*>[^<]*</div[^>]*>', '') & @CRLF)

Dionysis Posted March 5, 2019

13 minutes ago, myspacee said:
  Is it possible to use StringRegExpReplace with words instead of single symbol ? eg: StringRegExpReplace($file_input_line, "(div.*?/div)", "")

Yes, it is possible. Note that the regexp you have here will turn

"<body><div><p>Some stuff</p></div></body>"

into

"<body><></body>"

because it removes only the text from div through /div (div><p>Some stuff</p></div) and leaves the surrounding angle brackets behind.

You can read the StringRegExp help file and the Regular Expression Tutorial for (a lot) more in-depth info!
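
As a rough illustration of the point above, the sketch below runs the word-delimited pattern next to a variant that includes the angle brackets. The second pattern and the name $sHtml are assumptions made for this example, not code posted in the thread, and the pattern does not handle nested div elements.

```autoit
; Sample markup from the post above ($sHtml is an illustrative name)
Global $sHtml = '<body><div><p>Some stuff</p></div></body>'

; Word delimiters only: removes "div><p>Some stuff</p></div" and leaves "<>" behind
ConsoleWrite(StringRegExpReplace($sHtml, "(div.*?/div)", "") & @CRLF)

; Brackets included: removes the whole <div ...>...</div> element.
; (?s) lets "." match newlines; nested divs are NOT handled by this pattern.
ConsoleWrite(StringRegExpReplace($sHtml, "(?s)<div\b[^>]*>.*?</div>", "") & @CRLF)
```

For anything beyond simple, non-nested markup, a regexp is a fragile way to process HTML, so treat the second pattern as a starting point only.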

myspacee Posted March 5, 2019

Thank you. In fact, correcting the syntax as suggested also solves my word-delimiter 'problem'.

The AutoIt command-line tool is the core of my PHP site.

Thank you all for your support,
m.