AlienStar, posted November 10, 2017

Hello everybody,

I want to read this HTML file:

    <div class="app-details">
    <h2><a href="app/1010101010/show">test app</a></h2>
    <p class="app-abuse"><a href="https://support.twitter.com/articles/72585">Restricted from performing write actions</a></p>
    <p>testing</p>
    </div>

I want to get all the lines between <div class="app-details"> and </div>, so I use this code to search:

    #include <Array.au3>
    $Path = "test.html"
    $s_SearchString = '(?m)<div class="app-details">(.*?)</div>'
    $aResults = StringRegExp(FileRead($Path), $s_SearchString, $STR_REGEXPARRAYGLOBALFULLMATCH)
    _ArrayDisplay($aResults)

Where is the problem in this code, please?
rootx, posted November 10, 2017

https://www.autoitscript.com/autoit3/docs/libfunctions/_StringBetween.htm
mikell, posted November 10, 2017

1 hour ago, AlienStar said: "Where is the problem in this code, please?"

Mainly, the problem is that in your pattern the dot never matches newlines. But that is a complicated way anyway; it's much easier to do it in 2 steps:

    #include <Array.au3>
    $Path = "test.html"
    ; first get the concerned part of the text
    $part = StringRegExpReplace(FileRead($Path), '(?s).*<div class="app-details">(.*?)</div>.*', "$1")
    ; then get the non-empty-or-blank lines
    $aResults = StringRegExp($part, '\S\N+', 3)
    _ArrayDisplay($aResults)
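To illustrate the point about newlines, here is a minimal sketch. It assumes the file content is held in a hypothetical string $sHtml rather than read from test.html. The idea is that (?m) only changes where ^ and $ may match, whereas (?s) lets the dot match line breaks as well, which is what the capture between the two tags needs.

```autoit
#include <Array.au3>

; Hypothetical sample text standing in for the content of test.html
Local $sHtml = '<div class="app-details">' & @CRLF & _
        '<p>testing</p>' & @CRLF & _
        '</div>'
Local $iErr

; (?m): the dot still stops at line breaks, so the closing tag is never reached
StringRegExp($sHtml, '(?m)<div class="app-details">(.*?)</div>', 3)
$iErr = @error
ConsoleWrite('(?m) @error = ' & $iErr & @CRLF)

; (?s): the dot also matches line breaks, so the capture can span several lines
Local $aMatch = StringRegExp($sHtml, '(?s)<div class="app-details">(.*?)</div>', 3)
$iErr = @error
ConsoleWrite('(?s) @error = ' & $iErr & @CRLF)
If IsArray($aMatch) Then _ArrayDisplay($aMatch)
```

A non-zero @error after StringRegExp means the pattern found nothing, so comparing the two console lines should show which option lets the pattern cross line breaks.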
AlienStar (Author), posted November 10, 2017 (edited November 10, 2017 by AlienStar)

56 minutes ago, mikell said: "Mainly, the problem is : in the pattern newlines never match But it's a complicated way, it's much easier doing it in 2 steps"

Thanks so much, it works well, but I have many blocks like that in this HTML file:

    <div class="app-details">
    <h2><a href="app/1010101010/show">test app</a></h2>
    <p class="app-abuse"><a href="https://support.twitter.com/articles/72585">Restricted from performing write actions</a></p>
    <p>testing</p>
    </div>
    <div class="app-details">
    <h2><a href="app/101353/show">test1 app</a></h2>
    <p class="app-abuse"><a href="https://support.twitter.com/articles/72585">Restricted from performing write actions</a></p>
    <p>testing1</p>
    </div>

How can I get them all, please?
AlienStar (Author), posted November 10, 2017

1 hour ago, rootx said: https://www.autoitscript.com/autoit3/docs/libfunctions/_StringBetween.htm

Thanks so much. It works, but it doesn't get all of the content with this pattern:

    #include <Array.au3>
    #include <String.au3>
    $Path = "test.html"
    Local $aResults = _StringBetween(FileRead($Path), '<div class="app-details">', '</div>', $STR_ENDNOTSTART)
    _ArrayDisplay($aResults)
mikell, posted November 11, 2017

8 hours ago, AlienStar said: "How to get them all, please?"

Uncommon way: One Ring (To Get Them All)

Most usual way: make the first step put the wanted parts of the text into an array, then loop through this array:

    #include <Array.au3>
    $Path = "test.html"
    ; first get every <div class="app-details"> block into an array
    $parts = StringRegExp(FileRead($Path), '(?s)<div class="app-details">(.*?)</div>', 3)
    Local $aResults[0]
    For $i = 0 To UBound($parts) - 1
        ; then get the non-empty-or-blank lines of this block
        $lines = StringRegExp($parts[$i], '\S\N+', 3)
        _ArrayAdd($aResults, $lines)
    Next
    _ArrayDisplay($aResults)
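The loop above can also be wrapped in a function that reports when nothing was found. This is only a sketch under stated assumptions: the helper name _GetAppDetailLines and its error codes are hypothetical, not part of any UDF, and the patterns are the same ones used in the post above.

```autoit
#include <Array.au3>

; Hypothetical helper: returns the non-blank lines found inside every
; <div class="app-details"> block of the given file.
; Sets @error to 1 if the file cannot be read, 2 if no block is found.
Func _GetAppDetailLines($sPath)
    Local $sText = FileRead($sPath)
    If @error Then Return SetError(1, 0, 0)

    Local $aParts = StringRegExp($sText, '(?s)<div class="app-details">(.*?)</div>', 3)
    If @error Then Return SetError(2, 0, 0)

    Local $aResults[0], $aLines
    For $i = 0 To UBound($aParts) - 1
        $aLines = StringRegExp($aParts[$i], '\S\N+', 3)
        ; skip blocks that contain only blank lines
        If Not @error Then _ArrayAdd($aResults, $aLines)
    Next
    Return $aResults
EndFunc   ;==>_GetAppDetailLines

Local $aAll = _GetAppDetailLines("test.html")
If @error Then
    ConsoleWrite("Nothing extracted, @error = " & @error & @CRLF)
Else
    _ArrayDisplay($aAll)
EndIf
```

Checking @error right after FileRead and StringRegExp avoids looping over a value that is not an array when the file is missing or contains no matching block.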
AlienStar (Author), posted November 11, 2017 (edited November 11, 2017 by AlienStar)

13 hours ago, mikell said: "Most usual way : make the first step to get the wanted parts of text into an array then loop through this array"

Thanks so much, that's what I wanted.