Gianni (posted January 17, 2015; edited January 17, 2015 by Chimp)

The little snippet below should:

1) open an HTML page containing 2 tables
2) extract from the HTML only the portions between <table and </table> (inclusive)
3) print the extracted portions on the console

But as you can see from the output generated on the console, the first table extracted doesn't end with the </table> tag; it includes an extra portion of the string that follows the </table> tag.

    #include <IE.au3>
    #include <String.au3>
    #include <Array.au3>
    ;
    ; 1) open an html page containing 2 tables
    Local $oie = _IE_Example("table")
    Do
        Sleep(250)
    Until IsObj($oie)
    Local $sHtml = _IEBodyReadHTML($oie) ; extract whole HTML
    ;
    ; finds how many tables are on the HTML page
    StringReplace($sHtml, "<table", "<table") ; in @extended nr. of occurrences
    Local $iNrOfTables = @extended
    ; ClipPut($sHtml)
    If $iNrOfTables Then ; if at least one table exists
        ; $aTablesPositions array will contain the position of the
        ; starting <table and ending </table> tags within the HTML
        Local $aTablesPositions[$iNrOfTables + 1][2] ; 1 based
        ; 2) extract from the HTML only the portions between <table and </table> (inclusive)
        For $i = 1 To $iNrOfTables
            $aTablesPositions[$i][0] = StringInStr($sHtml, "<table", 0, $i) ; start position of $i occurrence of <table
            $aTablesPositions[$i][1] = StringInStr($sHtml, "</table>", 0, $i) + 7 ; end position of the $i occurrence of </table>
            ; 3) print on console the extracted portion
            ConsoleWrite("Table " & $i & @CRLF & "--------" & @CRLF)
            ConsoleWrite(StringMid($sHtml, $aTablesPositions[$i][0], $aTablesPositions[$i][1]) & @CRLF & "--------" & @CRLF)
        Next
        ; _ArrayDisplay($aTablesPositions)
    Else
        ConsoleWrite("No tables in HTML" & @CRLF)
    EndIf

Here is a reduced portion of the output:

    Table 1
    --------
    <TABLE id=tableOne border=1>
    ........
    <TD>aid</TD>
    <TD>of</TD></TR></TBODY></TABLE><BR>$oTableTwo = _IETableGetObjByName($oIE, "tableTwo")<BR><table border="1" id="tableTwo">
    --------
    Table 2
    --------
    <TABLE id=tableTwo border=1>
    <TBODY>
    ........
    <TD>Ten</TD>
    <TD>Eleven</TD></TR></TBODY></TABLE>
    --------

Where am I wrong? Thanks for the help.

JohnOne (posted January 17, 2015)

Want to fix that code or just use _StringBetween?

    #include <IE.au3>
    #include <String.au3>
    #include <Array.au3>

    Local $oie = _IE_Example("table")
    Do
        Sleep(250)
    Until IsObj($oie)
    Local $sHtml = _IEBodyReadHTML($oie) ; extract whole HTML

    If Not @error Then ; if at least one table exists
        $t = _StringBetween($sHtml, "<table", "</table")
        _ArrayDisplay($t)
    Else
        ConsoleWrite("No tables in HTML" & @CRLF & $sHtml & @CRLF)
    EndIf

TheSaint (posted January 17, 2015; edited January 17, 2015 by TheSaint)

Off the top of my head, I'm thinking it is a @CRLF issue.

On second thought, I just realized you are passing a position as the final StringMid argument, when it needs to be a calculated count, counted from the first element. Subtract first from last, I'm thinking.

Gianni (author, posted January 17, 2015; edited January 17, 2015 by Chimp)

@JohnOne thanks for the simplification, but I need to know the position of the tags within the HTML. In my snippet those data should end up in the $aTablesPositions array. I need them for the next step, which is parsing nested tables.

For example, normal tables like these could be parsed with _StringBetween:

    <table>
    </table>
    <table>
    </table>

but with nested tables _StringBetween fails to parse them correctly **:

    <table>
        <table>
        </table>
    </table>

@TheSaint thanks for the answer, but it is not clear to me what you mean by "count from the first element. Subtract first from last". Anyway, I do not understand why my first snippet fails.

edit: ** or even worse with mixed tables, simple and nested:

    <table>
    </table>
    <table>
        <table>
        </table>
    </table>
    <table>
    </table>
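
The nested case described above can be handled by tracking nesting depth while scanning the tags in document order. The following is only a minimal sketch under stated assumptions, not something posted in the thread: the sample string, the $aStack array and the other variable names are hypothetical, and plain string scanning will be fooled by "<table" appearing inside comments, scripts or attribute values.

```autoit
; Sample input (hypothetical): a simple table, a table containing a nested table, a simple table.
Local $sHtml = "<table id=a>1</table><table id=b><table id=c>2</table>3</table><table id=d>4</table>"

StringReplace($sHtml, "<table", "<table")   ; @extended = number of opening tags
Local $iNrOfTables = @extended
Local $aStack[$iNrOfTables + 1]             ; start positions of the tables that are still open (1 based)
Local $iDepth = 0, $iPos = 1, $iOpen, $iClose, $iEnd

While 1
    ; next opening and next closing tag at or after the current scan position
    $iOpen = StringInStr($sHtml, "<table", 0, 1, $iPos)
    $iClose = StringInStr($sHtml, "</table>", 0, 1, $iPos)
    If $iClose = 0 Then ExitLoop            ; no closing tag left: done

    If $iOpen > 0 And $iOpen < $iClose Then
        ; an opening tag comes first: remember where this table starts
        $iDepth += 1
        $aStack[$iDepth] = $iOpen
        $iPos = $iOpen + 6                  ; continue after "<table"
    Else
        ; a closing tag comes first: it closes the most recently opened table
        If $iDepth = 0 Then ExitLoop        ; unbalanced HTML: stop
        $iEnd = $iClose + 7                 ; position of the final ">" of </table>
        ConsoleWrite("depth " & $iDepth & ", start " & $aStack[$iDepth] & ", end " & $iEnd & ": " & _
                StringMid($sHtml, $aStack[$iDepth], $iEnd - $aStack[$iDepth] + 1) & @CRLF)
        $iDepth -= 1
        $iPos = $iEnd + 1                   ; continue after "</table>"
    EndIf
WEnd
```

Each closing tag is paired with the most recent unclosed opening tag, so inner tables are reported before the table that contains them, and the start/end pair of every table is available for storing in an array such as $aTablesPositions.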

JohnOne (posted January 17, 2015)

Well, the StringInStr function has a parameter for the start position. So your search for the end tag should begin at the position returned for the start tag, to save time.

Then the difference between the start tag and end tag positions is the count parameter you use in StringMid.

Make sense?
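
A minimal sketch of that suggestion, assuming a hypothetical sample string in place of the page HTML. It relies on StringInStr's fifth parameter (start) and on the fact that the returned position is still counted from the beginning of the whole string.

```autoit
; Sample input (hypothetical) standing in for the page HTML.
Local $sHtml = "<table id=one><tr><td>a</td></tr></table><br><table id=two><tr><td>b</td></tr></table>"
Local $iStart = 1, $iEnd, $iTable = 0

While 1
    ; next opening tag, searching from the current position
    $iStart = StringInStr($sHtml, "<table", 0, 1, $iStart)
    If $iStart = 0 Then ExitLoop

    ; first closing tag that follows this opening tag
    $iEnd = StringInStr($sHtml, "</table>", 0, 1, $iStart)
    If $iEnd = 0 Then ExitLoop
    $iEnd += 7                              ; move to the final ">" of </table>

    $iTable += 1
    ; StringMid wants a character count, so convert the end position into a length
    ConsoleWrite("Table " & $iTable & ": " & StringMid($sHtml, $iStart, $iEnd - $iStart + 1) & @CRLF)

    $iStart = $iEnd + 1                     ; resume the search after this table
WEnd
```

This pairs every opening tag with the nearest closing tag that follows it, so it only suits tables that are not nested.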

Gianni (author, posted January 17, 2015, replying to JohnOne's previous post)

Speed doesn't matter here, and this approach would also fail with mixed (normal and nested) tables.

I still do not understand why my first snippet fails...

JohnOne (posted January 17, 2015; marked as the solution)

Here is your StringMid call:

    StringMid($sHtml, $aTablesPositions[$i][0], $aTablesPositions[$i][1])

The start parameter is fine. Count is not; read again what I wrote above: count has to be the difference between both StringInStr results.
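
Applied to the original loop, the fix could look like the sketch below. The sample string is a hypothetical stand-in for the page HTML, and $iStart, $iEnd and $iCount are illustrative names. Because the end position already includes the + 7 offset and therefore points at the final ">" of </table>, the count works out to end minus start plus one.

```autoit
; Sample input (hypothetical) standing in for the page HTML.
Local $sHtml = "<p>intro</p><table id=one><tr><td>a</td></tr></table><br>text<table id=two><tr><td>b</td></tr></table><p>end</p>"

; Count the opening tags: StringReplace reports the number of replacements in @extended.
StringReplace($sHtml, "<table", "<table")
Local $iNrOfTables = @extended

For $i = 1 To $iNrOfTables
    Local $iStart = StringInStr($sHtml, "<table", 0, $i)       ; first character of the $i-th <table
    Local $iEnd = StringInStr($sHtml, "</table>", 0, $i) + 7   ; last character of the $i-th </table>
    ; StringMid takes (string, start, count), so turn the end position into a length
    Local $iCount = $iEnd - $iStart + 1
    ConsoleWrite("Table " & $i & ": " & StringMid($sHtml, $iStart, $iCount) & @CRLF)
Next
```

With a count instead of an end position, each extracted portion stops at its own closing tag, and the trailing text that showed up after the first table in the original output is no longer included.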

Gianni (author, posted January 17, 2015)

Ah, you are right! I was passing an end position to StringMid where it expects a count (what a shame). TheSaint also said so in post #3.

Thanks to both of you.