Trong
Posted September 8, 2021 (edited)

I'm having trouble extracting the content inside an HTML tag and haven't found a solution yet. If you have any ideas, please help. Thank you.

Script:

    #include <String.au3>

    Global $HTML_Test
    $HTML_Test &= '<div class="accordion-item">' & @CRLF ; <!---- START GET-->
    $HTML_Test &= ' <div class="accordion-inner">' & @CRLF
    $HTML_Test &= ' <p>Khoá an toàn giúp bếp luôn được an toàn</p>' & @CRLF
    $HTML_Test &= ' </div>' & @CRLF
    $HTML_Test &= ' <a href="#" class="accordion-title plain">' & @CRLF
    $HTML_Test &= ' <button class="toggle">' & @CRLF
    $HTML_Test &= ' <i class="icon-angle-down"></i>' & @CRLF
    $HTML_Test &= ' </button>' & @CRLF
    $HTML_Test &= ' <span>Khoá an toàn</span>' & @CRLF
    $HTML_Test &= ' </a>' & @CRLF
    $HTML_Test &= '</div>' & @CRLF ;<!---- END GET -->

    Global $aSearch = _StringBetween($HTML_Test, '<div class="accordion-item">', '</div>')
    If IsArray($aSearch) Then
        For $i = 0 To UBound($aSearch) - 1
            ConsoleWrite('!-> SB Return: ' & $aSearch[$i] & @CRLF)
        Next
    Else
        ConsoleWrite('! SB: No strings found. ' & @CRLF)
    EndIf

Unexpected output:

    <div class="accordion-inner">
    <p>Khoá an toàn giúp bếp luôn được an toàn</p>

Input:

    <div class="accordion-item">
    <div class="accordion-inner">
    <p>Khoá an toàn giúp bếp luôn được an toàn</p>
    </div>
    <a href="#" class="accordion-title plain">
    <button class="toggle">
    <i class="icon-angle-down"></i>
    </button>
    <span>Khoá an toàn</span>
    </a>
    </div>

Desired output:

    <div class="accordion-inner">
    <p>Khoá an toàn giúp bếp luôn được an toàn</p>
    </div>
    <a href="#" class="accordion-title plain">
    <button class="toggle">
    <i class="icon-angle-down"></i>
    </button>
    <span>Khoá an toàn</span>
    </a>

Edited September 8, 2021 by VIP

Regards,
Marc
Posted September 8, 2021 (edited)

Hm, the cause seems to be simple: _StringBetween is doing what it should. It returns the text between your <div class="accordion-item"> and the first </div>, without including the search strings themselves. So you'd need to find another end string or solve it with a regex.

In your case, you could use a longer end string:

    #include <String.au3>

    Global $HTML_Test
    $HTML_Test &= '<div class="accordion-item">' & @CRLF ; <!---- START GET-->
    $HTML_Test &= ' <div class="accordion-inner">' & @CRLF
    $HTML_Test &= ' <p>Khoá an toàn giúp bếp luôn được an toàn</p>' & @CRLF
    $HTML_Test &= ' </div>' & @CRLF
    $HTML_Test &= ' <a href="#" class="accordion-title plain">' & @CRLF
    $HTML_Test &= ' <button class="toggle">' & @CRLF
    $HTML_Test &= ' <i class="icon-angle-down"></i>' & @CRLF
    $HTML_Test &= ' </button>' & @CRLF
    $HTML_Test &= ' <span>Khoá an toàn</span>' & @CRLF
    $HTML_Test &= ' </a>' & @CRLF
    $HTML_Test &= '</div>' & @CRLF ;<!---- END GET -->

    Global $aSearch = _StringBetween($HTML_Test, '<div class="accordion-item">', '</a>' & @CRLF & '</div>')
    If IsArray($aSearch) Then
        For $i = 0 To UBound($aSearch) - 1
            ConsoleWrite('!-> SB Return: ' & $aSearch[$i] & @CRLF)
        Next
    Else
        ConsoleWrite('! SB: No strings found. ' & @CRLF)
    EndIf

so you'll get the inner text. Of course, the closing </a> gets lost because it is part of the $sEnd string.
Or with a regex:

    #include <String.au3>

    Global $HTML_Test
    $HTML_Test &= '<div class="accordion-item">' & @CRLF ; <!---- START GET-->
    $HTML_Test &= ' <div class="accordion-inner">' & @CRLF
    $HTML_Test &= ' <p>Khoá an toàn giúp bếp luôn được an toàn</p>' & @CRLF
    $HTML_Test &= ' </div>' & @CRLF
    $HTML_Test &= ' <a href="#" class="accordion-title plain">' & @CRLF
    $HTML_Test &= ' <button class="toggle">' & @CRLF
    $HTML_Test &= ' <i class="icon-angle-down"></i>' & @CRLF
    $HTML_Test &= ' </button>' & @CRLF
    $HTML_Test &= ' <span>Khoá an toàn</span>' & @CRLF
    $HTML_Test &= ' </a>' & @CRLF
    $HTML_Test &= '</div>' & @CRLF ;<!---- END GET -->

    ; Global $aSearch = _StringBetween($HTML_Test, '<div class="accordion-item">', '</a>' & @CRLF & '</div>')
    ; (?s) lets "." match line breaks; flag 3 returns an array of all matches
    Global $aSearch = StringRegExp($HTML_Test, '(?s)<div class="accordion-item">(.*)</div>', 3)
    If IsArray($aSearch) Then
        For $i = 0 To UBound($aSearch) - 1
            ConsoleWrite('!-> SB Return: ' & $aSearch[$i] & @CRLF)
        Next
    Else
        ConsoleWrite('! SB: No strings found. ' & @CRLF)
    EndIf

Best regards,
Marc

Edited September 8, 2021 by Marc

Any of my own codes posted on the forum are free for use by others without any restriction of any kind. (WTFPL)
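A note on the regex above: `(.*)` is greedy, so it runs to the last `</div>` in the input, while the lazy form `(.*?)` stops at the first one, just like _StringBetween. Neither keeps track of nesting. A minimal sketch of the difference, using a made-up sample string that is not from the thread:

```autoit
; Sample input invented for illustration: a nested div followed by a sibling div
Global $sSample = '<div class="a">x<div>y</div>z</div><div class="b">w</div>'

; Greedy: captures up to the LAST </div> in the whole string
Global $aGreedy = StringRegExp($sSample, '(?s)<div class="a">(.*)</div>', 3)
; Lazy: captures up to the FIRST </div> after the opening tag
Global $aLazy = StringRegExp($sSample, '(?s)<div class="a">(.*?)</div>', 3)

ConsoleWrite('Greedy: ' & $aGreedy[0] & @CRLF) ; x<div>y</div>z</div><div class="b">w
ConsoleWrite('Lazy:   ' & $aLazy[0] & @CRLF)   ; x<div>y
```

The greedy pattern happens to work for the single accordion item posted above, but it overshoots as soon as another `</div>` follows the one you want.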
Trong (Author)
Posted September 8, 2021

I can't make the end string any more specific than the closing div tag, so _StringBetween() doesn't help here. And your regex code doesn't work correctly with nested divs either.

Script with regex:

    #include <String.au3>

    Global $HTML_Test
    $HTML_Test &= '<div class="0">[' & @CRLF
    $HTML_Test &= '<code unknown code 1>' & @CRLF
    $HTML_Test &= '<div class="1">[' & @CRLF
    $HTML_Test &= '<code unknown code 2>' & @CRLF
    $HTML_Test &= '<div class="2 3 4">[' & @CRLF
    $HTML_Test &= '<code unknown code 3>' & @CRLF
    $HTML_Test &= '2]</div>' & @CRLF
    $HTML_Test &= '<code unknown code 4>' & @CRLF
    $HTML_Test &= '1]</div>' & @CRLF
    $HTML_Test &= '<code unknown code 5>' & @CRLF
    $HTML_Test &= '0]</div>' & @CRLF

    Global $rSearch = _BetweenString($HTML_Test, '<div class="1">', '</div>')
    ConsoleWrite('! ============================' & @CRLF & $rSearch & @CRLF & '! ============================' & @CRLF)
    Exit

    Func _BetweenString($iString, $iStart, $iEnd)
        Local $aSearch = StringRegExp($iString, '(?s)' & $iStart & '(.*)' & $iEnd, 3)
        If IsArray($aSearch) Then
            For $i = 0 To UBound($aSearch) - 1
                ;ConsoleWrite('!-> SB Return: ' & $aSearch[$i] & @CRLF)
                If ($aSearch[$i] <> "") Then Return $aSearch[$i]
            Next
        Else
            ConsoleWrite('! SB: No strings found. ' & @CRLF)
        EndIf
    EndFunc   ;==>_BetweenString

Regards,
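Since neither a fixed end string nor a plain regex tracks nesting, one possible approach is to count opening and closing div tags until the depth returns to zero. The following is only a sketch: `_DivInner` is a hypothetical helper name (not part of String.au3), the sample string is invented, and the code assumes well-formed markup in which no other tag name begins with `<div`.

```autoit
; Invented sample with three nested divs
Global $sHtml = '<div class="0">a<div class="1">b<div class="2">c</div>d</div>e</div>'

Global $sInner = _DivInner($sHtml, '<div class="1">')
If @error Then
    ConsoleWrite('! No balanced match found.' & @CRLF)
Else
    ConsoleWrite($sInner & @CRLF) ; b<div class="2">c</div>d
EndIf

; Hypothetical helper: returns the inner HTML of the first element opened by
; $sOpenTag, pairing nested <div> ... </div> tags by counting depth.
; Sets @error = 1 if the opening tag or a balanced closing tag is missing.
Func _DivInner($sHtml, $sOpenTag)
    Local $iStart = StringInStr($sHtml, $sOpenTag)
    If $iStart = 0 Then Return SetError(1, 0, "")

    Local $iContent = $iStart + StringLen($sOpenTag) ; first character after the opening tag
    Local $iPos = $iContent
    Local $iDepth = 1 ; we are inside one open div
    Local $iOpen, $iClose

    While $iDepth > 0
        ; Next opening and closing tag at or after the current position
        $iOpen = StringInStr($sHtml, "<div", 0, 1, $iPos)
        $iClose = StringInStr($sHtml, "</div>", 0, 1, $iPos)
        If $iClose = 0 Then Return SetError(1, 0, "") ; unbalanced markup

        If $iOpen > 0 And $iOpen < $iClose Then
            $iDepth += 1 ; a nested div opens first
            $iPos = $iOpen + 4 ; skip past "<div"
        Else
            $iDepth -= 1 ; a div closes first
            $iPos = $iClose + 6 ; skip past "</div>"
        EndIf
    WEnd

    ; $iPos is now just past the matching </div>; exclude that tag from the result
    Return StringMid($sHtml, $iContent, $iPos - 6 - $iContent)
EndFunc   ;==>_DivInner
```

Called on the test string from the post above with '<div class="1">' as the opening tag, this should stop at the `</div>` that follows `1]` rather than at the first or last `</div>`. For real-world pages, a proper HTML parser is likely more robust than string counting.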