Sodori
Posted December 9, 2014

Hi all,

I am really advancing in the art of web scraping, but I still have trouble wherever class attributes are involved... they hate me.

http://proxyipchecker.com/ has a really neat area on its page that shows the most recently checked IPs. I wish to have them. I would love it if anyone could be the sweetest and help me! Here is the interesting bit:

    <div class="innertube">
      <div class="innertube">
        <div class="hovermenu">
          <ul>
            <li><a href="/check-my-proxy-ip.html" title="Check my Proxy IP">Check my Proxy IP</a></li>
            <li><a href="/proxy-headers-checker.html" title="Proxy Headers Checker">Proxy Headers Checker</a></li>
            <li><a href="/proxy-checker-online.html" title="Proxy Checker Online">Proxy Checker Online</a></li>
            <li><a href="/buy-proxies-proxy-buy.html" title="Buy Proxies - Proxy Buy" style="background-color:#5c63ff;color:#ffffff">Buy Proxies - Proxy Buy</a></li>
            <li><a href="/api.html" title="Proxy Checker API - Proxy List API">Proxy Checker API - Proxy List API</a></li>
          </ul>
        </div>
      </div>
      <h2>Latest open proxy servers, fast, checked and alive! Fresh proxies IP address and port continuously updated!</h2>
      <ul class="freshproxies">
        <li class="down">190.74.203.4 : 8080</li>
        <li class="medium lowbw">118.26.142.5 : 80 </li>
        <li class="medium lowbw">111.11.14.174 : 80 </li>
        <li class="down">41.207.116.233 : 3128</li>
        <li class="fast lowbw">195.40.6.43 : 8080 fresh</li>
        <li class="fast lowbw">200.27.79.74 : 8080 open</li>
        <li class="down">190.37.62.240 : 8080</li>
        <li class="veryfast lowbw">77.243.2.171 : 80 up</li>
      </ul>
    </div>

Under "freshproxies", near the bottom, there is a list of recently checked proxies with their IP and port. I would like some simple code that fetches every entry whose class is not "down" into an array. Would anyone mind helping me with this?

The code ought to be so simple that I don't know if I really need to post how I have fared with it.
But I shall, in case it humours you:

    Local $oIE = _IECreate("http://proxyipchecker.com/")
    ;~ Local $fresh = _IEGetObjById($oIE, "rightcolumn")
    $tags = $oIE.document.GetElementsByTagName("li")
    For $tag In $tags
        $class_value = $tag.className("class")
        If $class_value = "freshproxies" Then
            ConsoleWrite($class_value & @LF)
        EndIf
    Next

Thanks again!
computergroove (Solution)
Posted December 10, 2014 (edited)

    #include <array.au3>
    #include <File.au3>
    #include <String.au3>
    #include <IE.au3>

    Local $oIE = _IECreate("http://proxyipchecker.com/")
    WinWait("Online Proxy Checker - IP Checker - Check Proxy - Internet Explorer")
    Local $HTML = _IEDocReadHTML($oIE) ; Gets all HTML
    Local $LeftCount = StringInStr($HTML, '<ul class="freshproxies">') ; find the count of characters that come before the first string you want to find
    Local $temp = StringTrimLeft($HTML, $LeftCount + 25) ; removes all characters before the first IP address
    Local $RightLocation = StringInStr($temp, "</li></ul>") ; position of the end of the IP address section in the HTML
    Local $RawData = StringMid($temp, 1, $RightLocation - 1) ; unedited data block of IP address information
    Local $SplitRaw = StringSplit($RawData, '</li>', 1)
    Local $TempArray[0][3]

    For $i = 1 To UBound($SplitRaw) - 1
        Local $M = StringReplace($SplitRaw[$i], '<li class="', "") ; remove leading text
        Local $N = StringReplace($M, '">', ";") ; remove unwanted characters
        Local $O = StringReplace($N, ":", ";")
        _ArrayAdd($TempArray, $O, 0, ";")
    Next

    _ArrayDisplay($TempArray)

This just needs the description removed and it is ready to use.

Edit - For anyone who wants to chime in on this one: there is a description that becomes part of the string behind the port number, and it sometimes does not show up at all (it's optional when entering the data on the website). I cannot figure out how to trim the description from my array.

Edited December 10, 2014 by computergroove
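One possible way to drop the optional description behind the port number is to let a regular expression capture only the class, the IP and the port digits, so that any trailing text is never stored. The following is a minimal sketch, not the poster's method: it assumes the markup comes back looking like the snippet in the first post (quoted attributes, "IP : port" layout), and it uses a hard-coded sample string in place of the `_IEDocReadHTML($oIE)` result so it can be tried without the site. The names `$sHTML`, `$sPattern`, `$aFlat` and `$aProxies` are illustrative only.

```autoit
#include <Array.au3>

; Sample markup copied from the first post. In a real run this would be
; the string returned by _IEDocReadHTML($oIE) (assumption: same layout).
Local $sHTML = '<ul class="freshproxies"> <li class="down">190.74.203.4 : 8080</li>' & _
        '<li class="medium lowbw">118.26.142.5 : 80 </li>' & _
        '<li class="fast lowbw">195.40.6.43 : 8080 fresh</li>' & _
        '<li class="veryfast lowbw">77.243.2.171 : 80 up</li></ul>'

; Three capture groups per <li>: class attribute, IP address, port digits.
; Text after the port (the optional description) is simply not captured.
Local $sPattern = '(?i)<li class="([^"]*)">\s*(\d{1,3}(?:\.\d{1,3}){3})\s*:\s*(\d+)'

; Flag 3 returns one flat array holding the groups of every match:
; [class1, ip1, port1, class2, ip2, port2, ...]
Local $aFlat = StringRegExp($sHTML, $sPattern, 3)
If @error Then
    ConsoleWrite("No proxy entries matched the pattern" & @CRLF)
    Exit 1
EndIf

; Build a 3-column array (class, IP, port), skipping entries marked "down".
Local $aProxies[0][3]
For $i = 0 To UBound($aFlat) - 1 Step 3
    If $aFlat[$i] = "down" Then ContinueLoop
    _ArrayAdd($aProxies, $aFlat[$i] & "|" & $aFlat[$i + 1] & "|" & $aFlat[$i + 2])
Next

_ArrayDisplay($aProxies)
```

Because the port group matches digits only, the leading space and the "fresh"/"up"/"open" suffixes that end up in the port column of the split-based version should not appear here. If the HTML returned by Internet Explorer differs from the snippet (for example unquoted attributes), the pattern would need adjusting.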
Sodori (Author)
Posted December 10, 2014

Big thanks! There is a bug in the port section, but I think I can manage from here. I dare say a "_IEGetObjectBy($type (class, name, ID etc.), $oObject, $sName, $iIndex [optional])" function would be kind of handy, but I digress. Perhaps it would have been done already if it were possible, who knows.

Again, computergroove, thanks!
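For the class-based lookup wished for above, one option is to walk the `<li>` elements through the DOM and test the `className` property directly. This is only a sketch under assumptions: it relies on the page still being reachable and still using the markup from the first post, and on Internet Explorer being available for IE.au3. Note that `className` is a property rather than a method, and that "freshproxies" is the class of the parent `<ul>`, not of the list items themselves.

```autoit
#include <IE.au3>

; Assumption: the site is reachable and its markup matches the first post.
Local $oIE = _IECreate("http://proxyipchecker.com/")

; Collect every <li> element on the page.
Local $oItems = _IETagNameGetCollection($oIE, "li")

For $oItem In $oItems
    ; Keep only items that sit directly under <ul class="freshproxies">.
    If $oItem.parentNode.className <> "freshproxies" Then ContinueLoop
    ; Skip proxies whose class is exactly "down".
    If $oItem.className = "down" Then ContinueLoop
    ; innerText holds "IP : port" plus the optional description;
    ; flag 3 strips leading and trailing whitespace.
    ConsoleWrite($oItem.className & " -> " & StringStripWS($oItem.innerText, 3) & @CRLF)
Next
```

The text written for each item could then be fed into a string split or regular expression to separate IP and port, in the same spirit as the string-based solution in this thread.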