SupGuvna, posted April 13, 2012:

I have written tools for pulling a range of pages and dumping their source to a text file. Example:

    For $i = $Start To $Finish
        $url = "http://DOMAIN.com/Pageid=" & $i
        $source = _INetGetSource($url)
        FileWrite("FILE.txt", $i & @CRLF)
        FileWrite("FILE.txt", $source & @CRLF)
    Next

Now I want to turn this into a tool that, instead of saving the whole page source, saves ONLY the URL if a certain line of text appears in the source of the page. As it is set up now, it dumps the entire source of each page into a single text file so that I can use Notepad's Find function to locate the pieces of text I want. I suppose it could be considered a type of crawler. But instead of searching for just one thing, I would like it to search for several phrases or lines, and only if that text exists in the page source should it write the page ID number ($i) to the list.

Can anybody help me build something like this? It would help me out a lot. Sorry for the complicated explanation, but I consider this complicated ><
hannes08, posted April 13, 2012:

Hello SupGuvna, you can use the StringInStr() function or similar functions to check whether the string you're searching for is in the source.

Regards, Hannes
SupGuvna, posted April 13, 2012:

Quoting hannes08: "you can use StringInStr() function or similar functions to check whether the string you're searching for is in the source."

I experimented with it quite a bit, but all I can seem to do is get it to dump the URL to the text file along with "10". Any ideas?

    $source = _INetGetSource($url)
    StringInStr($source, "HerroThere", 0, 1, 0, 0)
    Local $result = StringInStr("I am a String", "RING")
    FileWrite("test.txt", $result & @CRLF)

I'm not sure whether this is being used properly or what I am doing wrong. I'm not exactly an expert when it comes to this ><
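A note on where the "10" comes from, offered as a minimal sketch rather than a definitive diagnosis: in the snippet above, the first StringInStr() call discards its return value, and the value written to the file comes from the second call, which searches the literal "I am a String" rather than $source. StringInStr() returns the 1-based position of the match (0 if there is none) and is case-insensitive by default, so "RING" matches "ring" at position 10. The variable names and the stand-in page text below are illustrative only.

```autoit
; StringInStr returns the 1-based position of the first match, or 0 if there is none.
Local $iPos = StringInStr("I am a String", "RING") ; case-insensitive by default
ConsoleWrite($iPos & @CRLF) ; 10

; Using the return value as a condition on the text you actually want to search:
Local $sSource = "some page text with HerroThere in it" ; stand-in for _INetGetSource($url)
If StringInStr($sSource, "HerroThere") Then
    ConsoleWrite("found" & @CRLF)
EndIf
```

Because any non-zero position counts as true in an If statement, the return value can be used directly as a "was it found?" test.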
EndFunc, posted April 13, 2012 (edited April 13, 2012 by EndFunc):

Try this:

    $source = _INetGetSource($url)
    $Str = StringInStr($source, "HerroThere")
    Local $result = StringMid($source, $Str)
    FileWrite("test.txt", $result & @CRLF)
SupGuvna, posted April 14, 2012:

Replying to EndFunc's snippet: this works, but I was hoping it would write the variable used instead of the results themselves. For example, if it finds the line HerroThere in page 5784, then instead of writing the results I want it to write the page it was found in <3 Understand? Still, this is definitely a big step in the right direction.
SupGuvna, posted April 14, 2012:

Here is the closest I can get:

    8336
    8337
    HerroThere (followed by the rest of the page source for some reason)
    8338

That is what I get by going this route:

    FileWrite("test.txt", $i & @CRLF)
    FileWrite("test.txt", $result & @CRLF)

Is it possible to write NOTHING to the text file except the page IDs ($i) of pages that contain the string HerroThere? Sorry for making things complicated x-x
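On the "rest of the page source" in that output: StringMid() called without a count returns everything from the start position to the end of the string, which is why the match is followed by the remainder of the page. The sketch below illustrates this with a made-up stand-in string (not real page source) and hypothetical variable names; passing a count limits the result to the matched phrase.

```autoit
Local $sSource = "<p>HerroThere</p><p>more page text</p>" ; made-up stand-in for a page
Local $sFind = "HerroThere"
Local $iPos = StringInStr($sSource, $sFind) ; position of the match, 0 if absent

; No count: returns the match AND everything after it.
ConsoleWrite(StringMid($sSource, $iPos) & @CRLF)

; With a count: returns only the matched phrase.
ConsoleWrite(StringMid($sSource, $iPos, StringLen($sFind)) & @CRLF)
```

For the goal stated here (page IDs only), the position itself is all that is needed, so StringMid() can be left out entirely.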
SupGuvna, posted April 14, 2012:

This site could use an edit button. But anyway, I have gotten a step closer!

    For $i = $Start To $Finish
        $url = "http://www.Domain.com/pageid/" & $i
        $source = _INetGetSource($url)
        $Str = StringInStr($source, "HerroThere", 0)
        $Main = ($i & " " & $Str)
        FileWrite("test.txt", $Main & @CRLF)
    Next

Now the output is down to this:

    8336 0
    8337 1787
    8338 0
    8339 0
    8340 0

Has anybody got a way to push to the final step? <3 Almost there!
Bowmore, posted April 14, 2012 (edited April 14, 2012 by Bowmore):

This should show you how you might achieve what you want:

    #include <Inet.au3> ; _INetGetSource() is defined in Inet.au3

    Local $sUrl = "http://www.Domain.com/pageid/"
    Local $sFind = "HerroThere"
    Local $sSource = ""
    Local $iFirstPage = 1
    Local $iLastPage = 20

    For $i = $iFirstPage To $iLastPage
        $sSource = _INetGetSource($sUrl & $i)
        ; StringInStr() returns 0 (false) when $sFind is not in the page
        If StringInStr($sSource, $sFind, 0) Then
            FileWriteLine("test.txt", "Found " & $sFind & " on page " & $i & " of " & $sUrl)
        EndIf
    Next
Melba23 (Moderator), posted April 14, 2012:

SupGuvna,

"This site could use an edit button"

Now that you have 5 posts, you should see one at the bottom right.

M23
SupGuvna, posted April 14, 2012 (edited April 14, 2012 by SupGuvna):

Replying to Bowmore's code: unfortunately the code you wrote there always results in an error. I played around with it a bit and it is scanning, but nothing is being written to the file.

Replying to Melba23 ("Now you have 5 posts you should see one at bottom right"): thanks <3

Edit: I messed around with the code and cleaned it up a bit <3 It works just fine now. Thanks for the lovely education, you guys!
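The original question also asked about searching for several phrases and writing only the page ID, which the replies do not show directly. A minimal sketch of one way to do that, assuming the same URL pattern used in the thread: the second and third phrases, the page range, and the variable names are placeholders made up for illustration, and this is not necessarily the version SupGuvna ended up with.

```autoit
#include <Inet.au3> ; provides _INetGetSource()

Local $sUrl = "http://www.Domain.com/pageid/"
; Placeholder phrases: only "HerroThere" appears in the thread.
Local $aFind[3] = ["HerroThere", "SecondPhrase", "ThirdPhrase"]
Local $sSource = ""
Local $iFirstPage = 1 ; placeholder range
Local $iLastPage = 20

For $i = $iFirstPage To $iLastPage
    $sSource = _INetGetSource($sUrl & $i)
    If @error Then ContinueLoop ; skip pages that could not be downloaded

    ; Check each phrase in turn; StringInStr() returns 0 when a phrase is absent.
    For $j = 0 To UBound($aFind) - 1
        If StringInStr($sSource, $aFind[$j]) Then
            FileWriteLine("test.txt", $i) ; write only the page ID
            ExitLoop ; one hit is enough, so each page is listed at most once
        EndIf
    Next
Next
```

The ExitLoop keeps a page from being written more than once when it contains several of the phrases. If you would rather record which phrase matched, write `$i & " " & $aFind[$j]` instead and drop the ExitLoop.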