Kap Posted January 13, 2015

Hello all,

My apologies for the (sort of) double post, but I just realised it probably wasn't the best idea to ask my question in an already answered topic.

That topic helped me a great deal, but the script doesn't do exactly what it is supposed to do (or at least what I want it to do), and being new to arrays I'm not quite sure where I went wrong.

It does create a .csv, and every time I run the script it adds another line, but it doesn't seem to find the info in the HTML/website (all it gives are 0's). So I suspect that the script either doesn't read the site or doesn't find the info I want. I've been racking my brain over it all weekend, but can't find where I went wrong.

My script:

    HotKeySet("{ESC}", "Terminate")
    Opt("WinTextMatchMode", 2) ;1=complete, 2=quick
    Opt("WinTitleMatchMode", 1) ;1=start, 2=subStr, 3=exact, 4=advanced, -1 to -4=Nocase
    AutoItSetOption("MouseCoordMode", 0)
    opt("SendKeyDelay",90)
    opt("WinWaitDelay",35)
    opt("TrayIconDebug",1)

    #include <IE.au3>
    #include <Inet.au3>
    #include <Array.au3>
    #include <String.au3>
    #include <MsgBoxConstants.au3>

    If FileExists("C:\Data\Auto ITs\check\check.csv") =false Then
        FileWrite("C:\Data\Auto ITs\check\check.csv","Actief;Lidstaat;nummer;Tijdstip waarop de aanvraag werd ontvangen;Naam;Adres;Cnummer"& @CRLF)
    EndIf

    $content = _INetGetSource("C:\Data\Auto ITs\check\Test.htm")
    $Status = _StringBetween($content, '<span class="validStyle">', "</span></b></td>")
    $Lidstaat = _StringBetween($content, '<td class="labelStyle">Lidstaat</td> <td>' , '</td>')
    $nr = _StringBetween($content, '<td class="labelStyle">nummer</td> <td>' , '</td>')
    $Tijd = _StringBetween($content, '<td class="labelStyle">Tijdstip waarop de aanvraag werd ontvangen</td> <td>' , '</td>')
    $Naam = _StringBetween($content, '<td class="labelStyle">Naam</td> <td>' , '</td>')
    $Adres= _StringBetween($content, '<td class="labelStyle">Adres</td> <td>' , '</td>')
    $Cnummer = _StringBetween($content, '<td class="labelStyle">Cnummer</td> <td>' , '</td>')

    $aio= $Status&";"&$Lidstaat&";"&$nr&";"&$Tijd&";"&$Naam&";"&$Adres&";"&$Cnummer

    $sString1 = StringReplace($aio, " ", "") ;removing spaces -to format it later to csv
    $sString2 = StringReplace($sString1, "<p>", "") ;removing <p> -useless
    $sString3 = StringReplace($sString2, "<span>Mobil:</span>", "") ;removing <span>Mobil:</span> -useless
    $sString4 = StringReplace($sString3, "</p>", "") ;removing </p> - useless
    $sString5 = StringReplace($sString4, "Â", "") ;removing Â from m²
    $sString6 = StringReplace($sString5, '<spanclass="is24-operator">=</span>', "") ;removing <spanclass="is24-operator">=</span> -useless
    $sString7 = StringReplace($sString6, "EUR", "") ;removing EUR -useless cuz we will format it later in excel
    $sString8 = StringReplace($sString7, "<span>Telefon:</span>","") ;removing <span>Telefon:</span> -useless
    $sStringfinal = StringReplace($sString8, @CRLF, "") ;finally removing @CRLF to get a csv format

    FileWrite ( "check.csv", $sStringfinal & @CRLF )

    Func Terminate()
        Exit 0
    EndFunc

And the test HTML:

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head>
    <title>Test</title>
    </head>
    <body>
    <a id="top-page" name="top-page"></a>
    <div id="layout" class="layout">
    <div id="header">
    <h2>Info</h2>
    <fieldset>
    <table id="vatResponseFormTable">
        <tr>
            <td class="labelLeft" colspan="3"><b><span class="validStyle">Ja, correct</span></b></td>
        </tr>
        <tr>
            <td><br /></td>
        </tr>
        <tr>
            <td class="labelStyle">Lidstaat</td>
            <td>NL</td>
            <td class="errorFormStyle"></td>
        </tr>
        <tr>
            <td class="labelStyle">nummer</td>
            <td>820471616gdwsg01</td>
        </tr>
        <tr>
            <td class="labelStyle">Tijdstip waarop de aanvraag werd ontvangen</td>
            <td>2015/01/12 12:28:03</td>
        </tr>
        <tr>
            <td class="labelStyle">Naam</td>
            <td>T. Est </td>
        </tr>
        <tr>
            <td class="labelStyle">Adres</td>
            <td><br />Straat 00189<br />1234AA Stad<br /> </td>
        </tr>
        <tr>
            <td class="labelStyle">Cnummer</td>
            <td></td>
        </tr>
    </table>
    <br />
    <p><a href="backtest.html">Back</a></p>
    </fieldset>
    </div>
    </div>
    </div>
    </div>
    </div>
    </div>
    </body>
    </html>

If somebody could point out where I went wrong or send me in the right direction, it would be greatly appreciated.

Thanks in advance!
-Kap
SmOke_N (Moderator) Posted January 13, 2015

Have you checked what the data looks like after the $content = _INetGetSource() call? Is it binary, or is it a regular string?
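One way to answer that question is to print the variable's type and the first part of its contents right after the call. This is only a sketch: the path is the one from the question, and the variable names $vContent and $iError are made up for illustration.

```autoit
#include <Inet.au3>

; Path taken from the question; adjust it to your own file or URL.
Local $sSource = "C:\Data\Auto ITs\check\Test.htm"

Local $vContent = _INetGetSource($sSource)
Local $iError = @error ; store @error right away, before any other function call resets it

ConsoleWrite("@error : " & $iError & @CRLF)
ConsoleWrite("Type   : " & VarGetType($vContent) & @CRLF) ; e.g. String or Binary
ConsoleWrite("Binary : " & IsBinary($vContent) & @CRLF)
ConsoleWrite("Length : " & StringLen($vContent) & @CRLF)
ConsoleWrite("Start  : " & StringLeft($vContent, 200) & @CRLF)
```

If the length is 0 or @error is non-zero, nothing was read at all, and every later search on the content has to fail.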
TheSaint Posted January 13, 2015

I haven't finished looking at your code yet, and cannot run it at the moment, but you might like to know there is a string function that removes spaces and characters like CRLF: StringStripWS.
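A short sketch of StringStripWS on a value shaped like the Naam cell from the test HTML. The sample string and variable names are assumptions for illustration; the flag constants come from StringConstants.au3.

```autoit
#include <StringConstants.au3>

; Sample value with leading and trailing whitespace, similar to a cell taken from the HTML.
Local $sRaw = @CRLF & @TAB & "T. Est  " & @CRLF

; Strip leading and trailing whitespace only (flags can be added together).
Local $sTrimmed = StringStripWS($sRaw, $STR_STRIPLEADING + $STR_STRIPTRAILING)

; Strip every whitespace character, including the space inside the value.
Local $sPacked = StringStripWS($sRaw, $STR_STRIPALL)

ConsoleWrite("[" & $sTrimmed & "]" & @CRLF) ; [T. Est]
ConsoleWrite("[" & $sPacked & "]" & @CRLF)  ; [T.Est]
```

For CSV output the first form is usually the safer choice, since it keeps spaces that belong to the value itself.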
draien Posted January 13, 2015 (edited)

Hi Kap,

For me _INetGetSource only works for domains (e.g. www.google.com), not for local HTML documents. For local documents you could use:

    FileRead("C:\Data\Auto ITs\check\Test.htm")

_StringBetween for $Lidstaat doesn't really work, because you are asking it to match across a line break: the label cell and the value cell sit on separate lines, so the search string would have to include the @CRLF and @TAB characters between them. I would recommend trying a completely different method, e.g. finding "<td>" (it's the second one for the NL you want). Or get the line number of "Lidstaat" and assign line+1 to a variable. Strip the last 5 characters (</td>) and the first 8 characters (because of the @TAB, I think), and then you should have your NL.

Also: _StringBetween returns an array (0-based, look in the Help file). So when assembling the contents you should write:

    $aio= $Status[0]&";"&$Lidstaat[0]&";"&$nr[0]&";"&$Tijd[0]&";"&$Naam[0]&";"&$Adres[0]&";"&$Cnummer[0]
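The whitespace problem described above can also be sidestepped with a regular expression that tolerates any line breaks or tabs between the label cell and the value cell. The sketch below is not from the thread: _GetLabelValue is a hypothetical helper, and it assumes the markup looks like the posted test HTML (a td with class "labelStyle" followed by a plain td).

```autoit
; Read the local test page; the path is an assumption, adjust it to your setup.
Local $sHtml = FileRead(@ScriptDir & "\Test.htm")

ConsoleWrite("Lidstaat: " & _GetLabelValue($sHtml, "Lidstaat") & @CRLF)
ConsoleWrite("Adres   : " & _GetLabelValue($sHtml, "Adres") & @CRLF)

; Hypothetical helper: returns the text of the <td> that follows the given label cell.
Func _GetLabelValue($sSource, $sLabel)
    ; \Q...\E treats the label literally, \s* allows line breaks and tabs between the cells,
    ; and (?s) lets . match newlines so multi-line cells such as Adres are captured.
    Local $sPattern = '(?s)<td class="labelStyle">\Q' & $sLabel & '\E</td>\s*<td>(.*?)</td>'
    Local $aMatch = StringRegExp($sSource, $sPattern, 1) ; 1 = return array of captured groups
    If @error Then Return SetError(1, 0, "") ; label not found
    ; Replace tags such as <br /> with a space, then trim and collapse whitespace (1 + 2 + 4).
    Local $sValue = StringRegExpReplace($aMatch[0], "<[^>]*>", " ")
    Return StringStripWS($sValue, 7)
EndFunc
```

Looking values up by label rather than by position keeps working if a row is added to or removed from the table.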
TheSaint Posted January 13, 2015 (edited)

Are you using MsgBox calls to test variable values at every point where they are created?

I don't see any variable declarations (Global, etc.), but they might just be missing from the example.

I don't see the required FileOpen preceding FileWrite. You could use _FileWriteLine instead. If you just want a pre-created blank file, use _FileCreate.

I personally prefer to use If Not FileExists rather than your comparison with false.

Some of the commands I mention are in the UDF section of the Help file.
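A minimal sketch of the file-handling points, written with If Not FileExists. It relies on the behaviour described in the AutoIt help file that FileWrite also accepts a filename instead of a handle, in which case the file is opened, appended to and closed within the one call, so an explicit FileOpen is optional for occasional writes. The variable names and the sample row are assumptions for illustration.

```autoit
Local $sCsv = @ScriptDir & "\check.csv"
Local $sHeader = "Actief;Lidstaat;nummer;Tijdstip waarop de aanvraag werd ontvangen;Naam;Adres;Cnummer"

; Write the header once, when the file does not exist yet.
If Not FileExists($sCsv) Then
    If Not FileWrite($sCsv, $sHeader & @CRLF) Then
        MsgBox(16, "Error", "Could not create " & $sCsv)
        Exit 1
    EndIf
EndIf

; Illustrative row only; in the real script this would be the assembled $aio string.
Local $sRow = "Ja, correct;NL;820471616gdwsg01"

; Passing a filename appends the text to the end of the file.
FileWrite($sCsv, $sRow & @CRLF)
```

Using the same full path for the header and for the rows avoids ending up with two different check.csv files when the working directory is not the script directory.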
TheSaint Posted January 13, 2015

@draien also seems to know what he/she is talking about and makes some good points, some of which I haven't tried or have no experience with.
draien Posted January 13, 2015 (edited)

Even though this is horribly scripted (I tried to fit in different ways to handle this), it works for me. Edit the $content = _GetSource(...) line to suit your setup. If you want to get the page via _INetGetSource, you have to call the function with $content = _GetSource("http://yourdomainhere.com",2)

    HotKeySet("{ESC}", "Terminate")
    Opt("WinTextMatchMode", 2) ;1=complete, 2=quick
    Opt("WinTitleMatchMode", 1) ;1=start, 2=subStr, 3=exact, 4=advanced, -1 to -4=Nocase
    AutoItSetOption("MouseCoordMode", 0)
    opt("SendKeyDelay",90)
    opt("WinWaitDelay",35)
    opt("TrayIconDebug",1)

    #include <IE.au3>
    #include <Inet.au3>
    #include <Array.au3>
    #include <String.au3>
    #include <MsgBoxConstants.au3>

    If FileExists(@ScriptDir & "\check.csv") =false Then
        FileWrite(@ScriptDir & "\check.csv","Actief;Lidstaat;nummer;Tijdstip waarop de aanvraag werd ontvangen;Naam;Adres;Cnummer"& @CRLF)
    EndIf

    $content = _GetSource(@ScriptDir & "\Test.htm")

    $start_Status = '<td class="labelLeft" colspan="3"><b><span class="validStyle">'
    $end_Status = '</span></b></td>'
    $Status = _StringBetween($content,$start_Status,$end_Status)

    ; The number is the position of the value's <td> among all <td> tags on the page
    $Lidstaat = _GrabValue($content,4)
    $nr = _GrabValue($content,7)
    $Tijd = _GrabValue($content,9)
    $Naam = _GrabValue($content,11)
    $Naam = StringReplace($Naam,@CRLF,"")
    $Adres = _GrabValue($content,13)
    $Adres = StringReplace($Adres,"<br />","")
    $Adres = StringReplace($Adres,@CRLF,"")
    $Cnummer = _GrabValue($content,15)

    $aio= $Status[0]&";"&$Lidstaat&";"&$nr&";"&$Tijd&";"&$Naam&";"&$Adres&";"&$Cnummer
    MsgBox(0,"",$aio)
    ; Same full path as the header above, so both always end up in one file
    FileWrite ( @ScriptDir & "\check.csv", $aio & @CRLF )

    ; Returns the text between the n-th "<td>" and the n-th "</td>"
    Func _GrabValue($sContent,$iOccurence)
        Local $start
        Local $end
        $start = StringInStr($sContent,"<td",0,$iOccurence)
        $start += 4 ; skip past "<td>"
        $end = StringInStr($sContent,"</td>",0,$iOccurence)
        Return StringMid($sContent,$start,$end - $start)
    EndFunc

    ; $iMode 1 = read a local file, 2 = fetch a URL
    Func _GetSource($sHandle,$iMode=1)
        Switch $iMode
            Case 1
                Return FileRead($sHandle)
            Case 2
                Return _INetGetSource($sHandle)
            Case Else
                Return 0
        EndSwitch
    EndFunc

    Func Terminate()
        Exit 0
    EndFunc
Kap Posted January 13, 2015 (Author)

Thanks draien! (and TheSaint too)

I'll go through your script to see what it does exactly.

One thing though, it doesn't work for me :/ I get an error I've already seen a lot during my tinkering with this (it might be that that's part of my problem):

    "C:\Data\Auto ITs\BTW check\test2.au3" (34) : ==> Subscript used on non-accessible variable.:
    $aio= $Status[0]&";"&$Lidstaat&";"&$nr&";"&$Tijd&";"&$Naam&";"&$Adres&";"&$Cnummer
    $aio= $Status^ ERROR

For some reason I get this error when I use [0] after my variables. Since the code works for you, could it be that I have a wrong library or am missing something? I use v3.3.12.0.
SmOke_N (Moderator) Posted January 13, 2015 (edited)

Did you run the ConsoleWrite after _INetGetSource() like I suggested? Try this:

    $content = _INetGetSource("C:\Data\Auto ITs\check\Test.htm", True)

Your data is coming out in binary form, and you're searching for strings.
jdelaney Posted January 13, 2015

It would be much easier to load the HTML into a DOM object and traverse that than to use regexp. It's very difficult to cover every possible HTML scenario via a regexp.
draien Posted January 13, 2015

Quote (Kap): "Subscript used on non-accessible variable.: $aio= $Status[0]&";"&$Lidstaat&... For some reason I get this error when I use [0] after my variables."

This error indicates that $Status is not an array. So let's debug here for a second:

_StringBetween returns an array if something is found.
The variable is not an array.

The question you have to ask yourself here is: did _StringBetween find something? (No.)

Possible cause: the path/URL to the file is wrong. Did you edit this line?

    $content = _GetSource(@ScriptDir & "\Test.htm")

You have to change @ScriptDir & "\Test.htm" to the path of your .htm file (or copy the .htm into your script directory). I think it would be this:

    $content = _INetGetSource("C:\Data\Auto ITs\check\Test.htm")
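The debugging steps above can be turned into a guard that runs before any [0] is used. This is a sketch under assumptions: the file is read with FileRead from the script directory, and the names $sContent and $aStatus are illustrative.

```autoit
#include <String.au3>

Local $sContent = FileRead(@ScriptDir & "\Test.htm")
If @error Then
    ConsoleWrite("Could not read the file, check the path." & @CRLF)
    Exit 1
EndIf

Local $aStatus = _StringBetween($sContent, '<span class="validStyle">', '</span>')
; On failure _StringBetween sets @error and does not return an array,
; so indexing the result with [0] would raise the "Subscript used on non-accessible variable" error.
If @error Or Not IsArray($aStatus) Then
    ConsoleWrite("No match. First 300 characters of the content:" & @CRLF & StringLeft($sContent, 300) & @CRLF)
    Exit 1
EndIf

ConsoleWrite("Status: " & $aStatus[0] & @CRLF)
```

Printing the start of the content in the failure branch shows at a glance whether the problem is an empty or unexpected source rather than the search strings.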
Kap Posted January 13, 2015 (Author)

@SmOke_N Yup, I've tried the ConsoleWrite; it gives me the full HTML code of the page (I even tried it in my full script, so with the real website, not the test HTML I posted above).

I also put ConsoleWrites here:

    Local $start_Status = '<td class="labelLeft" colspan="3"><b><span class="validStyle">'
    consolewrite($start_Status & @crlf)
    Local $end_Status = '</span></b></td>'
    consolewrite( _StringBetween($content,$start_Status,$end_Status) & @crlf)
    Local $Status = _StringBetween($content,$start_Status,$end_Status)
    consolewrite($Status & @crlf)

which gave these results:

    <td class="labelLeft" colspan="3"><b><span class="validStyle">
    0
    0
    "C:\Data\Auto ITs\BTW check\test.au3" (60) : ==> Subscript used on non-accessible variable.:
    Local $aio= $Status[0]&";"&$Lidstaat&";"&$nr&";"&$Tijd&";"&$Naam&";"&$Adres&";"&$Cnummer
    Local $aio= $Status^ ERROR

So _StringBetween doesn't find anything (0). (@draien: I also tried it with the test script and HTML, same results, and yup, I did change the @ScriptDir.)

But now that I think of it, I also got this error earlier, when I started looking:

    $aio= $Status[0]&";"&$Lidstaat&";"&$nr&";"&$Tijd&";"&$Naam&";"&$Adres&";"&$Cnummer
    $aio= $Status^ ERROR

The first thing I did then was check my AutoIt version (it was v3.3.08 or something) and update it. That didn't help, apparently (at least not with the error).

I think I'd better start again and see if I can make that work, to get the hang of it and to make sure that I'm not overlooking some simple thing. (I kind of have the feeling it's something really small that I'm overlooking, because I tried so many options that it turned a bit chaotic.)

(At the moment I've made a quick and really dirty solution to get the data I need with coordinates, WinWaits etc., so I'm getting the data I need.) But I now know it should also be possible this way.

Thanks all for all your help, tips and input!
Valuater Posted January 13, 2015

maybe try..

    _ArrayDisplay($Status)

or

    For $i = 0 To UBound($Status) - 1
        ConsoleWrite($Status[$i] & @CRLF)
    Next

8)
SmOke_N (Moderator) Posted January 13, 2015 (Solution)

I'm fairly sure your "string" data is not what you think it is. Run this example; if this works, then you need to change the approach to the string vs binary data that I've suggested already:

    #include <Array.au3>
    #include <String.au3>

    ; stringbetween returns an array
    ; checking it as a string return is not going to help you

    Local $sHTMLString = "<html>" & @CRLF
    $sHTMLString &= "<body>" & @CRLF
    $sHTMLString &= '<table id="mytable">' & @CRLF
    $sHTMLString &= "<tr>" & @CRLF
    $sHTMLString &= "<td>noting much</td>" & @CRLF
    $sHTMLString &= '<td class="labelLeft" colspan="3"><b><span class="validStyle">somedata here</span></b></td>' & @CRLF
    $sHTMLString &= "</tr>" & @CRLF
    $sHTMLString &= "</table>" & @CRLF
    $sHTMLString &= "</body>" & @CRLF
    $sHTMLString &= "</html>"

    Local $content = $sHTMLString
    Local $start_Status = '<td class="labelLeft" colspan="3"><b><span class="validStyle">'
    consolewrite($start_Status & @crlf)
    Local $end_Status = '</span></b></td>'
    Local $Status = _StringBetween($content,$start_Status,$end_Status)
    _ArrayDisplay($Status)

... Good luck
Kap Posted January 13, 2015 (Author)

@SmOke_N Bingo, the example works. I get an array display pop-up and no errors, and if I try your example script on my local Test.htm I get 0.

So I'm going to check the string vs binary route! Thanks!!
Kap Posted January 14, 2015 (Author)

Huzzah! I'm getting data and it isn't 0.

    Global $Dcontent = FileRead("C:\Data\Auto ITs\check\Test.htm")
    Global $content = BinaryToString($Dcontent, 4)

BinaryToString did the trick. Thanks SmOke_N, draien and everybody else for the help, tips and pointers!! I've learned a lot.
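For reference, a slightly more explicit sketch of the same idea: read the raw bytes of the file and decode them into a string before searching. It assumes the page is saved as UTF-8 (flag 4 of BinaryToString, as in the post above); the path and variable names are illustrative.

```autoit
#include <FileConstants.au3>
#include <String.au3>

; Open the file in binary mode so that FileRead returns the raw bytes.
Local $hFile = FileOpen(@ScriptDir & "\Test.htm", $FO_READ + $FO_BINARY)
If $hFile = -1 Then
    ConsoleWrite("Could not open the file." & @CRLF)
    Exit 1
EndIf
Local $dContent = FileRead($hFile)
FileClose($hFile)

; Decode the bytes as UTF-8 (flag 4). This is an assumption about the file's encoding.
Local $sContent = BinaryToString($dContent, 4)

Local $aStatus = _StringBetween($sContent, '<span class="validStyle">', '</span>')
If IsArray($aStatus) Then ConsoleWrite("Status: " & $aStatus[0] & @CRLF)
```

If the file uses another encoding, the flag has to change accordingly (1 for ANSI, 2 and 3 for UTF-16 little and big endian).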