Champak Posted January 4, 2017

Can someone make this better please? I need to speed it up and be a little more efficient.

    #include <Array.au3>
    #include <IE.au3>

    $IE = _IECreate("", 0, 1)
    _IENavigate($IE, "http://www.newyorkgasprices.com/GasPriceSearch.aspx?typ=adv&fuel=A&srch=1&state=TX&area=Dallas+-+North&area=Dallas+-+NW&area=Dallas+-+SE&area=Dallas+-+South&area=Dallas+-+SW&area=Dallas+-+West&site=NewYork,Dallas&station=Chevron&tme_limit=24")

    $44 = _IETableGetCollection($IE)

    For $33 In $44
        Local $i = 0
        Local $k = 3
        ConsoleWrite("! " & $33.classname & @CRLF)
        If $33.classname = "p_v2" Then
            $TROb = _IETagNameGetCollection($33, "TR")
            $TR_Count = @extended
            Global $GasArray[$TR_Count][4]
            For $TRin In $TROb
                $TDOb = _IETagNameGetCollection($TRin, "TD")
                Local $j = 0
                For $TDin In $TDOb
                    If StringInStr($TDIn.innertext, "hour") Then ContinueLoop
                    If $TDIn.innertext = "" Then ContinueLoop
                    If $j = 0 Then
                        $SplitMe = StringInStr($TDIn.innertext, " ", "", 1)
                        $GasArray[$i][0] = StringLeft($TDIn.innertext, $SplitMe) ;==Station Name
                        $GasArray[$i][1] = StringTrimLeft($TDIn.innertext, $SplitMe) ;==Station Address
                    EndIf
                    $GasArray[$i][2] = $TDIn.innertext ;==County
                    $Price = _IETagNameGetCollection($33, "TH", $k)
                    $GasArray[$i][3] = StringReplace($Price.innertext, "Update", "") ;==Price
                    ;ConsoleWrite("--=================================================" & @CRLF)
                    $j += 1
                Next
                $i += 1
                $k += 1
            Next
        EndIf
    Next
    _ArrayDisplay($GasArray)
    _IEQuit($IE)
    Exit
caramen Posted January 4, 2017

@Champak The forum provides help, ideas, or links to better scripts, but we won't write the script for you, bro.
Nikolas92 Posted January 4, 2017 (edited)

4 hours ago, caramen said: Forum provide help ideas

...in rare cases (if the script is not complicated)...
JLogan3o13 (Moderator) Posted January 4, 2017 (edited)

5 hours ago, caramen said: @Champak Forum provide help ideas or links to better scripts but we wont write the script for you bro.

Where would you get that he is asking someone to write a script for him?! @Champak has been a contributor to the forum for some time, and is asking precisely that: for help and ideas to better his (very obviously already written) script. Please refrain from stepping into a post if you have nothing of value to add.
Champak (Author) Posted January 4, 2017 (edited)

Nowhere did I say write my code. I have my code. If you don't want to help, pass on. I said "I" need to make it better. Whether you're helpful and provide the whole code, a portion, an example, or tell me what is slowing it down is fine, but don't try to play me and tell me what the forum is for.

Edit: I see you addressed it at the same time, JLogan3o13.
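On the question of what is slowing the script down: one way to find out is to time the page load and the DOM walk separately. The following is only a minimal sketch, not taken from any post in this thread. It assumes the standard IE.au3 UDF, and the URL is a placeholder (`about:blank`) to be replaced with the search URL from the first post.

```autoit
#include <IE.au3>

; Placeholder (assumption): substitute the search URL from the first post.
Local $sUrl = "about:blank"

; Phase 1: create the browser and load the page (_IECreate waits for the load by default).
Local $hTimer = TimerInit()
Local $oIE = _IECreate($sUrl, 0, 0) ; 0 = do not attach, 0 = invisible window
ConsoleWrite("Page load : " & Round(TimerDiff($hTimer)) & " ms" & @CRLF)

; Phase 2: walk the tables, as the original script does.
$hTimer = TimerInit()
Local $oTables = _IETableGetCollection($oIE)
Local $iTables = @extended ; number of tables found
For $oTable In $oTables
    ; the per-table work of the real script would go here
Next
ConsoleWrite("DOM walk  : " & Round(TimerDiff($hTimer)) & " ms (" & $iTables & " tables)" & @CRLF)

_IEQuit($oIE)
```

If the first number dominates, the browser itself is the bottleneck. If the second one does, the likely cost is the repeated COM property reads such as `.innertext`; reading each value once into a local variable and reusing it may help.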
mLipok Posted January 4, 2017

Use querySelectorAll.
caramen Posted January 4, 2017 (edited)

13 hours ago, Champak said: Can someone make this better please? I need to speed it up and be a little more efficient.

@JLogan3o13 Yeah, I am not a moderator. That is something I know already (it's my second warning about a similar note). But if we read it correctly, "Can someone make this better" is not a translation error, even with my bad English. No one is going to make a script better for someone else, are we agreed about that? Well, I don't want to pollute the topic.

@Champak Apologies, then.

Edit (Google translation): My goal is to get things done. Nothing about rage.
JLogan3o13 (Moderator) Posted January 4, 2017

So if it is your second warning, perhaps engage your brain before your typing fingers in the future.
mLipok Posted January 4, 2017 (edited)

@Champak I will post a few examples here; keep watching (I will say when I am done).

STEP 1: SCOPING + manual TIDY

    #include <Array.au3>
    #include <IE.au3>

    Global $IE = _IECreate("", 0, 1)
    _IENavigate($IE, "http://www.newyorkgasprices.com/GasPriceSearch.aspx?typ=adv&fuel=A&srch=1&state=TX&area=Dallas+-+North&area=Dallas+-+NW&area=Dallas+-+SE&area=Dallas+-+South&area=Dallas+-+SW&area=Dallas+-+West&site=NewYork,Dallas&station=Chevron&tme_limit=24")

    Global $GasArray[0][4] ; sized inside _Example() once the row count is known

    _Example()

    _IEQuit($IE)
    Exit

    Func _Example()
        Local $44 = _IETableGetCollection($IE)
        Local $i = 0
        Local $k = 3
        Local $j = 0
        Local $TROb, $TR_Count, $TDOb, $SplitMe, $Price
        For $33 In $44
            $i = 0
            $k = 3
            ConsoleWrite("! " & $33.classname & @CRLF)
            If $33.classname = "p_v2" Then
                $TROb = _IETagNameGetCollection($33, "TR")
                $TR_Count = @extended
                ReDim $GasArray[$TR_Count][4]
                For $TRin In $TROb
                    $TDOb = _IETagNameGetCollection($TRin, "TD")
                    $j = 0
                    For $TDin In $TDOb
                        If StringInStr($TDIn.innertext, "hour") Then ContinueLoop
                        If $TDIn.innertext = "" Then ContinueLoop
                        If $j = 0 Then
                            $SplitMe = StringInStr($TDIn.innertext, " ", "", 1)
                            $GasArray[$i][0] = StringLeft($TDIn.innertext, $SplitMe) ;==Station Name
                            $GasArray[$i][1] = StringTrimLeft($TDIn.innertext, $SplitMe) ;==Station Address
                        EndIf
                        $GasArray[$i][2] = $TDIn.innertext ;==County
                        $Price = _IETagNameGetCollection($33, "TH", $k)
                        $GasArray[$i][3] = StringReplace($Price.innertext, "Update", "") ;==Price
                        ;ConsoleWrite("--=================================================" & @CRLF)
                        $j += 1
                    Next
                    $i += 1
                    $k += 1
                Next
            EndIf
        Next
        _ArrayDisplay($GasArray)
    EndFunc   ;==>_Example

STEP 2: SCOPING + VARIABLE RENAMING + SMALL REFACTORING

    #include <Array.au3>
    #include <IE.au3>

    Global Enum _
            $ARR_GAS_STATIONNAME, _
            $ARR_GAS_STATIONADRESS, _
            $ARR_GAS_COUNTY, _
            $ARR_GAS_PRICE, _
            $ARR_GAS_MAXBOUND

    Global $oIE = _IECreate("", 0, 1)
    _IENavigate($oIE, "http://www.newyorkgasprices.com/GasPriceSearch.aspx?typ=adv&fuel=A&srch=1&state=TX&area=Dallas+-+North&area=Dallas+-+NW&area=Dallas+-+SE&area=Dallas+-+South&area=Dallas+-+SW&area=Dallas+-+West&site=NewYork,Dallas&station=Chevron&tme_limit=24")

    Global $aGas[0][0] ; presetting - create 2D array

    _Example()

    _IEQuit($oIE)
    Exit

    Func _Example()
        Local $iGasRow_idx = 0, $iTH_idx = 3, $iTD_idx = 0, $iPosition_SplitMe = 0, $iRows = 0
        Local $oTR_coll = Null, $oTD_coll = Null, $oPrice = Null

        Local $oTables_coll = _IETableGetCollection($oIE)
        For $oTable_enum In $oTables_coll
            $iTH_idx = 3
            ConsoleWrite("! " & $oTable_enum.classname & @CRLF)
            If $oTable_enum.classname = "p_v2" Then
                $oTR_coll = _IETagNameGetCollection($oTable_enum, "TR")
                $iRows = @extended ; keep the row count before anything can overwrite @extended
                ReDim $aGas[0][0] ; clean up old array values
                ReDim $aGas[$iRows][$ARR_GAS_MAXBOUND] ; set proper size
                $iGasRow_idx = 0 ; reset once per table, not once per row
                For $oTR_enum In $oTR_coll
                    $iTD_idx = 0
                    $oTD_coll = _IETagNameGetCollection($oTR_enum, "TD")
                    For $oTD_enum In $oTD_coll
                        If StringInStr($oTD_enum.innertext, "hour") Then ContinueLoop
                        If $oTD_enum.innertext = "" Then ContinueLoop
                        If $iTD_idx = 0 Then
                            $iPosition_SplitMe = StringInStr($oTD_enum.innertext, " ", "", 1)
                            $aGas[$iGasRow_idx][$ARR_GAS_STATIONNAME] = StringLeft($oTD_enum.innertext, $iPosition_SplitMe) ;==Station Name
                            $aGas[$iGasRow_idx][$ARR_GAS_STATIONADRESS] = StringTrimLeft($oTD_enum.innertext, $iPosition_SplitMe) ;==Station Address
                        EndIf
                        $aGas[$iGasRow_idx][$ARR_GAS_COUNTY] = $oTD_enum.innertext ;==County
                        $oPrice = _IETagNameGetCollection($oTable_enum, "TH", $iTH_idx)
                        $aGas[$iGasRow_idx][$ARR_GAS_PRICE] = StringReplace($oPrice.innertext, "Update", "") ;==Price
                        ;ConsoleWrite("--=================================================" & @CRLF)
                        $iTD_idx += 1
                    Next
                    $iGasRow_idx += 1
                    $iTH_idx += 1
                Next
            EndIf
        Next
        _ArrayDisplay($aGas)
    EndFunc   ;==>_Example

STEP 3: USING _IEQuerySelectorAll()

    #include <Array.au3>
    #include <IE.au3>
    ; _IEQuerySelectorAll() is not part of IE.au3. It is a forum UDF (by uncommon) and must be included separately.

    Global Enum _
            $ARR_GAS_STATIONNAME, _
            $ARR_GAS_STATIONADRESS, _
            $ARR_GAS_COUNTY, _
            $ARR_GAS_PRICE, _
            $ARR_GAS_MAXBOUND

    Global $oIE = _IECreate("", 0, 1)
    _IENavigate($oIE, "http://www.newyorkgasprices.com/GasPriceSearch.aspx?typ=adv&fuel=A&srch=1&state=TX&area=Dallas+-+North&area=Dallas+-+NW&area=Dallas+-+SE&area=Dallas+-+South&area=Dallas+-+SW&area=Dallas+-+West&site=NewYork,Dallas&station=Chevron&tme_limit=24")

    Global $aGas[0][0] ; presetting - create 2D array

    _Example()

    _IEQuit($oIE)
    Exit

    Func _Example()
        Local $iGasRow_idx = 0, $iTH_idx = 3, $iTD_idx = 0, $iPosition_SplitMe = 0
        Local $oTable = Null, $oTR_coll = Null, $oTD_coll = Null, $oPrice = Null

        ; The table object is still needed for the TH (price) lookup and for sizing the array.
        For $oTable_enum In _IETableGetCollection($oIE)
            If $oTable_enum.classname = "p_v2" Then
                $oTable = $oTable_enum
                ExitLoop
            EndIf
        Next
        If Not IsObj($oTable) Then Return SetError(1, 0, 0)

        ; The rows are selected directly; this replaces the _IETagNameGetCollection($oTable_enum, "TR") call from STEP 2.
        $oTR_coll = _IEQuerySelectorAll($oIE, 'table.p_v2 tr')
        ReDim $aGas[$oTable.rows.length][$ARR_GAS_MAXBOUND] ; set proper size

        $iGasRow_idx = 0 ; reset once, before the For...In loop (not inside it)
        For $oTR_enum In $oTR_coll
            $iTH_idx = $iGasRow_idx + 3
            $iTD_idx = 0
            $oTD_coll = _IETagNameGetCollection($oTR_enum, "TD")
            For $oTD_enum In $oTD_coll
                If StringInStr($oTD_enum.innertext, "hour") Then ContinueLoop
                If $oTD_enum.innertext = "" Then ContinueLoop
                If $iTD_idx = 0 Then
                    $iPosition_SplitMe = StringInStr($oTD_enum.innertext, " ", "", 1)
                    $aGas[$iGasRow_idx][$ARR_GAS_STATIONNAME] = StringLeft($oTD_enum.innertext, $iPosition_SplitMe) ;==Station Name
                    $aGas[$iGasRow_idx][$ARR_GAS_STATIONADRESS] = StringTrimLeft($oTD_enum.innertext, $iPosition_SplitMe) ;==Station Address
                EndIf
                $aGas[$iGasRow_idx][$ARR_GAS_COUNTY] = $oTD_enum.innertext ;==County
                $oPrice = _IETagNameGetCollection($oTable, "TH", $iTH_idx)
                $aGas[$iGasRow_idx][$ARR_GAS_PRICE] = StringReplace($oPrice.innertext, "Update", "") ;==Price
                ;ConsoleWrite("--=================================================" & @CRLF)
                $iTD_idx += 1
            Next
            $iGasRow_idx += 1
        Next
        _ArrayDisplay($aGas)
    EndFunc   ;==>_Example

Selector syntax reference for _IEQuerySelectorAll: https://www.w3.org/TR/2012/WD-selectors4-20120823/

DONE.

Remark: not tested. Check it yourself; you should now see the concept.
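The `Global Enum` trick from STEP 2 (named column indexes, with the last member doubling as the column count) can be tried without a browser. This is a small sketch under assumptions: the `$COL_*` names, the `_AddStation()` helper, and the sample rows are all hypothetical and made up for illustration; none of them come from the posts above.

```autoit
; Column indexes; the last member doubles as the number of columns.
Global Enum $COL_NAME, $COL_ADDRESS, $COL_PRICE, $COL_COUNT

Local $aStations[0][$COL_COUNT] ; start empty, grow one row at a time

; Hypothetical sample rows, made up for illustration only.
_AddStation($aStations, "Station A", "1 Example St", "2.19")
_AddStation($aStations, "Station B", "2 Sample Ave", "2.25")

For $i = 0 To UBound($aStations, 1) - 1
    ConsoleWrite($aStations[$i][$COL_NAME] & " | " & $aStations[$i][$COL_ADDRESS] & " | " & $aStations[$i][$COL_PRICE] & @CRLF)
Next

; Hypothetical helper: appends one row and fills it by column name instead of by magic number.
Func _AddStation(ByRef $aData, $sName, $sAddress, $sPrice)
    Local $iRow = UBound($aData, 1)
    ReDim $aData[$iRow + 1][$COL_COUNT]
    $aData[$iRow][$COL_NAME] = $sName
    $aData[$iRow][$COL_ADDRESS] = $sAddress
    $aData[$iRow][$COL_PRICE] = $sPrice
EndFunc   ;==>_AddStation
```

Growing the array with `ReDim` on every row is convenient but copies the array each time. When the row count is known up front, as it is from `@extended` in the steps above, sizing the array once is the cheaper choice.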
Gianni Posted January 4, 2017

Another approach:

Since loading that web page into the browser seems to take a good part of the total time needed to accomplish your task, I would suggest downloading only the HTML code of the page, so as to avoid the time the browser spends rendering images and advertisements, and then taking the data from the table. What you need is a table extractor that works on raw HTML code, and you can find it here:

Here is an example (with the above UDF already embedded at the bottom of the listing); it seems quite a bit faster than your way. Also, in this example the extracted table is as it appears on the web page; it is up to you to "clean" and tidy it and take only the wanted columns from the returned array.

    ; #include <_HtmlTable2Array.au3> ; <--- udf already included (hard coded) at bottom of this demo
    #include <array.au3> ; for _ArrayDisplay() here and _ArraySort() in the UDF below

    Local $sPageSource = BinaryToString(InetRead("http://www.newyorkgasprices.com/GasPriceSearch.aspx?typ=adv&fuel=A&srch=1&state=TX&area=Dallas+-+North&area=Dallas+-+NW&area=Dallas+-+SE&area=Dallas+-+South&area=Dallas+-+SW&area=Dallas+-+West&site=NewYork,Dallas&station=Chevron&tme_limit=24", 1))

    Local $aTable = _HtmlTableGetWriteToArray($sPageSource, 4, False, 23)
    _ArrayDisplay($aTable)

    ; ------------------------------------------------------------------------
    ; Following code should be included by the #include <_HtmlTable2Array.au3>
    ; hard coded here for easy load and run, just to try this example
    ; ------------------------------------------------------------------------
    ; #include-once
    ; #include <array.au3>
    ;
    ; #FUNCTION# ====================================================================================================================
    ; Name ..........: _HtmlTableGetList
    ; Description ...: Finds and enumerates all the html tables contained in an html listing (even if nested).
    ;                  if the optional parameter $i_index is passed, then only that table is returned
    ; Syntax ........: _HtmlTableGetList($sHtml[, $i_index = -1])
    ; Parameters ....: $sHtml   - A string value containing an html page listing
    ;                  $i_index - [optional] An integer value indicating the number of the table to be returned (1 based)
    ;                             with the default value of -1 an array with all found tables is returned
    ; Return values .: Success: returns a 1D 1-based array containing all or a single html table found in the html.
    ;                  element [0] (and @extended as well) contains the number of tables found (or 0 if no tables are returned)
    ;                  if an error occurs then an empty string is returned and @error is set as follows:
    ;                  @error: 1 - no tables are present in the passed HTML
    ;                          2 - error while parsing tables, (opening and closing tags are not balanced)
    ;                          3 - error while parsing tables, (open/close mismatch error)
    ;                          4 - invalid table index request (requested table nr. is out of boundaries)
    ; ===============================================================================================================================
    Func _HtmlTableGetList($sHtml, $i_index = -1)
        Local $aTables = _ParseTags($sHtml, "<table", "</table>")
        If @error Then
            Return SetError(@error, 0, "")
        ElseIf $i_index = -1 Then
            Return SetError(0, $aTables[0], $aTables)
        Else
            If $i_index > 0 And $i_index <= $aTables[0] Then
                Local $aTemp[2] = [1, $aTables[$i_index]]
                Return SetError(0, 1, $aTemp)
            Else
                Return SetError(4, 0, "") ; bad index
            EndIf
        EndIf
    EndFunc   ;==>_HtmlTableGetList

    ; #FUNCTION# ====================================================================================================================
    ; Name ..........: _HtmlTableWriteToArray
    ; Description ...: It writes values from an html table to a 2D array.
    ;                  It tries to take care of the rowspan and colspan formats
    ; Syntax ........: _HtmlTableWriteToArray($sHtmlTable[, $bFillSpan = False[, $iFilter = 0]])
    ; Parameters ....: $sHtmlTable - A string value containing the html code of the table to be parsed
    ;                  $bFillSpan  - [optional] Default is False. If span areas have to be filled by repeating the data
    ;                                contained in the first cell of the span area
    ;                  $iFilter    - [optional] Default is 0 (no filters) data extracted from cells is returned unchanged.
    ;                                -  0 = no filter
    ;                                -  1 = removes non ascii characters
    ;                                -  2 = removes all double whitespaces
    ;                                -  4 = removes all double linefeeds
    ;                                -  8 = removes all html-tags
    ;                                - 16 = simple html-tag / entities convertor
    ; Return values .: Success: 2D array containing data from the html table
    ;                  Failure: An empty string and sets @error as follows:
    ;                  @error: 1 - no table content is present in the passed HTML
    ;                          2 - error while parsing rows and/or columns, (opening and closing tags are not balanced)
    ;                          3 - error while parsing rows and/or columns, (open/close mismatch error)
    ; ===============================================================================================================================
    Func _HtmlTableWriteToArray($sHtmlTable, $bFillSpan = False, $iFilter = 0)
        $sHtmlTable = StringReplace(StringReplace($sHtmlTable, "<th", "<td"), "</th>", "</td>") ; th becomes td
        ; rows of the wanted table
        Local $iError, $aTempEmptyRow[2] = [1, ""]
        Local $aRows = _ParseTags($sHtmlTable, "<tr", "</tr>") ; $aRows[0] = nr. of rows
        If @error Then Return SetError(@error, 0, "")
        Local $aCols[$aRows[0] + 1], $aTemp
        For $i = 1 To $aRows[0]
            $aTemp = _ParseTags($aRows[$i], "<td", "</td>")
            $iError = @error
            If $iError = 1 Then ; check if it's an empty row
                $aTemp = $aTempEmptyRow ; Empty Row
            Else
                If $iError Then Return SetError($iError, 0, "")
            EndIf
            If $aCols[0] < $aTemp[0] Then $aCols[0] = $aTemp[0] ; $aTemp[0] = max nr.
            ; of columns in table
            $aCols[$i] = $aTemp
        Next
        Local $aResult[$aRows[0]][$aCols[0]], $iStart, $iEnd, $aRowspan, $aColspan, $iSpanY, $iSpanX, $iSpanRow, $iSpanCol, $iMarkerCode, $sCellContent
        Local $aMirror = $aResult
        For $i = 1 To $aRows[0] ; scan all rows in this table
            $aTemp = $aCols[$i] ; <td ..> xx </td> .....
            For $ii = 1 To $aTemp[0] ; scan all cells in this row
                $iSpanY = 0
                $iSpanX = 0
                $iY = $i - 1 ; zero base index for vertical ref
                $iX = $ii - 1 ; zero based indexes for horizontal ref
                ; following RegExp kindly provided by SadBunny in this post:
                ; http://www.autoitscript.com/forum/topic/167174-how-to-get-a-number-located-after-a-name-from-within-a-string/?p=1222781
                $aRowspan = StringRegExp($aTemp[$ii], "(?i)rowspan\s*=\s*[""']?\s*(\d+)", 1) ; check presence of rowspan
                If IsArray($aRowspan) Then
                    $iSpanY = $aRowspan[0] - 1
                    If $iSpanY + $iY > $aRows[0] Then
                        $iSpanY -= $iSpanY + $iY - $aRows[0] + 1
                    EndIf
                EndIf
                ;
                $aColspan = StringRegExp($aTemp[$ii], "(?i)colspan\s*=\s*[""']?\s*(\d+)", 1) ; check presence of colspan
                If IsArray($aColspan) Then $iSpanX = $aColspan[0] - 1
                ;
                $iMarkerCode += 1 ; code to mark this span area or single cell
                If $iSpanY Or $iSpanX Then
                    $iX1 = $iX
                    For $iSpY = 0 To $iSpanY
                        For $iSpX = 0 To $iSpanX
                            $iSpanRow = $iY + $iSpY
                            If $iSpanRow > UBound($aMirror, 1) - 1 Then
                                $iSpanRow = UBound($aMirror, 1) - 1
                            EndIf
                            $iSpanCol = $iX1 + $iSpX
                            If $iSpanCol > UBound($aMirror, 2) - 1 Then
                                ReDim $aResult[$aRows[0]][UBound($aResult, 2) + 1]
                                ReDim $aMirror[$aRows[0]][UBound($aMirror, 2) + 1]
                            EndIf
                            ;
                            While $aMirror[$iSpanRow][$iX1 + $iSpX] ; search first free column
                                $iX1 += 1 ; $iSpanCol += 1
                                If $iX1 + $iSpX > UBound($aMirror, 2) - 1 Then
                                    ReDim $aResult[$aRows[0]][UBound($aResult, 2) + 1]
                                    ReDim $aMirror[$aRows[0]][UBound($aMirror, 2) + 1]
                                EndIf
                            WEnd
                        Next
                    Next
                EndIf
                ;
                $iX1 = $iX
                ; following RegExp kindly provided by mikell in this post:
                ; http://www.autoitscript.com/forum/topic/167309-how-to-remove-from-a-string-all-between-and-pairs/?p=1224207
                $sCellContent = _
                        StringRegExpReplace($aTemp[$ii], '<[^>]+>', "")
                If $iFilter Then $sCellContent = _HTML_Filter($sCellContent, $iFilter)
                For $iSpX = 0 To $iSpanX
                    For $iSpY = 0 To $iSpanY
                        $iSpanRow = $iY + $iSpY
                        If $iSpanRow > UBound($aMirror, 1) - 1 Then
                            $iSpanRow = UBound($aMirror, 1) - 1
                        EndIf
                        While $aMirror[$iSpanRow][$iX1 + $iSpX]
                            $iX1 += 1
                            If $iX1 + $iSpX > UBound($aMirror, 2) - 1 Then
                                ReDim $aResult[$aRows[0]][$iX1 + $iSpX + 1]
                                ReDim $aMirror[$aRows[0]][$iX1 + $iSpX + 1]
                            EndIf
                        WEnd
                        $aMirror[$iSpanRow][$iX1 + $iSpX] = $iMarkerCode ; 1
                        If $bFillSpan Then $aResult[$iSpanRow][$iX1 + $iSpX] = $sCellContent
                    Next
                    $aResult[$iY][$iX1] = $sCellContent
                Next
            Next
        Next
        ; _ArrayDisplay($aMirror, "Debug")
        Return SetError(0, $aResult[0][0], $aResult)
    EndFunc   ;==>_HtmlTableWriteToArray
    ;
    ; #FUNCTION# ====================================================================================================================
    ; Name ..........: _HtmlTableGetWriteToArray
    ; Description ...: extract the html code of the required table from the html listing and copy the data of the table to a 2D array
    ; Syntax ........: _HtmlTableGetWriteToArray($sHtml[, $iWantedTable = 1[, $bFillSpan = False[, $iFilter = 0]]])
    ; Parameters ....: $sHtml        - A string value containing the html listing
    ;                  $iWantedTable - [optional] An integer value. The nr. of the table to be parsed (default is first table)
    ;                  $bFillSpan    - [optional] Default is False. If all span areas have to be filled by repeating the data
    ;                                  contained in the first cell of the span area
    ;                  $iFilter      - [optional] Default is 0 (no filters) data extracted from cells is returned unchanged.
    ;                                  -  0 = no filter
    ;                                  -  1 = removes non ascii characters
    ;                                  -  2 = removes all double whitespaces
    ;                                  -  4 = removes all double linefeeds
    ;                                  -  8 = removes all html-tags
    ;                                  - 16 = simple html-tag / entities convertor
    ; Return values .: success: 2D array containing data from the wanted html table.
    ;                  failure: An empty string and sets @error as follows:
    ;                  @error: 1 - no tables are present in the passed HTML
    ;                          2 - error while parsing tables, (opening and closing tags are not balanced)
    ;                          3 - error while parsing tables, (open/close mismatch error)
    ;                          4 - invalid table index request (requested table nr. is out of boundaries)
    ; ===============================================================================================================================
    Func _HtmlTableGetWriteToArray($sHtml, $iWantedTable = 1, $bFillSpan = False, $iFilter = 0)
        Local $aSingleTable = _HtmlTableGetList($sHtml, $iWantedTable)
        If @error Then Return SetError(@error, 0, "")
        Local $aTableData = _HtmlTableWriteToArray($aSingleTable[1], $bFillSpan, $iFilter)
        If @error Then Return SetError(@error, 0, "")
        Return SetError(0, $aTableData[0][0], $aTableData)
    EndFunc   ;==>_HtmlTableGetWriteToArray

    ; #FUNCTION# ====================================================================================================================
    ; Name ..........: _ParseTags
    ; Description ...: searches and extracts all portions of html code within opening and closing tags inclusive.
    ;                  Returns an array containing a collection of <tag ...... </tag> lines.
    ;                  one in each element (even if they are nested)
    ; Syntax ........: _ParseTags($sHtml, $sOpening, $sClosing)
    ; Parameters ....: $sHtml    - A string value containing the html listing
    ;                  $sOpening - A string value indicating the opening tag
    ;                  $sClosing - A string value indicating the closing tag
    ; Return values .: success: a 1D 1-based array containing all the portions of html code representing the element
    ;                  element [0] of the array (and @extended as well) contains the counter of found elements
    ;                  failure: An empty string and sets @error as follows:
    ;                  @error: 1 - no tables are present in the passed HTML
    ;                          2 - error while parsing tables, (opening and closing tags are not balanced)
    ;                          3 - error while parsing tables, (open/close mismatch error)
    ;                          4 - invalid table index request (requested table nr. is out of boundaries)
    ; ===============================================================================================================================
    Func _ParseTags($sHtml, $sOpening, $sClosing) ; example: $sOpening = '<table', $sClosing = '</table>'
        ; it finds how many of such tags are on the HTML page
        StringReplace($sHtml, $sOpening, $sOpening) ; in @extended nr.
        ; of occurrences
        Local $iNrOfThisTag = @extended
        ; I assume that opening <tag and closing </tag> tags are balanced (as should be)
        ; (so NO check is made to see if they are actually balanced)
        If $iNrOfThisTag Then ; if there is at least one of this tag
            ; $aThisTagsPositions array will contain the positions of the
            ; starting <tag and ending </tag> tags within the HTML
            Local $aThisTagsPositions[$iNrOfThisTag * 2 + 1][3] ; 1 based (make room for all open and close tags)
            ; 2) find in the HTML the positions of the $sOpening <tag and $sClosing </tag> tags
            For $i = 1 To $iNrOfThisTag
                $aThisTagsPositions[$i][0] = StringInStr($sHtml, $sOpening, 0, $i) ; start position of $i occurrence of <tag opening tag
                $aThisTagsPositions[$i][1] = $sOpening ; it marks which kind of tag is this
                $aThisTagsPositions[$i][2] = $i ; nr of this tag
                $aThisTagsPositions[$iNrOfThisTag + $i][0] = StringInStr($sHtml, $sClosing, 0, $i) + StringLen($sClosing) - 1 ; end position of $i^ occurrence of </tag> closing tag
                $aThisTagsPositions[$iNrOfThisTag + $i][1] = $sClosing ; it marks which kind of tag is this
            Next
            _ArraySort($aThisTagsPositions, 0, 1) ; now all opening and closing tags are in the same sequence as they appear in the HTML
            Local $aStack[UBound($aThisTagsPositions)][2]
            Local $aTags[Ceiling(UBound($aThisTagsPositions) / 2)] ; will contain the collection of <tag ..... </tag> from the html
            For $i = 1 To UBound($aThisTagsPositions) - 1
                If $aThisTagsPositions[$i][1] = $sOpening Then ; opening <tag
                    $aStack[0][0] += 1 ; nr of tags in html
                    $aStack[$aStack[0][0]][0] = $sOpening
                    $aStack[$aStack[0][0]][1] = $i
                ElseIf $aThisTagsPositions[$i][1] = $sClosing Then ; a closing </tag> was found
                    If Not $aStack[0][0] Or Not ($aStack[$aStack[0][0]][0] = $sOpening And $aThisTagsPositions[$i][1] = $sClosing) Then
                        Return SetError(3, 0, "") ; Open/Close mismatch error
                    Else ; pair detected (the reciprocal tag)
                        ; now get coordinates of the 2 tags
                        ; 1) extract this tag <tag .....
</tag> from the html to the array $aTags[$aThisTagsPositions[$aStack[$aStack[0][0]][1]][2]] = StringMid($sHtml, $aThisTagsPositions[$aStack[$aStack[0][0]][1]][0], 1 + $aThisTagsPositions[$i][0] - $aThisTagsPositions[$aStack[$aStack[0][0]][1]][0]) ; 2) remove that tag <tag ..... </tag> from the html $sHtml = StringLeft($sHtml, $aThisTagsPositions[$aStack[$aStack[0][0]][1]][0] - 1) & StringMid($sHtml, $aThisTagsPositions[$i][0] + 1) ; 3) adjust the references to the new positions of remaining tags For $ii = $i To UBound($aThisTagsPositions) - 1 $aThisTagsPositions[$ii][0] -= StringLen($aTags[$aThisTagsPositions[$aStack[$aStack[0][0]][1]][2]]) Next $aStack[0][0] -= 1 ; nr of tags still in html EndIf EndIf Next If Not $aStack[0][0] Then ; all tags where parsed correctly $aTags[0] = $iNrOfThisTag Return SetError(0, $iNrOfThisTag, $aTags) ; OK Else Return SetError(2, 0, "") ; opening and closing tags are not balanced EndIf Else Return SetError(1, 0, "") ; there are no of such tags on this HTML page EndIf EndFunc ;==>_ParseTags ; #============================================================================= ; Name ..........: _HTML_Filter ; Description ...: Filter for strings ; AutoIt Version : V3.3.0.0 ; Syntax ........: _HTML_Filter(ByRef $sString[, $iMode = 0]) ; Parameter(s): .: $sString - String to filter ; $iMode - Optional: (Default = 0) : removes nothing ; - 0 = no filter ; - 1 = removes non ascii characters ; - 2 = removes all double whitespaces ; - 4 = removes all double linefeeds ; - 8 = removes all html-tags ; - 16 = simple html-tag / entities convertor ; Return Value ..: Success - Filterd String ; Failure - Input String ; Author(s) .....: Thorsten Willert, Stephen Podhajecki {gehossafats at netmdc. 
com} _ConvertEntities ; Date ..........: Wed Jan 27 20:49:59 CET 2010 ; modified ......: by Chimp Removed a double " " entities declaration, ; replace it with char(160) instead of chr(32), ; declaration of the $aEntities array as Static instead of just Local ; ============================================================================== Func _HTML_Filter(ByRef $sString, $iMode = 0) If $iMode = 0 Then Return $sString ;16 simple HTML tag / entities converter If $iMode >= 16 And $iMode < 32 Then Static Local $aEntities[95][2] = [[""", 34], ["&", 38], ["<", 60], [">", 62], [" ", 160] _ , ["¡", 161], ["¢", 162], ["£", 163], ["¤", 164], ["¥", 165], ["¦", 166] _ , ["§", 167], ["¨", 168], ["©", 169], ["ª", 170], ["¬", 172], ["­", 173] _ , ["®", 174], ["¯", 175], ["°", 176], ["±", 177], ["²", 178], ["³", 179] _ , ["´", 180], ["µ", 181], ["¶", 182], ["·", 183], ["¸", 184], ["¹", 185] _ , ["º", 186], ["»", 187], ["¼", 188], ["½", 189], ["¾", 190], ["¿", 191] _ , ["À", 192], ["Á", 193], ["Ã", 195], ["Ä", 196], ["Å", 197], ["Æ", 198] _ , ["Ç", 199], ["È", 200], ["É", 201], ["Ê", 202], ["Ì", 204], ["Í", 205] _ , ["Î", 206], ["Ï", 207], ["Ð", 208], ["Ñ", 209], ["Ò", 210], ["Ó", 211] _ , ["Ô", 212], ["Õ", 213], ["Ö", 214], ["×", 215], ["Ø", 216], ["Ù", 217] _ , ["Ú", 218], ["Û", 219], ["Ü", 220], ["Ý", 221], ["Þ", 222], ["ß", 223] _ , ["à", 224], ["á", 225], ["â", 226], ["ã", 227], ["ä", 228], ["å", 229] _ , ["æ", 230], ["ç", 231], ["è", 232], ["é", 233], ["ê", 234], ["ë", 235] _ , ["ì", 236], ["í", 237], ["î", 238], ["ï", 239], ["ð", 240], ["ñ", 241] _ , ["ò", 242], ["ó", 243], ["ô", 244], ["õ", 245], ["ö", 246], ["÷", 247] _ , ["ø", 248], ["ù", 249], ["ú", 250], ["û", 251], ["ü", 252], ["þ", 254]] $sString = StringRegExpReplace($sString, '(?i)<p.*?>', @CRLF & @CRLF) $sString = StringRegExpReplace($sString, '(?i)<br>', @CRLF) Local $iE = UBound($aEntities) - 1 For $x = 0 To $iE $sString = StringReplace($sString, $aEntities[$x][0], Chr($aEntities[$x][1]), 0, 2) Next For $x = 32 
To 255 $sString = StringReplace($sString, "&#" & $x & ";", Chr($x)) Next $iMode -= 16 EndIf ;8 Tag filter If $iMode >= 8 And $iMode < 16 Then ;$sString = StringRegExpReplace($sString, '<script.*?>.*?</script>', "") $sString = StringRegExpReplace($sString, "<[^>]*>", "") $iMode -= 8 EndIf ; 4 remove all double cr, lf If $iMode >= 4 And $iMode < 8 Then $sString = StringRegExpReplace($sString, "([ \t]*[\n\r]+[ \t]*)", @CRLF) $sString = StringRegExpReplace($sString, "[\n\r]+", @CRLF) $iMode -= 4 EndIf ; 2 remove all double withespaces If $iMode = 2 Or $iMode = 3 Then $sString = StringRegExpReplace($sString, "[[:blank:]]+", " ") $sString = StringRegExpReplace($sString, "\n[[:blank:]]+", @CRLF) $sString = StringRegExpReplace($sString, "[[:blank:]]+\n", "") $iMode -= 2 EndIf ; 1 remove all non ASCII (remove all chars with ascii code > 127) If $iMode = 1 Then $sString = StringRegExpReplace($sString, "[^\x00-\x7F]", " ") EndIf Return $sString EndFunc ;==>_HTML_Filter Chimp small minds discuss people average minds discuss events great minds discuss ideas.... and use AutoIt....
mikell Posted January 4, 2017 Posted January 4, 2017 You might parse the source code using regex. Maybe hazardous but certainly much faster ...
mLipok Posted January 4, 2017 Posted January 4, 2017 (edited) 2 hours ago, mLipok said: I will post here few examples, stay watching (when I done I say that) Done. EDIT: 16 hours ago, Champak said: Can someone make this better please? I need to speed it up and be a little more efficient. I hope this is what you requested: the code now uses far fewer For...In loops, so the number of iterations should drop sharply, and much more of the HTML is parsed by the DOM, which means inside the MS ActiveX component, i.e. in native C++ (I think so). Edited January 4, 2017 by mLipok Signature beginning:* Please remember: "AutoIt"..... * Wondering who uses AutoIt and what it can be used for ? * Forum Rules ** ADO.au3 UDF * POP3.au3 UDF * XML.au3 UDF * IE on Windows 11 * How to ask ChatGPT for AutoIt Code * for other useful stuff click the following button: Spoiler Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind. My contribution (my own projects): * Debenu Quick PDF Library - UDF * Debenu PDF Viewer SDK - UDF * Acrobat Reader - ActiveX Viewer * UDF for PDFCreator v1.x.x * XZip - UDF * AppCompatFlags UDF * CrowdinAPI UDF * _WinMergeCompare2Files() * _JavaExceptionAdd() * _IsBeta() * Writing DPI Awareness App - workaround * _AutoIt_RequiredVersion() * Chilkatsoft.au3 UDF * TeamViewer.au3 UDF * JavaManagement UDF * VIES over SOAP * WinSCP UDF * GHAPI UDF - modest begining - comunication with GitHub REST API * ErrorLog.au3 UDF - A logging Library * Include Dependency Tree (Tool for analyzing script relations) * Show_Macro_Values.au3 * My contribution to others projects or UDF based on others projects: * _sql.au3 UDF * POP3.au3 UDF * RTF Printer - UDF * XML.au3 UDF * ADO.au3 UDF * SMTP Mailer UDF * Dual Monitor resolution detection * * 2GUI on Dual Monitor System * _SciLexer.au3 UDF * SciTE - Lexer for console pane * Useful links: * Forum Rules * Forum etiquette * Forum Information and FAQs * How to post code on the forum * 
AutoIt Online Documentation * AutoIt Online Beta Documentation * SciTE4AutoIt3 getting started * Convert text blocks to AutoIt code * Games made in Autoit * Programming related sites * Polish AutoIt Tutorial * DllCall Code Generator * Wiki: * Expand your knowledge - AutoIt Wiki * Collection of User Defined Functions * How to use HelpFile * Good coding practices in AutoIt * OpenOffice/LibreOffice/XLS Related: WriterDemo.au3 * XLS/MDB from scratch with ADOX IE Related: * How to use IE.au3 UDF with AutoIt v3.3.14.x * Why isn't Autoit able to click a Javascript Dialog? * Clicking javascript button with no ID * IE document >> save as MHT file * IETab Switcher (by LarsJ ) * HTML Entities * _IEquerySelectorAll() (by uncommon) * IE in TaskScheduler * IE Embedded Control Versioning (use IE9+ and HTML5 in a GUI) * PDF Related: * How to get reference to PDF object embeded in IE * IE on Windows 11 * I encourage you to read: * Global Vars * Best Coding Practices * Please explain code used in Help file for several File functions * OOP-like approach in AutoIt * UDF-Spec Questions * EXAMPLE: How To Catch ConsoleWrite() output to a file or to CMD *I also encourage you to check awesome @trancexx code: * Create COM objects from modules without any demand on user to register anything. * Another COM object registering stuff * OnHungApp handler * Avoid "AutoIt Error" message box in unknown errors * HTML editor * winhttp.au3 related : * https://www.autoitscript.com/forum/topic/206771-winhttpau3-download-problem-youre-speaking-plain-http-to-an-ssl-enabled-server-port/ "Homo sum; humani nil a me alienum puto" - Publius Terentius Afer"Program are meant to be read by humans and only incidentally for computers and execute" - Donald Knuth, "The Art of Computer Programming" , be and \\//_. Anticipating Errors : "Any program that accepts data from a user must include code to validate that data before sending it to the data store. 
You cannot rely on the data store, ...., or even your programming language to notify you of problems. You must check every byte entered by your users, making sure that data is the correct type for its field and that required fields are not empty." Signature last update: 2023-04-24
mikell Posted January 4, 2017 Posted January 4, 2017 (edited)
For the fun
#include<Array.au3>
$t = TimerInit()
$txt = BinaryToString(InetRead("http://www.newyorkgasprices.com/GasPriceSearch.aspx?typ=adv&fuel=A&srch=1&state=TX&area=Dallas+-+North&area=Dallas+-+NW&area=Dallas+-+SE&area=Dallas+-+South&area=Dallas+-+SW&area=Dallas+-+West&site=NewYork,Dallas&station=Chevron&tme_limit=24", 1))
$res = StringRegExp($txt, '<[^>]*>(*SKIP)(?!)|[^\v<]+', 3)
; _ArrayDisplay($res)
Local $s, $i, $u = UBound($res)
While $i < $u
    $i += 1
    If StringInStr($res[$i], "Thanks") Then ExitLoop
Wend
While $i < $u
    $i += 1
    If StringStripWS($res[$i], 8) <> "" Then Exitloop
Wend
While $i < $u
    $s &= $res[$i] & "," & $res[$i+9] & "," & $res[$i+13] & "," & $res[$i+32] & @crlf
    $i += 49
    If StringStripWS($res[$i], 8) = "" Then Exitloop
Wend
$elapsed = TimerDiff($t)/1000
Msgbox(0, $elapsed & " seconds", $s)
Edited January 4, 2017 by mikell added a timer :)
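For readers who, like Champak, find the pattern in mikell's script opaque, here is a minimal sketch of what '<[^>]*>(*SKIP)(?!)|[^\v<]+' does, run against a small literal string instead of the live page. The sample markup and the variable names $sHtml and $aText are made up for illustration; only the pattern and the StringRegExp flag come from the post above.

```autoit
; Hypothetical sample markup, NOT taken from the real gas price page.
Local $sHtml = '<tr><td>Chevron</td><td class="p">2.19</td></tr>' & @CRLF & '<tr><td>Dallas</td></tr>'

; Left alternative:  <[^>]*> matches a whole tag, then (*SKIP)(?!) forces the match
;                    to fail and tells the engine to resume after the tag, so no
;                    later match can start inside it.
; Right alternative: [^\v<]+ matches a run of characters that are neither "<"
;                    nor vertical whitespace, i.e. the text between tags.
; Flag 3 returns an array with all matches found in the string.
Local $aText = StringRegExp($sHtml, '<[^>]*>(*SKIP)(?!)|[^\v<]+', 3)

If @error Then
    ConsoleWrite("No text found between tags" & @CRLF)
Else
    For $i = 0 To UBound($aText) - 1
        ConsoleWrite($i & ": " & $aText[$i] & @CRLF)
    Next
EndIf
```

The result is a flat list of text fragments with no notion of rows or columns, which is why the script above has to rely on fixed offsets such as $res[$i+9] and a fixed step of 49 to rebuild each record.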
Champak Posted January 5, 2017 Author Posted January 5, 2017 Thanks to all. Chimp, you are exactly right that the page loading takes most of the time; I just never knew there was a way around it. Your example is simple to understand and easy to duplicate elsewhere, so I'm going to keep it as a quick go-to for other, more complicated tables when I need a quick remedy. Mikell, all I can say is bloody brilliant...in an English accent of course. I ran that code 6 times because I simply didn't believe it...I don't understand it right now, but I'm starting to believe it. I made a different script based on another gas station that changed up their website a while back, and because it took so long, a lot of times I never used it, especially with the connection speed in my car. Bet I'll always be using it now. There are a bunch of other IE codes I'm going to redo now with this new method. This should improve my overall app with the other parts when I'm done. Thanks again all.
Champak Posted January 5, 2017 Author Posted January 5, 2017 Spoke too soon. As much as I love mikell's and as fast as it is, it's not universal. It fails in all other station/city/state searches...like it was crafted specifically for this search. I'm going to have to dive into it when I have time and see what makes it tick to make it universal. For now, with some changes to chimp's example, that's the route I'll be going.
kylomas Posted January 5, 2017 Posted January 5, 2017 (edited) Champak, 1 hour ago, Champak said: like it was crafted specifically for this search Exactly why many insist that regexp should NOT be used as an HTML parser. kylomas P.S. See this for an amusing regexp rant. Edited January 5, 2017 by kylomas Forum Rules Procedure for posting code "I like pigs. Dogs look up to us. Cats look down on us. Pigs treat us as equals." - Sir Winston Churchill
mikell Posted January 5, 2017 Posted January 5, 2017 2 hours ago, kylomas said: Exactly why many insist that regexp should NOT be used as an HTML parser. ...which is why I said 'for the fun' and called this 'hazardous' in post #11. It will never be a universal way. As Champak noticed, a tiny change in the source code can cause the whole thing to fail. But... it's sooo fast. I didn't have enough time to try other expressions and check the reliability, but a regex way is (maybe) possible for this particular case - with the usual reservations of course!
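One way to soften the "tiny change breaks everything" problem is to check every fixed offset before trusting it. The sketch below is only an illustration under stated assumptions: _FieldAt is a hypothetical helper (it is not part of any UDF or of mikell's script), and $aRes is a hand-made stand-in for the array that StringRegExp(..., 3) returns.

```autoit
; Stand-in for the array that StringRegExp(..., 3) would return (made-up data).
Local $aRes[4] = ["Chevron", "123 Main St", "Dallas", "2.19"]

; Read the field where a price is expected and validate its shape.
Local $sPrice = _FieldAt($aRes, 3)
Local $iErr = @error ; capture @error before the next function call resets it
If $iErr Or Not StringRegExp($sPrice, '^\d\.\d\d') Then
    ConsoleWrite("Layout changed? Unexpected price field: " & $sPrice & @CRLF)
Else
    ConsoleWrite("Price: " & $sPrice & @CRLF)
EndIf

; An offset past the end is reported instead of stopping the script
; with an array subscript error.
_FieldAt($aRes, 9)
If @error Then ConsoleWrite("Offset 9 is out of range" & @CRLF)

; Hypothetical helper: returns the element at $iIdx, or "" with @error = 1
; when the index falls outside the array.
Func _FieldAt(ByRef $aArr, $iIdx)
    If $iIdx < 0 Or $iIdx >= UBound($aArr) Then Return SetError(1, 0, "")
    Return $aArr[$iIdx]
EndFunc   ;==>_FieldAt
```

This does not make a regex scraper universal; it only turns a silent wrong result or a crash into a visible "layout changed" message, which is usually the first thing you want to know when the site is redesigned.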
mLipok Posted January 5, 2017 Posted January 5, 2017 In one of my scripts I use a regexp covering about 20 frequently changing HTML elements, some of which contain tables, so at least 4 times a year I have to modify the regexp. With that one expression I collect 40 pieces of information, some of which need further work (parsing).
mLipok Posted January 5, 2017 Posted January 5, 2017 You may ask: so why am I using RegExp? If I parsed it differently, I would still have to change the script a few times a year, and besides, I agree with the following statement: 17 minutes ago, mikell said: But... it's sooo fast
mikell Posted January 5, 2017 Posted January 5, 2017 (edited)
Yes... IMHO I don't believe that we should systematically ban a method because of this or because of that. Anyway, a regex-based code is NOT a html parser BUT a text parser - which could sometimes work nice.
So. Another try. Looks much better; obviously I couldn't test the whole site, but I did some random tests successfully - and it's still extremely fast. Waiting now for Champak's feedback.
#include <Array.au3>

; other station/city/state
$url1 = "http://www.newyorkgasprices.com/GasPriceSearch.aspx?typ=adv&fuel=A&srch=1&state=TX&area=Dallas+-+North&area=Dallas+-+NW&area=Dallas+-+SE&area=Dallas+-+South&area=Dallas+-+SW&area=Dallas+-+West&site=NewYork,Dallas&station=Chevron&tme_limit=24"
$url2 = "http://www.newyorkgasprices.com/GasPriceSearch.aspx?typ=adv&fuel=A&srch=1&state=TX&area=All%20Areas&station=BP&tme_limit=24"
$url3 = "http://www.newyorkgasprices.com/GasPriceSearch.aspx?typ=adv&fuel=A&srch=1&state=SC&area=Abbeville&site=NewYork,SouthCarolina&station=All%20Stations&tme_limit=24"
$url4 = "http://www.newyorkgasprices.com/GasPriceSearch.aspx?typ=adv&fuel=A&srch=1&state=SC&area=Aiken&site=NewYork,SouthCarolina&station=AAFES&tme_limit=24"
$url5 = "http://www.newyorkgasprices.com/GasPriceSearch.aspx?typ=adv&fuel=A&srch=1&state=SC&area=Anderson&site=NewYork,Greenville&station=BP&tme_limit=24"
$url6 = "http://www.newyorkgasprices.com/GasPriceSearch.aspx?typ=adv&fuel=A&srch=0&area=All%20Areas&station=Exxon&tme_limit=24"
$url7 = "http://www.newyorkgasprices.com/GasPriceSearch.aspx?typ=adv&fuel=A&srch=1&state=IN&area=Anderson&site=NewYork,Indiana&station=All%20Stations&tme_limit=24"

_Get($url2)

;=================================================
Func _Get($url)
    $t = TimerInit()
    $txt = BinaryToString(InetRead($url, 1))
    $txt2 = StringRegExpReplace($txt, '(?s).*class="p_v2.*?class="price_num[^>]+>(.*?)/table.*', "$1")
    If not @extended Then Exit Msgbox(0,"error", "No current gas prices reported")
    $res = StringRegExp($txt2, '(\s*<[^>]*>)(*SKIP)(?!)|[^\v<]+', 3)
    ; _ArrayDisplay($res)
    Local $s, $i, $u = UBound($res)
    While $i < $u-6
        $line = $res[$i] & ", " & $res[$i+2] & ", " & StringReplace($res[$i+3], "amp;", "") & ", " & $res[$i+4] & @crlf
        ; Consolewrite($line)
        $i += 7
        $s &= $line
        If StringStripWS($res[$i], 8) = "" Then Exitloop
        If not StringRegExp($res[$i], '\d\.\d\d') Then $i -= 1
    Wend
    $elapsed = Round(TimerDiff($t)/1000, 3)
    Msgbox(0, $elapsed & " seconds", $s)
EndFunc
Edit: unexpected error in the source code on last try => fix
Edited January 5, 2017 by mikell