ermar Posted January 27, 2023 Share Posted January 27, 2023 Hi AspirinJunkie and thank you, But since yesterday I'm trying to make it work with my program 😞. My version Autoit is 3.3.14.5 and my feeling is this new version _xlsx_2Array() is not compatible with this. I have issue : error: MapAppend(): undefined function MapAppend($mSheets, $mSheet) when i run my program. Any ideas ? Thank you again. Regards Link to comment Share on other sites More sharing options...
water Posted January 27, 2023 Share Posted January 27, 2023 Maps are only supported in the latest AutoIt release. AspirinJunkie 1 My UDFs and Tutorials: Spoiler UDFs: Active Directory (NEW 2024-07-28 - Version 1.6.3.0) - Download - General Help & Support - Example Scripts - Wiki ExcelChart (2017-07-21 - Version 0.4.0.1) - Download - General Help & Support - Example Scripts OutlookEX (2021-11-16 - Version 1.7.0.0) - Download - General Help & Support - Example Scripts - Wiki OutlookEX_GUI (2021-04-13 - Version 1.4.0.0) - Download Outlook Tools (2019-07-22 - Version 0.6.0.0) - Download - General Help & Support - Wiki PowerPoint (2021-08-31 - Version 1.5.0.0) - Download - General Help & Support - Example Scripts - Wiki Task Scheduler (2022-07-28 - Version 1.6.0.1) - Download - General Help & Support - Wiki Standard UDFs: Excel - Example Scripts - Wiki Word - Wiki Tutorials: ADO - Wiki WebDriver - Wiki Link to comment Share on other sites More sharing options...
ermar Posted January 30, 2023 Share Posted January 30, 2023 Hi AspirinJunkie Indeed, that's what I thought, thanks Water. I took some time to migrate to this new autoit version to check that my program still works without too many side effects, it seems to be ok 🙂 Thank you very much, your program works very well. Is it difficult to add a complementary parameter that would allow to extract precise columns (ex: 4;7;8;15 etc...) and not continuous columns (ex: 5 to 15) ? Thanks for your time and your help. Regards Link to comment Share on other sites More sharing options...
AspirinJunkie Posted January 30, 2023 Author Share Posted January 30, 2023 5 minutes ago, ermar said: Is it difficult to add a complementary parameter that would allow to extract precise columns (ex: 4;7;8;15 etc...) and not continuous columns (ex: 5 to 15) ? Have a look at the function _ArraySlice() from >>this UDF<< Link to comment Share on other sites More sharing options...
ermar Posted January 31, 2023 Share Posted January 31, 2023 Thanks Link to comment Share on other sites More sharing options...
Dana Posted February 24, 2023 Share Posted February 24, 2023 (edited) I tried to use this but it's very slow, with or without 7za.exe. I have: #include <Array.au3> ;#include <_XLSXReadToArray.au3> #include <xlsxNative.au3> Local $a, $t, $d, $Error, $Extended Global $airportfile = @ScriptDir & "\all-airport-data.xlsx" ; https://adip.faa.gov/publishedAirports/all-airport-data.xlsx $t = TimerInit() ;Global $a = _XLSXReadToArray($airportfile, 0, 0, 0) $a = _xlsx_2Array($airportfile,1,1,3) $error = @error $d = TimerDiff($t) ConsoleWrite("Return = " & $a & "; @Error = " & $Error & @LF) ConsoleWrite("Timer = " & $d / 1000 & " seconds to read file" & @LF) _ArrayDisplay($a) ConsoleWrite("DONE") With the file from the commented link it takes 18 minutes to read the file, even trying to read only three lines. With an abridged version of the file (the original is 17MB, 98 columns and ~20k rows and ) it processes in under a second. XLSXReadToArray takes 24 seconds to read the big file, but for some reason drops the first line unless the file is opened in Excel and resaved first. Ultimately I only need to read 5 columns, but first I need to read the entire first line to get the column names and identify the columns I need. Edited February 24, 2023 by Dana Link to comment Share on other sites More sharing options...
AspirinJunkie Posted February 24, 2023 Author Share Posted February 24, 2023 1 hour ago, Dana said: With the file from the commented link it takes 18 minutes to read the file, even trying to read only three lines. Yep - that sounds plausible and I don't really see any way to change this much. There are a total of 737373 cells defined in the worksheet in the file. Each one in the form of an XML object. The worksheet XML is 46 MB and that has to be parsed first. The parser Microsoft.XMLDOM used in the UDF needs almost 3 minutes for this alone: $sFile = "sheet1.xml" $iT = TimerInit() Global $oXML = ObjCreate("Microsoft.XMLDOM") If Not $oXML.load($sFile) Then ConsoleWrite("error" & @CRLF) ConsoleWrite(TimerDiff($iT) & " ms" & @CRLF) And the whole data preparation for the transformation into an AutoIt array still follows. So I have to say: The amount of data is simply too large for this kind of approach. 1 hour ago, Dana said: XLSXReadToArray takes 24 seconds to read the big file There the parsing is purely handled by StringRegExp instead of a DOM parser. The approach seems to be faster than the XMLDOM parser. Who knows, maybe I'll find the time and motivation to implement a worksheet parser via StringRegExp and see if it can actually increase the performance that much. Link to comment Share on other sites More sharing options...
AspirinJunkie Posted February 27, 2023 Author Share Posted February 27, 2023 @Dana I've done the revision and completely rewrote the XML parser. The example file from you needed in the previous version for me >20 minutes (even more but I have then canceled). With the current version it is only 49s. That is already a massive leap forward I think. SOLVE-SMART 1 Link to comment Share on other sites More sharing options...
Dana Posted February 27, 2023 Share Posted February 27, 2023 Wow, thanks! That's a LOT faster... 4 seconds to read 3 lines and 56 seconds to read the entire file, 38 seconds to read a single column. Now to see how it does on a slower computer at home, and work out the best way to extract the 5 columns I need. Link to comment Share on other sites More sharing options...
ermar Posted March 13, 2023 Share Posted March 13, 2023 Hi, I test more thoroughly, and I find major differences on the last 2 versions: 0.6 and 0.7. the parameters are the same : _xlsx_2Array($TableExcel,"1","2",Default,"1",Default) This is a screenshot of the Excel file that i want to handle in the array : In the version 0.6, the first column is not reading but the following columns seems to be good : In the version 0.7, the first colomn is reading but for the following colomns the result is different with previous version : some values are missing (Column 3 : CSV fields) et some colomns where there are not values in the Excel file, we can see some numeric datas !? (column 1, 4 or 6 for examples), I don't understand. regards Eric Fichier_Test_Excel.xlsx Link to comment Share on other sites More sharing options...
AspirinJunkie Posted March 13, 2023 Author Share Posted March 13, 2023 Thank you - the file has exposed 2 cases that were not covered before. I have uploaded the new fixed version. Apart from that, the file is creepy. Data is only in the first 24 columns but there are 265 columns defined. This is probably due to the fact that someone has wildly scrolled through and first assigned a formatting to all these cells although there is nothing in it. Remove the columns and the handling of the file should be much smoother. dmob 1 Link to comment Share on other sites More sharing options...
ermar Posted March 14, 2023 Share Posted March 14, 2023 Great ! thanks Indeed, it is a legacy of some files I have to work with. This is why I wanted to be able to limit the display to the number of columns I wanted to define with the $dColTo parameter. 🙂 Link to comment Share on other sites More sharing options...
ermar Posted July 18, 2023 Share Posted July 18, 2023 Hi AspirinJunkie, With a tab containing several items of information, but separated by blank lines, udf ejects the program : issue : Would it be possible to add a control to prevent this from happening with an error, or to capture all the data? ? this is the test file. thanks for your time regards Eric test_issue.xlsx Link to comment Share on other sites More sharing options...
AspirinJunkie Posted July 19, 2023 Author Share Posted July 19, 2023 Are you using the latest version of the UDF from github? I can't reproduce your error with it. The following script: #include "xlsxNative.au3" $aSheet = _xlsx_2Array("test_issue.xlsx") _ArrayDisplay($aSheet) Results in the following output for me with your file: Link to comment Share on other sites More sharing options...
ermar Posted July 19, 2023 Share Posted July 19, 2023 (edited) I did have version 0.7.1, so I tried to compare it with the one I found on github and found this little difference : the only thing missing was the = and +$dRowFrom + 2 ! Thank you very much. Regards Edited July 19, 2023 by ermar Link to comment Share on other sites More sharing options...
ermar Posted February 5 Share Posted February 5 Hi AspirinJunkie, This time I'm interested in writing results in an Excel File with thid udf : _xlsx_WriteFromArray Is it possible to write several results in different tabs and rename it ? Using the example, I can see that the tab is called 1 Thanks Regards Link to comment Share on other sites More sharing options...
AspirinJunkie Posted February 5 Author Share Posted February 5 1 hour ago, ermar said: Is it possible to write several results in different tabs and rename it ? No it's not possible with this function. However, you can use the function as a template to understand the structure of an xlsx file and write your own function with the corresponding functionality based on this. ermar 1 Link to comment Share on other sites More sharing options...
Recommended Posts
Create an account or sign in to comment
You need to be a member in order to leave a comment
Create an account
Sign up for a new account in our community. It's easy!
Register a new accountSign in
Already have an account? Sign in here.
Sign In Now