Alodar Posted July 11, 2018 Share Posted July 11, 2018 So I've been dealing with a lot of xml files lately, and I end up converting them to csvs and the like, and then putting them into a database. Which is fine, but honestly, the process bothers me. I'm not quite up to writing my own xml parser, so obviously I'd want to use MSXML or something like that, but I'd like to get some advice on structure. Obviously, if this is something I should be picking up a book on, please tell me and if you're feeling generous, even tell me which book I should be reading, but what I'd like to get to is how to parse out an xml file into separate structures; i.e. the parent items and child items into different arrays (or variables?). This is the part I'm not sure about; how I should be parsing it out. I feel like the obvious answer is to parse it into a set of arrays with one being the 'parent' and the other(s?) being the child elements, which I would know ahead of time from the schema, but is there a better method that I'm just not aware of? Link to comment Share on other sites More sharing options...
mLipok Posted July 11, 2018 Share Posted July 11, 2018 Did you check this following thread: ? Signature beginning:* Please remember: "AutoIt"..... * Wondering who uses AutoIt and what it can be used for ? * Forum Rules ** ADO.au3 UDF * POP3.au3 UDF * XML.au3 UDF * IE on Windows 11 * How to ask ChatGPT for AutoIt Code * for other useful stuff click the following button: Spoiler Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind. My contribution (my own projects): * Debenu Quick PDF Library - UDF * Debenu PDF Viewer SDK - UDF * Acrobat Reader - ActiveX Viewer * UDF for PDFCreator v1.x.x * XZip - UDF * AppCompatFlags UDF * CrowdinAPI UDF * _WinMergeCompare2Files() * _JavaExceptionAdd() * _IsBeta() * Writing DPI Awareness App - workaround * _AutoIt_RequiredVersion() * Chilkatsoft.au3 UDF * TeamViewer.au3 UDF * JavaManagement UDF * VIES over SOAP * WinSCP UDF * GHAPI UDF - modest begining - comunication with GitHub REST API * ErrorLog.au3 UDF - A logging Library * Include Dependency Tree (Tool for analyzing script relations) * Show_Macro_Values.au3 * My contribution to others projects or UDF based on others projects: * _sql.au3 UDF * POP3.au3 UDF * RTF Printer - UDF * XML.au3 UDF * ADO.au3 UDF * SMTP Mailer UDF * Dual Monitor resolution detection * * 2GUI on Dual Monitor System * _SciLexer.au3 UDF * SciTE - Lexer for console pane * Useful links: * Forum Rules * Forum etiquette * Forum Information and FAQs * How to post code on the forum * AutoIt Online Documentation * AutoIt Online Beta Documentation * SciTE4AutoIt3 getting started * Convert text blocks to AutoIt code * Games made in Autoit * Programming related sites * Polish AutoIt Tutorial * DllCall Code Generator * Wiki: * Expand your knowledge - AutoIt Wiki * Collection of User Defined Functions * How to use HelpFile * Good coding practices in AutoIt * OpenOffice/LibreOffice/XLS Related: WriterDemo.au3 * XLS/MDB from scratch with ADOX IE Related: * How to use IE.au3 UDF with AutoIt v3.3.14.x * Why isn't Autoit able to click a Javascript Dialog? * Clicking javascript button with no ID * IE document >> save as MHT file * IETab Switcher (by LarsJ ) * HTML Entities * _IEquerySelectorAll() (by uncommon) * IE in TaskScheduler * IE Embedded Control Versioning (use IE9+ and HTML5 in a GUI) * PDF Related: * How to get reference to PDF object embeded in IE * IE on Windows 11 * I encourage you to read: * Global Vars * Best Coding Practices * Please explain code used in Help file for several File functions * OOP-like approach in AutoIt * UDF-Spec Questions * EXAMPLE: How To Catch ConsoleWrite() output to a file or to CMD *I also encourage you to check awesome @trancexx code: * Create COM objects from modules without any demand on user to register anything. * Another COM object registering stuff * OnHungApp handler * Avoid "AutoIt Error" message box in unknown errors * HTML editor * winhttp.au3 related : * https://www.autoitscript.com/forum/topic/206771-winhttpau3-download-problem-youre-speaking-plain-http-to-an-ssl-enabled-server-port/ "Homo sum; humani nil a me alienum puto" - Publius Terentius Afer"Program are meant to be read by humans and only incidentally for computers and execute" - Donald Knuth, "The Art of Computer Programming" , be and \\//_. Anticipating Errors : "Any program that accepts data from a user must include code to validate that data before sending it to the data store. You cannot rely on the data store, ...., or even your programming language to notify you of problems. You must check every byte entered by your users, making sure that data is the correct type for its field and that required fields are not empty." Signature last update: 2023-04-24 Link to comment Share on other sites More sharing options...
Alodar Posted July 11, 2018 Author Share Posted July 11, 2018 Yeah. I was more asking for what 'should' I do as a concept, rather than a 'how'. Dealing with an XML file in and of itself is ok, and the UDF listed would work great for it, I was just wondering if there was a better method than my original thought of creating different arrays for 'parent' and 'child' items. Eventually, the xml data I deal with will be going in to tables, so the most effective way to deal with it is kind of what I'm after. Link to comment Share on other sites More sharing options...
TheSaint Posted July 11, 2018 Share Posted July 11, 2018 Is the XML a source you have no control over? Personally, I dislike XML, and have recently come to grips with SQL which is far superior ... including great helper programs. Any XML I come across these days,I just read into an array and deal with it much faster that way ... but of course you need to know your source, and your needs be simplistic. Make sure brain is in gear before opening mouth! Remember, what is not said, can be just as important as what is said. Spoiler What is the Secret Key? Life is like a Donut If I put effort into communication, I expect you to read properly & fully, or just not comment. Ignoring those who try to divert conversation with irrelevancies. If I'm intent on insulting you or being rude, I will be obvious, not ambiguous about it. I'm only big and bad, to those who have an over-active imagination. I may have the Artistic Liesense to disagree with you. TheSaint's Toolbox (be advised many downloads are not working due to ISP screwup with my storage) Link to comment Share on other sites More sharing options...
Alodar Posted July 11, 2018 Author Share Posted July 11, 2018 Mostly it's that. I have control over it once I download it etc, but as for creating it from the source data or anything else, no, I have no control. It's a file sent to me that I then have to manipulate. Link to comment Share on other sites More sharing options...
mLipok Posted July 24, 2018 Share Posted July 24, 2018 What a specyfic format you have ? Signature beginning:* Please remember: "AutoIt"..... * Wondering who uses AutoIt and what it can be used for ? * Forum Rules ** ADO.au3 UDF * POP3.au3 UDF * XML.au3 UDF * IE on Windows 11 * How to ask ChatGPT for AutoIt Code * for other useful stuff click the following button: Spoiler Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind. My contribution (my own projects): * Debenu Quick PDF Library - UDF * Debenu PDF Viewer SDK - UDF * Acrobat Reader - ActiveX Viewer * UDF for PDFCreator v1.x.x * XZip - UDF * AppCompatFlags UDF * CrowdinAPI UDF * _WinMergeCompare2Files() * _JavaExceptionAdd() * _IsBeta() * Writing DPI Awareness App - workaround * _AutoIt_RequiredVersion() * Chilkatsoft.au3 UDF * TeamViewer.au3 UDF * JavaManagement UDF * VIES over SOAP * WinSCP UDF * GHAPI UDF - modest begining - comunication with GitHub REST API * ErrorLog.au3 UDF - A logging Library * Include Dependency Tree (Tool for analyzing script relations) * Show_Macro_Values.au3 * My contribution to others projects or UDF based on others projects: * _sql.au3 UDF * POP3.au3 UDF * RTF Printer - UDF * XML.au3 UDF * ADO.au3 UDF * SMTP Mailer UDF * Dual Monitor resolution detection * * 2GUI on Dual Monitor System * _SciLexer.au3 UDF * SciTE - Lexer for console pane * Useful links: * Forum Rules * Forum etiquette * Forum Information and FAQs * How to post code on the forum * AutoIt Online Documentation * AutoIt Online Beta Documentation * SciTE4AutoIt3 getting started * Convert text blocks to AutoIt code * Games made in Autoit * Programming related sites * Polish AutoIt Tutorial * DllCall Code Generator * Wiki: * Expand your knowledge - AutoIt Wiki * Collection of User Defined Functions * How to use HelpFile * Good coding practices in AutoIt * OpenOffice/LibreOffice/XLS Related: WriterDemo.au3 * XLS/MDB from scratch with ADOX IE Related: * How to use IE.au3 UDF with AutoIt v3.3.14.x * Why isn't Autoit able to click a Javascript Dialog? * Clicking javascript button with no ID * IE document >> save as MHT file * IETab Switcher (by LarsJ ) * HTML Entities * _IEquerySelectorAll() (by uncommon) * IE in TaskScheduler * IE Embedded Control Versioning (use IE9+ and HTML5 in a GUI) * PDF Related: * How to get reference to PDF object embeded in IE * IE on Windows 11 * I encourage you to read: * Global Vars * Best Coding Practices * Please explain code used in Help file for several File functions * OOP-like approach in AutoIt * UDF-Spec Questions * EXAMPLE: How To Catch ConsoleWrite() output to a file or to CMD *I also encourage you to check awesome @trancexx code: * Create COM objects from modules without any demand on user to register anything. * Another COM object registering stuff * OnHungApp handler * Avoid "AutoIt Error" message box in unknown errors * HTML editor * winhttp.au3 related : * https://www.autoitscript.com/forum/topic/206771-winhttpau3-download-problem-youre-speaking-plain-http-to-an-ssl-enabled-server-port/ "Homo sum; humani nil a me alienum puto" - Publius Terentius Afer"Program are meant to be read by humans and only incidentally for computers and execute" - Donald Knuth, "The Art of Computer Programming" , be and \\//_. Anticipating Errors : "Any program that accepts data from a user must include code to validate that data before sending it to the data store. You cannot rely on the data store, ...., or even your programming language to notify you of problems. You must check every byte entered by your users, making sure that data is the correct type for its field and that required fields are not empty." Signature last update: 2023-04-24 Link to comment Share on other sites More sharing options...
jchd Posted July 24, 2018 Share Posted July 24, 2018 Since you convert XML into CVS I understand your format is fixed and doesn't rely on variable structure XML allows. Maybe you could just parse the XML source using a regexp then directly populate your DB from the resulting array, all within AutoIt. Another avenue: eqsily convert fixed-format XML inJSON then directly import it into DB; SQLite handles that for instance. The drawback is then that the data is still encapsulated in JSON, but specific SQL[ite]functions permit selecting stuff in there as well, albeit not as easily compared to full-fledged SQL tables. This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe hereRegExp tutorial: enough to get startedPCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta. SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt) Link to comment Share on other sites More sharing options...
TheDcoder Posted July 25, 2018 Share Posted July 25, 2018 4 hours ago, jchd said: eqsily convert fixed-format XML inJSON then directly import it into DB; SQLite handles that for instance. Does SQLite handle the converstion from XML to JSON as well? If not, how would we convert fixed-format XML to JSON? EasyCodeIt - A cross-platform AutoIt implementation - Fund the development! (GitHub will double your donations for a limited time) DcodingTheWeb Forum - Follow for updates and Join for discussion Link to comment Share on other sites More sharing options...
jchd Posted July 25, 2018 Share Posted July 25, 2018 "No" and "StringRegExp" resp. This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe hereRegExp tutorial: enough to get startedPCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta. SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt) Link to comment Share on other sites More sharing options...
TheDcoder Posted July 25, 2018 Share Posted July 25, 2018 @jchd I'd not classify writing RegEx for converting XML to JSON "easy", I recently tried to do something similar (stripping elements from XML) using RegEx and even it was not too easy with some trivial bug . Unless someone has already written RegEx for converting XML -> JSON? EasyCodeIt - A cross-platform AutoIt implementation - Fund the development! (GitHub will double your donations for a limited time) DcodingTheWeb Forum - Follow for updates and Join for discussion Link to comment Share on other sites More sharing options...
jchd Posted July 25, 2018 Share Posted July 25, 2018 Remember we're talking of a (presumably) fixed-format XML source, convertible to pure CSV as a routine task. That means an array of series of similar entities. I'd wouldn't call "difficult" the task to convert from the first form to any of the next: XML <people> <firstname>Jimmy</firstname><lastname>Hendrix</lastname> <firstname>Jim</firstname><lastname>Morrison</lastname> <firstname>Frank</firstname><lastname>Zappa</lastname> <firstname>Robert</firstname><lastname>Wyatt</lastname> <firstname>John</firstname><lastname>Coltrane</lastname> </people> JSON { "people": [ { "firstname":"Jimmy", "lastname":"Hendrix" }, { "firstname":"Jim", "lastname":"Morrison" }, { "firstname":"Frank", "lastname":"Zappa" }, { "firstname":"Robert", "lastname":"Wyatt" }, { "firstname":"John", "lastname":"Coltrane" } ] } CSV "firstname","lastname" "Jimmy","Hendrix" "Jim","Morrison" "Frank","Zappa" "Robert","Wyatt" "John","Coltrane" SQL insert into People ("firstname", "lastname") values ('Jimmy', 'Hendrix'), ('Jim', 'Morrison'), ('Frank', 'Zappa'), ('Robert', 'Wyatt'), ('John', 'Coltrane') ; Very basic string manipulation indeed. TheDcoder and Earthshine 1 1 This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe hereRegExp tutorial: enough to get startedPCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta. SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt) Link to comment Share on other sites More sharing options...
TheDcoder Posted July 25, 2018 Share Posted July 25, 2018 @jchd I see, it would be easy enough to process fixed-format XML I guess... Thanks for example EasyCodeIt - A cross-platform AutoIt implementation - Fund the development! (GitHub will double your donations for a limited time) DcodingTheWeb Forum - Follow for updates and Join for discussion Link to comment Share on other sites More sharing options...
jchd Posted July 25, 2018 Share Posted July 25, 2018 (edited) It's still plain that arbitrarily nested XML or JSON isn't a suitable source for CVS nor SQL table (both being equivalent to 2D arrays), but as far as I correctly understood OP's context, this isn't the situation he has to handle. Edited July 25, 2018 by jchd This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe hereRegExp tutorial: enough to get startedPCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta. SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt) Link to comment Share on other sites More sharing options...
Alodar Posted July 27, 2018 Author Share Posted July 27, 2018 Indded. What I have is a context with mutiple (variable number) of subset items within it that i need to split. e.g. <projects> <projectlist> <projectid>Project1</projectid> <tasks> <task>Code1</task> <taskinfo>blah blah</taskinfo> <taskinfo2>blah blah</taskinfo2> <taskinfo3>blah blah</taskinfo3> <task>Code2</task> <taskinfo>blah blah</taskinfo> <taskinfo2>blah blah</taskinfo2> <taskinfo3>blah blah</taskinfo3> </tasks> </projectlist> <projectlist> <projectid>Project2</projectid> <tasks> <task>Code1</task> <taskinfo>blah blah</taskinfo> <taskinfo2>blah blah</taskinfo2> <taskinfo3>blah blah</taskinfo3> <task>Code2</task> <taskinfo>blah blah</taskinfo> <taskinfo2>blah blah</taskinfo2> <taskinfo3>blah blah</taskinfo3> </tasks> </projectlist> </projects> And it's a huge list; 1000 items+. My current plan of using XML.au3 and then looping through each 'set' and placing it in arrays and then writing it to separate csvs (one for tasks and their info, and one for project id's and it's info - it has info the same way the tasks do) works, but it's a bit slow. Perhaps clunky? Dunno, but i can't think of a faster / smoother way to do it? Link to comment Share on other sites More sharing options...
jchd Posted July 27, 2018 Share Posted July 27, 2018 I see; the structure you get isn't as "flat" as you would want for a simple conversion. If your RDBMS offers support for JSON with functions to search the JSON tree, you could import the source convertedto JSON. Then use DB functions to populate regular tables from there. Alternatively, searching the XML by regexp to first get an array of projects, then for each project an array of tasks then for each task an array of steps. From there you can directly populate the SQL DB tables with their foreign keys to link data between entities. That would be my own choice. This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe hereRegExp tutorial: enough to get startedPCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta. SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt) Link to comment Share on other sites More sharing options...
Alodar Posted July 30, 2018 Author Share Posted July 30, 2018 I'd love to do that, but unfortunately I'm only allowed to submit csv files to it. Unless there's a faster way to read the xml (300mb), then I don't know what to do. I like the idea of using regex to read it line by line, but will that be qualitatively faster than using the XML dom? Link to comment Share on other sites More sharing options...
jchd Posted July 31, 2018 Share Posted July 31, 2018 Only real-world tests will say. This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe hereRegExp tutorial: enough to get startedPCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta. SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt) Link to comment Share on other sites More sharing options...
kammses9 Posted January 1, 2019 Share Posted January 1, 2019 You may also use plugins for XML Conversion. Link to comment Share on other sites More sharing options...
Recommended Posts
Create an account or sign in to comment
You need to be a member in order to leave a comment
Create an account
Sign up for a new account in our community. It's easy!
Register a new accountSign in
Already have an account? Sign in here.
Sign In Now