Flaips Posted January 31, 2014

I have around 900 HTML files stored in a folder. Each file has a 4-digit number as its name, and each one contains a line with a string of text. What I want to do is store the name of the file plus that string in a .txt file, like so:

1222 - String 1
1443 - String 2

Is that possible, and can anyone help with it? Thank you all in advance.
jchd Posted February 1, 2014

Which "line"? The mere concept of line in html source is rather fuzzy.
Flaips Posted February 1, 2014 (Author)

When you open one of the .htm files in a text editor like Notepad++, line 87 of every file contains something like:

<td>Example name</td>

Each file has a different string between the <td> tags. I just want to save these particular lines (the tags can stay, but saving the text without them would be great) together with the name of the file, as I explained earlier. I hope that explains it all.
michaelslamet Posted February 1, 2014

Hi Flaips,

Yes, that is doable with AutoIt, and in fact it's not hard. Take a look at:

_FileListToArray
_StringBetween
FileWriteLine, or FileOpen and FileWrite

Write your code and post it here when you think you need help.

Good luck!
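For readers who want to see how those three functions might fit together, here is a minimal sketch, not a definitive implementation. It assumes the .htm files sit in the same folder as the script and that the text is on line 87, as described earlier in the thread; the output file name output.txt is made up for illustration.

```autoit
#include <File.au3>    ; _FileListToArray
#include <String.au3>  ; _StringBetween

; Assumptions: the .htm files are next to the script, the text is on line 87,
; and "output.txt" is a hypothetical output file name.
Local $aFiles = _FileListToArray(@ScriptDir, "*.htm", 1) ; 1 = return files only
If @error Then Exit MsgBox(16, "Error", "No .htm files found in " & @ScriptDir)

Local $hOut = FileOpen(@ScriptDir & "\output.txt", 2) ; 2 = write mode, erase old content
If $hOut = -1 Then Exit MsgBox(16, "Error", "Could not open output.txt for writing")

Local $sLine, $aMatch
For $i = 1 To $aFiles[0] ; element 0 holds the number of files found
    $sLine = FileReadLine(@ScriptDir & "\" & $aFiles[$i], 87)
    $aMatch = _StringBetween($sLine, "<td>", "</td>")
    If @error Then ContinueLoop ; no <td>...</td> pair on that line, skip the file
    ; Strip the extension and write "name - string"
    FileWriteLine($hOut, StringRegExpReplace($aFiles[$i], "\.[^.]*$", "") & " - " & $aMatch[0])
Next
FileClose($hOut)
```

Opening the output file once and passing the handle to FileWriteLine avoids reopening the file for each of the roughly 900 lines.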
mikell Posted February 1, 2014

It seems this can be done the simplest way. Place the script in the same folder as the html files:

    #include <String.au3>
    #include <Array.au3>

    Local $array[1000][2], $n = 0
    $hSearch = FileFindFirstFile("*.html")
    While 1
        $sFileName = FileFindNextFile($hSearch)
        If @error Then ExitLoop ; no more files
        $n += 1 ; count the file only after a successful find
        $array[$n][0] = StringReplace($sFileName, ".html", "")
        $array[$n][1] = _StringBetween(FileReadLine($sFileName, 87), '<td>', '</td>')
    Wend
    FileClose($hSearch)
    $array[0][0] = $n ; row 0 holds the number of files
    Redim $array[$n+1][2] ; trim the unused rows
    _ArrayDisplay($array)
Flaips Posted February 1, 2014 (Author)

Replying to michaelslamet: Thanks, I will have a look at those later, if only to understand what's going on.

Replying to mikell: It did return all the file names, which is a great start, like so:

956|
0997|

There are 900 more numbers, of course, but for some reason it didn't return the names, only white space. So I tried playing around with the code and changed this:

    $array[$n][1] = _StringBetween(FileReadLine($sFileName, 87), '<td>', '</td>')

to this:

    $array[$n][1] = _StringBetween(FileReadLine($sFileName, 87), ' <td>', '</td>')

because of the white space at the start of the line. I don't know if that is necessary, but after that change it returned 0, like so:

956|
0997|0
0998|0

Any clue as to what is going on? Sorry for bothering you, and thanks for the help.
Malkey Posted February 2, 2014

The _StringBetween function returns an array. So, in your script you are storing an array in $array[$n][1], an array in an array.

If you are using the latest AutoIt release version, 3.3.10.2, or the latest beta version, then this should work. Note the trailing "[0]".

    $array[$n][1] = _StringBetween(FileReadLine($sFileName, 87), '<td>', '</td>')[0]
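One caveat with indexing the result directly: when the delimiters are not found, _StringBetween sets @error and does not return an array, which is likely why the ' <td>' variant showed 0. A guarded version might look like the sketch below. The helper name _GetCellText is hypothetical, and the line number 87 is taken from the thread.

```autoit
#include <String.au3>

; Hypothetical helper: returns the text between <td> and </td> on line 87
; of $sFile, or an empty string when the line has no such pair.
Func _GetCellText($sFile)
    Local $aMatch = _StringBetween(FileReadLine($sFile, 87), "<td>", "</td>")
    If @error Or Not IsArray($aMatch) Then Return ""
    Return StringStripWS($aMatch[0], 3) ; 3 = strip leading and trailing whitespace
EndFunc
```

In the loop this would be called as $array[$n][1] = _GetCellText($sFileName), so a file with an unexpected line 87 produces an empty cell instead of stopping the script.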
Flaips Posted February 2, 2014 (Author, Solution)

Replying to Malkey: Yup, that worked perfectly, thank you all for the help ^^.

In case anyone ever finds themselves with the same problem, this is the code I used:

    #include <String.au3>
    #include <Array.au3>

    Local $array[1000][2], $n = 0
    $hSearch = FileFindFirstFile("*.htm")
    While 1
        $sFileName = FileFindNextFile($hSearch)
        If @error Then ExitLoop ; no more files
        $n += 1 ; count the file only after a successful find
        $array[$n][0] = StringReplace($sFileName, ".htm", "")
        ; _StringBetween returns an array, so take its first element
        $array[$n][1] = _StringBetween(FileReadLine($sFileName, 87), '<td>', '</td>')[0]
    Wend
    FileClose($hSearch)
    $array[0][0] = $n ; row 0 holds the number of files
    Redim $array[$n+1][2] ; trim the unused rows
    _ArrayDisplay($array)

Thanks everyone for the help and support.
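The script above only displays the result. To get the text file the original question asked for, the rows could be written out along these lines. This is a sketch under assumptions: the helper name _WriteRows and the file name names.txt are made up, and row 0 is assumed to hold the count, as in the script above.

```autoit
; Hypothetical helper: writes each row of a 2D array as "name - string".
; Row 0 is assumed to hold the count and is skipped.
Func _WriteRows(Const ByRef $aRows, $sPath)
    Local $hOut = FileOpen($sPath, 2) ; 2 = write mode, erase old content
    If $hOut = -1 Then Return SetError(1, 0, False)
    For $i = 1 To UBound($aRows) - 1
        If $aRows[$i][0] == "" Then ContinueLoop ; skip empty rows
        FileWriteLine($hOut, $aRows[$i][0] & " - " & $aRows[$i][1])
    Next
    FileClose($hOut)
    Return True
EndFunc

; Example call with made-up data in the format from the question
Local $aDemo[3][2] = [[2, ""], ["1222", "String 1"], ["1443", "String 2"]]
_WriteRows($aDemo, @ScriptDir & "\names.txt")
```

With the thread's script, the call would be _WriteRows($array, @ScriptDir & "\names.txt") placed after the Redim line.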