sakej Posted July 1, 2019 (edited)

Hi guys,

I need a script that reads all the files in a folder and looks for a string in those files. The files have multiple lines. At the end, the script should display how many times the string was found across all files. There will be around 300 files, and the expected number of occurrences of the string in those files is around 300-400k.

I wrote some code for this and it works in a test environment with a couple of files that have a couple of lines each. What I don't know is how efficient it will be in the case described above. So my request is for you guys to look at my code and let me know whether it will be OK, or whether there are other ways to do this more efficiently.

Here's the code:

    #include <File.au3>
    #include <Array.au3>

    Global $bArray, $found = 0, $stringToLookFor = "PB11"

    look4stringInManyFiles()

    Func look4stringInManyFiles()
        $where2look4files = "C:\temp\test\"
        $aArray = _FileListToArray($where2look4files, "*.txt", 0, True)
        For $i = 1 To UBound($aArray) - 1
            $path2file = $aArray[$i]
            _FileReadToArray($path2file, $bArray)
            For $a = 0 To UBound($bArray) - 1
                If StringInStr($bArray[$a], $stringToLookFor) Then $found = $found + 1
            Next
        Next
        MsgBox(0, "", $found&" instances of the string were found in all files")
    EndFunc   ;==>look4stringInManyFiles

Thanks!

Edited July 1, 2019 by sakej: typo

BigDaddyO Posted July 1, 2019 (edited)

Replace your For loop with this one and you should be good.

    For $i = 1 to UBound($aArray) - 1
        $hFile = FileOpen($aArray[$i])
        $sContent = FileRead($hFile)
        FileClose($hFile)
        $sContent = StringReplace($sContent, $stringToLookFor, "")
        $found += @extended
    Next

Edited July 1, 2019 by BigDaddyO

sakej (Author) Posted July 1, 2019

@BigDaddyO thanks for the suggestion. I ran both versions of the code a couple of times on my small-scale test, and your version always took longer to complete. The array version finished in around 1.7, whereas opening and closing the files needed around 2.4.

I don't want to be a smartass, as I'm the one asking for help here, but this makes me think my initial approach will be better in this scenario. Anyone?

Network_Guy Posted July 1, 2019

Weird, I get "error: syntax error (illegal character)" from your function name "look4stringInManyFiles" and from "$where2look4files".

TheXman Posted July 1, 2019 (edited)

Why don't you just use one of the numerous GREP for Windows command line programs? Most have a switch that will provide just a count of matches. That would be much faster than any grep-like logic that you could create in AutoIt.

Edited July 1, 2019 by TheXman

pixelsearch Posted July 1, 2019

Hi sakej,

It seems that a RegExp approach gives a faster result. But one has to be very careful when launching several tests one after the other, as the cache can still hold the files read by the previous test.

    #include <File.au3>

    Global $bArray, $found = 0, $stringToLookFor = "PB11"

    look4stringInManyFiles()

    Func look4stringInManyFiles()
        $where2look4files = "C:\temp\test\"
        $aArray = _FileListToArray($where2look4files, "*.txt", 1, True)
        For $i = 1 To UBound($aArray) - 1
            $path2file = $aArray[$i]

            ;~ _FileReadToArray($path2file, $bArray)
            ;~ For $a = 0 To UBound($bArray) - 1
            ;~     If StringInStr($bArray[$a], $stringToLookFor) Then $found = $found + 1
            ;~ Next

            $sFileContent = FileRead($path2file)
            $cArray = StringRegExp($sFileContent, '(?i)' & $stringToLookFor, $STR_REGEXPARRAYGLOBALMATCH)
            If @error = 0 Then $found = $found + Ubound($cArray)
        Next
        MsgBox(0, "", $found & " instances of the string were found in all files")
    EndFunc   ;==>look4stringInManyFiles

Some remarks:

=> (?i) in the RegExp makes the match case-insensitive (to match your StringInStr() parameters).
=> I changed one parameter from 0 to 1 in _FileListToArray() to return files only, not files + folders.
=> If the RegExp way brings a few more results, the explanation should be that "PB11" was found more than once in a line: the StringInStr() loop counts each line at most once, so it ignores a 2nd occurrence of "PB11" in the same line.

One should probably add timers to the script and launch it (with, then without, the RegExp way) at different times of the day, but certainly not test both ways one after the other.

Good luck

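The timing advice above could be sketched roughly as follows. This is only an illustration, not code posted in the thread: the helper names _CountWithRegExp and _CountWithStringReplace and the $bUseRegExp switch are made up here, the folder and search string are the ones from the posts, and wrapping the search string in \Q...\E (so it is treated literally inside the pattern) is an added assumption. TimerInit() and TimerDiff() are AutoIt's built-in timer functions; TimerDiff() returns the elapsed time in milliseconds.

```autoit
#include <File.au3>
#include <StringConstants.au3>

; Folder and search string as used in the thread.
Local $sFolder = "C:\temp\test\"
Local $sNeedle = "PB11"

; Hypothetical switch: time ONE method per run, then flip it and run again
; later, so a warm file cache does not favor whichever method runs second.
Local $bUseRegExp = True

Local $aFiles = _FileListToArray($sFolder, "*.txt", 1, True) ; 1 = files only, True = full paths
If @error Then Exit

Local $iFound = 0
Local $hTimer = TimerInit()
If $bUseRegExp Then
    $iFound = _CountWithRegExp($aFiles, $sNeedle)
Else
    $iFound = _CountWithStringReplace($aFiles, $sNeedle)
EndIf
Local $fElapsed = TimerDiff($hTimer) ; milliseconds

ConsoleWrite($iFound & " occurrences, " & Round($fElapsed) & " ms" & @CRLF)

; Counts every case-insensitive occurrence using a global RegExp match.
Func _CountWithRegExp(ByRef $aFiles, $sNeedle)
    Local $iCount = 0, $aMatches
    For $i = 1 To $aFiles[0] ; element [0] holds the number of files
        $aMatches = StringRegExp(FileRead($aFiles[$i]), "(?i)\Q" & $sNeedle & "\E", $STR_REGEXPARRAYGLOBALMATCH)
        If Not @error Then $iCount += UBound($aMatches)
    Next
    Return $iCount
EndFunc   ;==>_CountWithRegExp

; Counts occurrences via StringReplace(), which reports the number of
; replacements in @extended (case-insensitive by default).
Func _CountWithStringReplace(ByRef $aFiles, $sNeedle)
    Local $iCount = 0
    For $i = 1 To $aFiles[0]
        StringReplace(FileRead($aFiles[$i]), $sNeedle, "")
        $iCount += @extended
    Next
    Return $iCount
EndFunc   ;==>_CountWithStringReplace
```

Both helpers count every occurrence, so their totals should agree with each other, but either can exceed the total from the per-line StringInStr() loop when a line contains the string more than once.
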
Zedna Posted July 2, 2019 (edited)

Try this version, I think it should be fast ...

    $stringToLookFor = StringUpper($stringToLookFor)

    For $i = 1 to UBound($aArray) - 1
        $hFile = FileOpen($aArray[$i], 0) ; mode=read
        $sContent = StringUpper(FileRead($hFile))
        FileClose($hFile)
        StringReplace($sContent, $stringToLookFor, "", 0, 1) ; casesense=1
        $found += @extended
    Next

Edited July 2, 2019 by Zedna

BigDaddyO Posted July 2, 2019

15 hours ago, sakej said:

    "@BigDaddyO thanks for suggestion. I ran both version of code a couple times on my small scale test and your version actually always took longer to complete. Arrays did it in around 1.7 where opening and closing files needed around 2.4 I don't want to be smartass as I'm here asking for help but this makes me think that my initial approach will be better in this scenario. Anyone?"

As the help file says, you will only see the improvement from doing a FileOpen on larger files. Perhaps you should try testing on a few real files if you can. I assumed they were large, as you expect 300 - 400k finds per file.

Earthshine Posted July 2, 2019

Now you guys have me adding grep to my ultra super fast file finder... lol

sakej (Author) Posted July 3, 2019

Thank you all for your input, I'm really grateful for it. I'll try all of these approaches and pick the fastest (I don't need to care much about resources).

Nine Posted July 3, 2019

@sakej Take a look at the FindStr command line tool, which comes natively with Windows...