buymeapc (Author) Posted September 17, 2012
Ok, so I tried your code with the regex against the log I have (65MB), and it took about 42 seconds to search the entire log for the criteria specified in the variable. The loops method took about 17 seconds.
buymeapc Posted September 18, 2012 Author Posted September 18, 2012 Ok, I mocked up a quick demo of what I'm working with. Since not too many people have logs that are huge, I added that to the demo code below. It will create a 20MB text file in which to read from. The process (without the text file creation) takes about 60 seconds to run. Is there any better method to search the huge array for case-sensitive search criteria including partials?expandcollapse popup#include <array.au3> #include <file.au3> $fileName = @ScriptDir&"\Test.txt"; <---Sample file name to use $sText = 'struct pPob1 "0000000000h 000000029142000000848300000000000000030000000003762500000000000000000000000003762500000000 01780010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"' & @CRLF & _ 'struct pPob2 01 "V 000000079160013033010000000000601400000019020000000000000000000000000000000000010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"' & @CRLF & _ 'struct pPob2 02 "V 000000297090048917010000000002312800000065810000000000000000000000000000000000020000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"' & @CRLF & _ 'struct pAeb1 "87 104 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"' & @CRLF & _ 'struct pAeb_Dx "0000000000000000000000000000000000000000"' & @CRLF & _ 'struct pAeb_Op 1 "00613000000000000000000000000000000000000V 910000002000000100000000000"' & @CRLF & _ 'struct pAeb_Op 02 "00616000000000000000000000000000000000000V 910000002000000100000000000"' & @CRLF & _ 'ACE Edit V10, R4, EdtRC=00' & @CRLF & _ 'struct pPws1 
"0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"' & @CRLF & _ 'struct pOob2 "0000000000000000000000000000000000000000000000000000000000000000000000000"' & @CRLF & _ 'struct pPaths->system "C:\Inetpub\wwwroot\HSS\Data"' & @CRLF & _ 'struct irec "010001 09 2010100100600000838700010000050000000000000050000178001100000000005000175000000000000000000000000100981000010981000000001FS2010 0800002000 0800002000AL 10000000001010200 0800002000NATIONAL 08000020001010200 0000000000 000000000000000000000000000000000000000000000000000000000000000000000000000020106 0000000000ALLEDITSOFF 000000000000000000002175000178000000 0000L999 09 20101001 apc010h NA000"' & @CRLF & _ '13:07:53 opcode=16, OptRC=00, APC, Group=55/55 V10/10, GrpRC=00, Price=h /h , PrcRC=00' & @CRLF ; If the test text file is less than 20MB, then write to it until it is greater than 20MB If FileGetSize($fileName) < 20971520 Then ; Open the file and write a ton of data to it to make it huge $hFile = FileOpen($fileName, 1) Do Sleep(10) FileWrite($hFile, $sText) Until FileGetSize($fileName) > 20971520; Write to it until it's greater than 20MB - This could take a while! 
FileClose($hFile) EndIf ; Read the file into an array since I'll be using that array to import into a virtual listview later Dim $aNewFile, $aRestore, $aRest, $fCase _FileReadToArray($fileName, $aNewFile) ; This array is the final output which will be the same size as the above array and will let us know which lines have a corresponding match or partial match Dim $aFinal[UBound($aNewFile)] $aFinal[0] = UBound($aNewFile)-1 ; Create the search criteria array $aRestore = _AddSearchDefaults() $time = TimerInit() For $x = 1 To $aFinal[0];disinclude this 3 lines if you dont need "No Match" textual statement $aFinal[$x] = "No Match" Next If IsArray($aRestore) Then For $s = 1 To $aRestore[0][0] $fCase = $aRestore[$s][1] $aRest = $aRestore[$s][0] For $t = 1 To $aNewFile[0] If StringInStr($aNewFile[$t], $aRest, $fCase) Then $aFinal[$t] = "---------->Match!" Next Next EndIf _ArrayDisplay($aFinal, TimerDiff($time)) Exit ; ~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~* Func _AddSearchDefaults() ; This function creates an array of search criteria that is used to search the text file for Local $hiLiteDefaults[127][2], $iCount = 0 Local $aNumbers[127] = [126,"01","02","03","04","05","06","07","08","09","10","11","12","13","14","15","16","17","18","23","30","31","71","72","73","74","81","82","83","84", _ "88","89","90","95","96","97","98","99","01","02","05","06","07","08","09","10","11","12","13","14","15","16","17","18","19","20","21","22","23","24","25","26","27", _ "28","46","62","70","87","02","03","04","05","06","07","08","09","10","11","12","60","61","87","88","89","95","01","02","03","04","05","06","07","15","16","17","18", _ "19","20","21","22","23","24","88","89","91","92","93","94","95","96","97","98","99","01","02","03","04","05","06","07","08","09","10","13","14","15","87"] ; Populate the array with search terms $hiLiteDefaults[0][0] = 126 For $x = 1 To 126 Switch $x Case 1 To 38 $hiLiteDefaults[$x][0] = 
'struct pEcb "'&$aNumbers[$x] Case 39 To 69 $hiLiteDefaults[$x][0] = 'struct pOob1 "'&$aNumbers[$x] Case 70 To 86 $hiLiteDefaults[$x][0] = 'struct pAeb1 "'&$aNumbers[$x] Case 87 To 114 $hiLiteDefaults[$x][0] = 'struct pGob1 "'&$aNumbers[$x] Case 115 To 128 $hiLiteDefaults[$x][0] = 'struct pLeb1 "'&$aNumbers[$x] EndSwitch $hiLiteDefaults[$x][1] = 1 ; Make them all case sensitive for this example - this won't be the case outside of this example Next _ArrayDisplay($hiLiteDefaults, "Search Criteria") Return $hiLiteDefaults EndFuncHere is the topic I posted a while ago which is ultimately what I'm going for, but I couldn't figure out how to highlight text in a RichEdit control, which is why I switched to using a virtual listview instead. It was also much slower than a virtual listview, too. At least, this code was Thanks for all the help!
Spiff59 Posted September 21, 2012
If I were to make some assumptions from looking at your example above, I could offer some useful speedups, but I'm afraid I'd fall prey to the ASS-U-ME trap again. I'm guessing that either of these assumptions would be incorrect:
1. Since every search key in the example above begins with "struct", they all will?
2. Since, in the example, every line that contains a matching search key has the matching string beginning in column 1, they always will?
Lacking some pattern or rules regarding the content of the search keys or the location of the target data, I can't see an avenue for improving beyond the current brute-force method.
buymeapc (Author) Posted September 21, 2012
Quote: "1. Since every search key in the example above begins with "struct" they all will?"
Ok, I see what you mean. The search criteria could be anything. They don't have to start with "struct". It could be "ACE" as well. "Struct" was just what I had been using.
Quote: "2. Since in the example in every line that does contain a matching search key, the matching string begins in column 1 they always will?"
The search criteria could be in any part of the line, not necessarily at the beginning. They could also be somewhere in the middle, which makes things rather difficult, of course.
I'm basically trying to find any of the search criteria in the $aRestore array within the text file that is being read into the $aNewFile array. I'm really only after the line numbers of the lines that contain one of the search criteria. My ultimate goal is to display the text file in a GUI and highlight the lines that match the search criteria set by the user. So, if someone adds "hamburger" to the search array and it's found within one of the lines of the text file, that entire line is highlighted.
I hope I'm explaining this right...
Attached is a screenshot of something similar to what I'm trying to accomplish.
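To make the stated goal concrete, here is a minimal sketch that collects only the line numbers of matching lines. It loops over lines first and terms second, so it can stop testing a line as soon as one term matches. The sample arrays are hypothetical stand-ins for $aNewFile and $aRestore and use the same layout (count in element 0, case flag in column 1); this is an illustration under those assumptions, not the code from the thread.

```autoit
; Hypothetical sample data laid out like $aNewFile and $aRestore in the demo.
Local $aLines[4] = [3, "struct pAeb1 87", "I ate a hamburger", "nothing here"]
Local $aTerms[3][2] = [[2, ""], ["ACE", 1], ["hamburger", 0]]

Local $sHits = ""
For $t = 1 To $aLines[0]
    For $s = 1 To $aTerms[0][0]
        ; Third parameter of StringInStr: 1 = case sensitive, 0 = not case sensitive
        If StringInStr($aLines[$t], $aTerms[$s][0], $aTerms[$s][1]) Then
            $sHits &= $t & " "
            ExitLoop ; one hit is enough to highlight the line
        EndIf
    Next
Next
ConsoleWrite("Matching line numbers: " & $sHits & @CRLF)
```

Whether this ordering is faster than the term-first loops in the demo depends on how many lines match and how early; it would need timing against the real log.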
PhoenixXL Posted September 21, 2012
Since the data will most probably be strings (if not, this is useless), you can use 26 arrays, each holding the strings that start with a given letter of the alphabet. When you search for a value, take its first letter and search only the array with that index. For example, to search for 'Hello' you would read from $Data8.
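The bucket lookup described above could be sketched as follows. _BucketIndex is a hypothetical helper name, not part of any UDF, and the sketch only shows how a string maps to one of the 26 buckets. Note that bucketing by first letter only narrows the search when a match must start at the beginning of the stored string; the original poster has said the criteria can appear anywhere in a line, so this would not cover that case on its own.

```autoit
; Hypothetical helper: map a string to a bucket number by its first character.
; Returns 1..26 for A..Z (case-insensitive) and 0 for anything else.
Func _BucketIndex($sText)
    Local $iCode = Asc(StringUpper(StringLeft($sText, 1)))
    If $iCode >= 65 And $iCode <= 90 Then Return $iCode - 64
    Return 0
EndFunc

; "Hello" starts with H, the 8th letter, which corresponds to $Data8 in the post.
ConsoleWrite(_BucketIndex("Hello") & @CRLF)
; A string that starts with a non-letter falls into the catch-all bucket.
ConsoleWrite(_BucketIndex('"0000 data') & @CRLF)
```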
czardas Posted September 21, 2012
Okay, I think I get what you are trying to do. Here are my thoughts on the subject:
You do not need to search a line again after a match has already been found in it. So for each new search pattern, you can skip those array elements on subsequent runs. The increase in speed will depend on the percentage of matches you expect to find: the higher that percentage, the greater the difference in speed should be.
There are several ways to achieve this. I would shift matching elements to the end of the array. The first match found is swapped with the final element, UBound(Array) - 1, the second match is swapped with UBound(Array) - 2, and so on. The next time you loop through the array, you only go as far as UBound(Array) - $matches_Already_Found - 1. This is a complicated procedure and requires careful handling of the loop iteration count and of determining when to quit a loop.
Edited September 21, 2012 by czardas
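A minimal sketch of the swap-to-end idea follows. It makes one assumption that differs from the description above: instead of swapping the text lines themselves, it swaps entries in a parallel index array ($aIdx, a hypothetical name), so the original line order and line numbers stay intact for later display. The sample arrays are also hypothetical and 0-based.

```autoit
; Sketch only: $aLines and $aTerms are hypothetical 0-based sample arrays.
Local $aLines[5] = ["struct pAeb1 87", "ACE Edit V10", "plain text", "struct pOob2 00", "hamburger"]
Local $aTerms[2] = ["struct", "ACE"]

Local $iCount = UBound($aLines)
Local $aIdx[$iCount]     ; slots 0 .. $iLive-1 hold indices of unmatched lines
Local $aMatched[$iCount] ; match flags, kept in original line order
For $i = 0 To $iCount - 1
    $aIdx[$i] = $i
    $aMatched[$i] = False
Next

Local $iLive = $iCount, $p
For $s = 0 To UBound($aTerms) - 1
    $p = 0
    While $p < $iLive
        If StringInStr($aLines[$aIdx[$p]], $aTerms[$s], 1) Then
            $aMatched[$aIdx[$p]] = True
            ; Shrink the live region and move its last index into this slot.
            $iLive -= 1
            $aIdx[$p] = $aIdx[$iLive]
            ; $p is not advanced: the index just moved in still has to be tested.
        Else
            $p += 1
        EndIf
    WEnd
Next

For $i = 0 To $iCount - 1
    ConsoleWrite($i & ": " & $aMatched[$i] & @CRLF)
Next
```

The While loop is used instead of For because the upper bound shrinks during iteration, which is the "careful handling of the loop iteration count" the post warns about.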