michaelslamet Posted May 19, 2013 Share Posted May 19, 2013 I have array1 that contain about 5000 records and growing. I have another array (array2) that contain also about 5000 records and keep growing too. For each record on array2, I search it againts array1. If found that I'm doing some operation on it. This is the code. Is there any posibilities to improve the searching speed? Sort the array and use _ArrayBinarySearch maybe? Thank you in advance! For $n = 0 to Ubound($array1)-1 $array1_split = StringSplit($array1[$n], "|") $what_to_search = $array1_split[1] $search_result_index = _ArraySearch($array2, $what_to_search, 0, 0, 0, 1) ; for each array1, search on array2, not case sensitive If Not @error Then ; if found then ; doing something here EndIf Next Link to comment Share on other sites More sharing options...
water Posted May 19, 2013 Share Posted May 19, 2013 Are the arrays sorted? My UDFs and Tutorials: Spoiler UDFs: Active Directory (NEW 2024-07-28 - Version 1.6.3.0) - Download - General Help & Support - Example Scripts - Wiki ExcelChart (2017-07-21 - Version 0.4.0.1) - Download - General Help & Support - Example Scripts OutlookEX (2021-11-16 - Version 1.7.0.0) - Download - General Help & Support - Example Scripts - Wiki OutlookEX_GUI (2021-04-13 - Version 1.4.0.0) - Download Outlook Tools (2019-07-22 - Version 0.6.0.0) - Download - General Help & Support - Wiki PowerPoint (2021-08-31 - Version 1.5.0.0) - Download - General Help & Support - Example Scripts - Wiki Task Scheduler (2022-07-28 - Version 1.6.0.1) - Download - General Help & Support - Wiki Standard UDFs: Excel - Example Scripts - Wiki Word - Wiki Tutorials: ADO - Wiki WebDriver - Wiki Link to comment Share on other sites More sharing options...
michaelslamet Posted May 19, 2013 Author Share Posted May 19, 2013 To water: no, they are not sorted. Link to comment Share on other sites More sharing options...
water Posted May 19, 2013 Share Posted May 19, 2013 Would it be a problem to sort them? I think searching would be much faster then. My UDFs and Tutorials: Spoiler UDFs: Active Directory (NEW 2024-07-28 - Version 1.6.3.0) - Download - General Help & Support - Example Scripts - Wiki ExcelChart (2017-07-21 - Version 0.4.0.1) - Download - General Help & Support - Example Scripts OutlookEX (2021-11-16 - Version 1.7.0.0) - Download - General Help & Support - Example Scripts - Wiki OutlookEX_GUI (2021-04-13 - Version 1.4.0.0) - Download Outlook Tools (2019-07-22 - Version 0.6.0.0) - Download - General Help & Support - Wiki PowerPoint (2021-08-31 - Version 1.5.0.0) - Download - General Help & Support - Example Scripts - Wiki Task Scheduler (2022-07-28 - Version 1.6.0.1) - Download - General Help & Support - Wiki Standard UDFs: Excel - Example Scripts - Wiki Word - Wiki Tutorials: ADO - Wiki WebDriver - Wiki Link to comment Share on other sites More sharing options...
UEZ Posted May 19, 2013 Share Posted May 19, 2013 Or use a database instead. Br, UEZ Please don't send me any personal message and ask for support! I will not reply! Selection of finest graphical examples at Codepen.io The own fart smells best! ✌Her 'sikim hıyar' diyene bir avuç tuz alıp koşma!¯\_(ツ)_/¯ ٩(●̮̮̃•̃)۶ ٩(-̮̮̃-̃)۶ૐ Link to comment Share on other sites More sharing options...
michaelslamet Posted May 19, 2013 Author Share Posted May 19, 2013 @Water: no problem to sort them. Is there anybody here ever compare non-sorted array + _ArraySearch versus sorted-array + _ArrayBinarySearch? @UEZ: actually I read those Array1 and Array2 data from different MySQL database (one on LAN and another on internet) and then put them into array Doing wrong? Link to comment Share on other sites More sharing options...
UEZ Posted May 19, 2013 Share Posted May 19, 2013 If you compare array 1 against a non sorted array 2 then you need O(n^2) whereas searching within a sorted array using _ArrayBinarySearch() should to in O(logn). Sorting an array using Quicksort takes also (n*logn). More info at Wikipedia (section Orders of common functions) That means sorting and comparing should give you a benefit. Br, UEZ Please don't send me any personal message and ask for support! I will not reply! Selection of finest graphical examples at Codepen.io The own fart smells best! ✌Her 'sikim hıyar' diyene bir avuç tuz alıp koşma!¯\_(ツ)_/¯ ٩(●̮̮̃•̃)۶ ٩(-̮̮̃-̃)۶ૐ Link to comment Share on other sites More sharing options...
kylomas Posted May 19, 2013 Share Posted May 19, 2013 michaelslamet, Could you get the DB to do the work for you? Something like: 1 - create the array of search arguments 2 - select * from "data to search" where "some column" =/like "search argument" kylomas Forum Rules Procedure for posting code "I like pigs. Dogs look up to us. Cats look down on us. Pigs treat us as equals." - Sir Winston Churchill Link to comment Share on other sites More sharing options...
water Posted May 19, 2013 Share Posted May 19, 2013 Even an Excel workbook can be accessed like a database. My UDFs and Tutorials: Spoiler UDFs: Active Directory (NEW 2024-07-28 - Version 1.6.3.0) - Download - General Help & Support - Example Scripts - Wiki ExcelChart (2017-07-21 - Version 0.4.0.1) - Download - General Help & Support - Example Scripts OutlookEX (2021-11-16 - Version 1.7.0.0) - Download - General Help & Support - Example Scripts - Wiki OutlookEX_GUI (2021-04-13 - Version 1.4.0.0) - Download Outlook Tools (2019-07-22 - Version 0.6.0.0) - Download - General Help & Support - Wiki PowerPoint (2021-08-31 - Version 1.5.0.0) - Download - General Help & Support - Example Scripts - Wiki Task Scheduler (2022-07-28 - Version 1.6.0.1) - Download - General Help & Support - Wiki Standard UDFs: Excel - Example Scripts - Wiki Word - Wiki Tutorials: ADO - Wiki WebDriver - Wiki Link to comment Share on other sites More sharing options...
kylomas Posted May 19, 2013 Share Posted May 19, 2013 (edited) michaelslamet, Can you post an example of what $array1 and $array2 look like? kylomas edit: You can adapt a technique similar to the following. This searches a 5000 element array using the first "|" delimited string from another 5000 element array as the search argument. The search time is appx. .13 seconds (all 5000 search elements against the searched array). Although, It makes more sense to me to get the DB itself to do the work. expandcollapse popup#include<array.au3> ; generate array1 5000 elements with each element containing 1 to 20 "|" delimited strings of 3 to 10 uppercase characters local $array1[5000], $array2[5000], $str for $1 = 0 to ubound($array1) - 1 for $2 = 1 to random(1,20,1) for $3 = 1 to random(3,10,1) $str &= chr(random(65,90,1)) Next $str &= '|' Next $array1[$1] = $str $str = '' Next ; generate array2 with 5000 elements with each element containing 3 to 10 uppercase chars $str = '' for $1 = 0 to ubound($array2) - 1 for $2 = 1 to random(3,10,1) $str &= chr(random(65,90,1)) Next $array2[$1] = $str $str = '' Next local $st = timerinit() ; create a varname = to the value of each element in array2 prepended with "a" for $1 = 0 to ubound($array2) - 1 assign("a" & $array2[$1],1) next ; test if the first "|" delimited value of array 1 exists (is declared) and write output for $1 = 0 to ubound($array1) - 1 $aTmp = stringsplit($array1[$1],'|') if isdeclared(eval("a" & $aTMP[1])) then ; you would do your thing here... ConsoleWrite(stringformat('Hit at row = %04i value = %-20s',$1,$aTMP[1]) & @LF) EndIf next ConsoleWrite(stringformat('Time to search array1 against array2 = %2.4f seconds',timerdiff($st)/1000) & @LF) Good Luck, kylomas Edited May 19, 2013 by kylomas Forum Rules Procedure for posting code "I like pigs. Dogs look up to us. Cats look down on us. Pigs treat us as equals." - Sir Winston Churchill Link to comment Share on other sites More sharing options...
kylomas Posted May 19, 2013 Share Posted May 19, 2013 (edited) michaelslamet, This version of what I posted above is a little more "user friendly". It is also quicker (does all 5000 searches in appx. .06 seconds) expandcollapse popup#include<array.au3> #include<userudfs.au3> ; generate array1 5000 elements with each element containing 1 to 20 "|" delimited strings of 5 uppercase characters local $st = timerinit() local $array1[5000], $array2[5000], $str for $1 = 0 to ubound($array1) - 1 for $2 = 1 to random(1,20,1) for $3 = 1 to 5 $str &= chr(random(65,90,1)) Next $str &= '|' Next $array1[$1] = $str $str = '' Next ; generate array2 with 5000 elements with each element containing 5 uppercase chars $str = '' for $1 = 0 to ubound($array2) - 1 for $2 = 1 to 5 $str &= chr(random(65,90,1)) Next $array2[$1] = $str $str = '' Next ConsoleWrite(stringformat('> %-40s %2.4f seconds','Time to generate arrays ' ,timerdiff($st)/1000) & @LF) ; initialize PsuedoSearch with array to search $st = timerinit() _PsuedoSearchInit($array2) ConsoleWrite(stringformat('> %-40s %2.4f seconds','Time to initialize PsuedoSearch ' ,timerdiff($st)/1000) & @LF) ; find all occurences of the first '|' delimited string from the source array local $sSearchArg = '', $st = timerinit() for $1 = 0 to ubound($array1) - 1 $sSearchArg = stringleft($array1[$1],stringinstr($array1[$1],'|')-1) if _PsuedoSearch($sSearchArg) then ; you would do your thing here... ConsoleWrite(stringformat('Hit at row = %04i value = %-20s',$1,$sSearchArg) & @LF) EndIf next ConsoleWrite(stringformat('> %-40s %2.4f seconds','Total time for search ',timerdiff($st)/1000) & @LF) func _PsuedoSearchInit(byref $aTargetArray) for $1 = 0 to ubound($aTargetArray) - 1 assign("a" & $aTargetArray[$1],1,2) next endfunc func _PsuedoSearch($str) if isdeclared("a" & $str) then return true Else return false endif endfunc kylomas edit: corrected isdeclared statement Edited May 19, 2013 by kylomas Forum Rules Procedure for posting code "I like pigs. Dogs look up to us. Cats look down on us. Pigs treat us as equals." - Sir Winston Churchill Link to comment Share on other sites More sharing options...
jchd Posted May 19, 2013 Share Posted May 19, 2013 Another possible route: read the table (or query resultset) from the remote DB into a local array, import it to the local DB in a separate table then work on the lan DB with SQL. I don't believe you can attach both tables to avoid extracting one of the table. Might be faster than copying both tables into arrays then work painfully from there, specially if your tables are expected to grow: SQL will scale gracefully and offer all power you may need to process your data. This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe hereRegExp tutorial: enough to get startedPCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta. SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt) Link to comment Share on other sites More sharing options...
michaelslamet Posted May 20, 2013 Author Share Posted May 20, 2013 (edited) @kylomas: thanks a lot for the code example. Really appreciated it I have tried that, but on the "doing something here" part is i need to StringSplit a string which is the index I got from _ArraySearch This is my code: For $n = 0 to Ubound($array1)-1 $array1_split = StringSplit($array1[$n], "|") $what_to_search = $array1_split[1] $search_result_index = _ArraySearch($array2, $what_to_search, 0, 0, 0, 1) ; for each array1, search on array2, not case sensitive If Not @error Then ; if found then $array1_split = StringSplit($Array1[$n], "|") $array2_split = StringSplit($Array2[$search_result_index], "|") ; compare the value of $array1_split and $array2_split and doing something here EndIf Next Your example code doesn't returning $search_result_index so I cant do the compare. I implemented your example code into my code and it works so fast! (136 seconds versus 0.2 seconds), but I think the one that make it slow is _ArraySearch, so without your code get the $search_result_index, I cant really compare them. Thanks again! Edited May 20, 2013 by michaelslamet Link to comment Share on other sites More sharing options...
michaelslamet Posted May 20, 2013 Author Share Posted May 20, 2013 @jchd: thanks for offering another solution! This script should be run on a machine that dont have any database service running, so I dont think it's possible for now. Anyway, does SQLite need to be installed and run as a service? Or we can just put the file(s) in to the same dir with our script and execute the SQLite exe file everytime our script need it? Link to comment Share on other sites More sharing options...
michaelslamet Posted May 20, 2013 Author Share Posted May 20, 2013 @UEZ: thanks for the info and wiki link, this is a new knowledge for me! I read the filehelp that _ArraySort is using QuickSearch Algorithm, is there any faster function to do a search an array on AutoIT? Link to comment Share on other sites More sharing options...
jchd Posted May 20, 2013 Share Posted May 20, 2013 If you don't even have an ODBC (ADO rather) driver on your machine then how do you query/extract remote data and LAN data from both DBs? I must miss something. No, SQLite doesn't need any install/setup/maintainance, just a non-registered DLL which you can very well FileInstall within your script if you need a self-contained solution. If your input data is in fact a couple of CSV-style files (depending on which "standard" they use) then SQLite.exe can import them into a permanent or temporary local SQLite DB and perform queries on it. This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe hereRegExp tutorial: enough to get startedPCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta. SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt) Link to comment Share on other sites More sharing options...
BrewManNH Posted May 20, 2013 Share Posted May 20, 2013 ArrayBinarySearch is probably the fastest method you'll find to search sorted arrays. You can cut the time by 100 times over a non-binary search depending on the data. If I posted any code, assume that code was written using the latest release version unless stated otherwise. Also, if it doesn't work on XP I can't help with that because I don't have access to XP, and I'm not going to.Give a programmer the correct code and he can do his work for a day. Teach a programmer to debug and he can do his work for a lifetime - by Chirag GudeHow to ask questions the smart way! I hereby grant any person the right to use any code I post, that I am the original author of, on the autoitscript.com forums, unless I've specifically stated otherwise in the code or the thread post. If you do use my code all I ask, as a courtesy, is to make note of where you got it from. Back up and restore Windows user files _Array.au3 - Modified array functions that include support for 2D arrays. - ColorChooser - An add-on for SciTE that pops up a color dialog so you can select and paste a color code into a script. - Customizable Splashscreen GUI w/Progress Bar - Create a custom "splash screen" GUI with a progress bar and custom label. - _FileGetProperty - Retrieve the properties of a file - SciTE Toolbar - A toolbar demo for use with the SciTE editor - GUIRegisterMsg demo - Demo script to show how to use the Windows messages to interact with controls and your GUI. - Latin Square password generator Link to comment Share on other sites More sharing options...
michaelslamet Posted May 20, 2013 Author Share Posted May 20, 2013 I have ODBC installed on the machine But I dont have MySQL installed on the machine (well, i have it on development machine, but not in the production machine), so we cant store the data on the local MySQL database Or you mean store it on the LAN MySQL database? Oh gosh, I miss your point SQLite really seems great, I should take a time to learn it. Thanks jchd! Link to comment Share on other sites More sharing options...
michaelslamet Posted May 20, 2013 Author Share Posted May 20, 2013 @BrewManNH: so you said instead of using _ArraySearch on a non-sorted array, I should sort the array using _ArraySort and doing _ArrayBinarySearch on it? Link to comment Share on other sites More sharing options...
michaelslamet Posted May 20, 2013 Author Share Posted May 20, 2013 _ArrayBinarySearch may be the fastest method to search a sorted array, but what is the fastest method to sort the array? Link to comment Share on other sites More sharing options...
Recommended Posts
Create an account or sign in to comment
You need to be a member in order to leave a comment
Create an account
Sign up for a new account in our community. It's easy!
Register a new accountSign in
Already have an account? Sign in here.
Sign In Now