andygo Posted July 24, 2015
Hello, I have a script that can handle huge amounts of data. I feed it text files; it reads each line into an array, then compares the lines and does some string operations. All of this happens in a For...To...Next loop. The script only uses 15% CPU on my 8-core AMD. Can I force the script to use more CPU and therefore run faster? Would it make a speed difference to compile it as a 64-bit exe on 64-bit systems? Thank you for comments :)
mLipok Posted July 24, 2015
"the script only use 15% cpu of my 8core.amd"
1/8 = 12.5%
AutoIt is not multi-process / multi-threaded, so one script can keep only one of the 8 cores busy.
JohnOne Posted July 24, 2015
Try...ProcessSetPriority
water Posted July 24, 2015
Do not read the file line by line. Read it either as a whole or in bigger chunks into an array, and then process the array.
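A minimal sketch of the "read it as a whole" advice, assuming a reasonably recent AutoIt version that has the built-in FileReadToArray. The file name "input.txt" is a placeholder, not something from this thread.

```autoit
; Sketch: load a whole text file into a 0-based array with a single call.
; "input.txt" is a hypothetical file name.
Local $aLines = FileReadToArray("input.txt")
If @error Then
    ConsoleWrite("Could not read the file, @error = " & @error & @CRLF)
    Exit 1
EndIf

; All further work happens in memory, without touching the disk again.
For $i = 0 To UBound($aLines) - 1
    ; process $aLines[$i] here
Next
ConsoleWrite(UBound($aLines) & " lines read" & @CRLF)
```

Reading the file once mainly removes disk overhead; it does not change how many comparisons the processing loop itself has to do.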
Jewtus Posted July 24, 2015
If you can use CSV files, I have had really good efficiency with this script (built by czardas).

; #FUNCTION# ==================================================================================
; Name...........: _CSVSplit
; Description ...: Converts a string in CSV format to a two dimensional array (see comments)
; Syntax.........: _CSVSplit ( $string [, $sDelim ] )
; Parameters ....: $string - The CSV string to convert
;                  $sDelim - Optional - Delimiter set to comma by default (see 2nd comment)
; Return values .: Success - Returns a two dimensional array or a one dimensional array (see 1st comment)
;                  Failure - Sets @error to:
;                  |@error = 1 - First parameter is not a valid string
;                  |@error = 2 - Second parameter is not a valid string
;                  |@error = 3 - Could not find suitable delimiter replacements
; Author ........: czardas
; Comments ......; Returns a one dimensional array if the input string does not contain the delimiter string
;                ; Some CSV formats use semicolon as a delimiter instead of a comma
;                ; Set the second parameter to @TAB To convert to TSV
; =============================================================================================
Func _CSVSplit($string, $sDelim = ",") ; Parses csv string input and returns a one or two dimensional array
    If Not IsString($string) Or $string = "" Then Return SetError(1, 0, 0) ; Invalid string
    If Not IsString($sDelim) Or $sDelim = "" Then Return SetError(2, 0, 0) ; Invalid string

    $string = StringRegExpReplace($string, "[\r\n]+\z", "") ; [Line Added] Remove trailing breaks
    Local $iOverride = 63743, $asDelim[3] ; $asDelim => replacements for comma, new line and double quote

    For $i = 0 To 2
        $asDelim[$i] = __GetSubstitute($string, $iOverride) ; Choose a suitable substitution character
        If @error Then Return SetError(3, 0, 0) ; String contains too many unsuitable characters
    Next
    $iOverride = 0

    Local $aArray = StringRegExp($string, '\A[^"]+|("+[^"]+)|"+\z', 3) ; Split string using double quotes delim - largest match
    $string = ""

    Local $iBound = UBound($aArray)
    For $i = 0 To $iBound -1
        $iOverride += StringInStr($aArray[$i], '"', 0, -1) ; Increment by the number of adjacent double quotes per element
        If Mod ($iOverride +2, 2) = 0 Then ; Acts as an on/off switch
            $aArray[$i] = StringReplace($aArray[$i], $sDelim, $asDelim[0]) ; Replace comma delimiters
            $aArray[$i] = StringRegExpReplace($aArray[$i], "(\r\n)|[\r\n]", $asDelim[1]) ; Replace new line delimiters
        EndIf
        $aArray[$i] = StringReplace($aArray[$i], '""', $asDelim[2]) ; Replace double quote pairs
        $aArray[$i] = StringReplace($aArray[$i], '"', '') ; Delete enclosing double quotes - not paired
        $aArray[$i] = StringReplace($aArray[$i], $asDelim[2], '"') ; Reintroduce double quote pairs as single characters
        $string &= $aArray[$i] ; Rebuild the string, which includes two different delimiters
    Next
    $iOverride = 0

    $aArray = StringSplit($string, $asDelim[1], 2) ; Split to get rows
    $iBound = UBound($aArray)
    Local $aCSV[$iBound][2], $aTemp
    For $i = 0 To $iBound -1
        $aTemp = StringSplit($aArray[$i], $asDelim[0]) ; Split to get row items
        If Not @error Then
            If $aTemp[0] > $iOverride Then
                $iOverride = $aTemp[0]
                ReDim $aCSV[$iBound][$iOverride] ; Add columns to accommodate more items
            EndIf
        EndIf
        For $j = 1 To $aTemp[0]
            If StringLen($aTemp[$j]) Then
                If Not StringRegExp($aTemp[$j], '[^"]') Then ; Field only contains double quotes
                    $aTemp[$j] = StringTrimLeft($aTemp[$j], 1) ; Delete enclosing double quote single char
                EndIf
                $aCSV[$i][$j -1] = $aTemp[$j] ; Populate each row
            EndIf
        Next
    Next

    If $iOverride > 1 Then
        Return $aCSV ; Multiple Columns
    Else
        For $i = 0 To $iBound -1
            If StringLen($aArray[$i]) And (Not StringRegExp($aArray[$i], '[^"]')) Then ; Only contains double quotes
                $aArray[$i] = StringTrimLeft($aArray[$i], 1) ; Delete enclosing double quote single char
            EndIf
        Next
        Return $aArray ; Single column
    EndIf
EndFunc ;==> _CSVSplit

; #INTERNAL_USE_ONLY# =========================================================================
; Name...........: __GetSubstitute
; Description ...: Searches for a character to be used for substitution, ie one not contained within the input string
; Syntax.........: __GetSubstitute($string, ByRef $iCountdown)
; Parameters ....: $string - The string of characters to avoid
;                  $iCountdown - The first code point to begin checking
; Return values .: Success - Returns a suitable substitution character not found within the first parameter
;                  Failure - Sets @error to 1 => No substitution character available
; Author ........: czardas
; Comments ......; This function is connected to the function _CSVSplit and was not intended for general use
;                ; $iCountdown is returned ByRef to avoid selecting the same character on subsequent calls to this function
;                ; Initially $iCountdown should be passed with a value = 63743
; =============================================================================================
Func __GetSubstitute($string, ByRef $iCountdown)
    If $iCountdown < 57344 Then Return SetError(1, 0, "") ; Out of options
    Local $sTestChar
    For $i = $iCountdown To 57344 Step -1
        $sTestChar = ChrW($i)
        $iCountdown -= 1
        If Not StringInStr($string, $sTestChar) Then
            Return $sTestChar
        EndIf
    Next
    Return SetError(1, 0, "") ; Out of options
EndFunc ;==> __GetSubstitute

You could make it run in parallel though (as several processes rather than real threads) by compiling the script as an exe and then building a script that does a Run command on the exe. (I've got a script that pulls the top record from my database, marks it as being processed and then does work on the record; I run 2-3 copies of it at a time using a Run command.)
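The "Run several copies" idea from the post above could be sketched roughly as follows. The worker name "worker.exe" and the three chunk file names are hypothetical; the sketch assumes the worker takes the file to process as its first command-line argument.

```autoit
; Sketch: start one compiled worker per input chunk, then wait for all of them.
; "worker.exe" and the part*.txt names are assumptions made for illustration.
Local $aChunks[3] = ["part1.txt", "part2.txt", "part3.txt"]
Local $aPIDs[UBound($aChunks)]

For $i = 0 To UBound($aChunks) - 1
    ; Run returns the PID of the new process, or 0 (with @error set) on failure.
    $aPIDs[$i] = Run('"' & @ScriptDir & '\worker.exe" "' & $aChunks[$i] & '"', @ScriptDir, @SW_HIDE)
    If @error Then ConsoleWrite("Could not start a worker for " & $aChunks[$i] & @CRLF)
Next

For $i = 0 To UBound($aPIDs) - 1
    If $aPIDs[$i] Then ProcessWaitClose($aPIDs[$i]) ; block until this worker has exited
Next
ConsoleWrite("All workers finished" & @CRLF)
```

This only helps when the chunks can be processed independently. Counting duplicates across a whole file would still need a final step that merges the per-chunk results.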
guinness Posted July 24, 2015
@Jewtus, please don't duplicate code around the forum. Instead, be polite and post a link back to where you got the original code from.
Providing a backlink is better than copying, because no doubt @czardas will update that particular function in the near future. By linking, you therefore guarantee an up-to-date function for those who come across your post. Unless you will be updating this post as well with every update they make?
Jewtus Posted July 24, 2015
I tried to find the original link some time back, to no avail. I'm now in the habit of putting the URL in the header when I copy code, for that very reason.
water Posted July 24, 2015 (edited)
This?
I found the link this way:
Edited July 24, 2015 by water
andygo (Author) Posted July 25, 2015
OK, I tried the 64-bit compile and the process priority; neither changed the speed.
I cleaned up my loop so it only runs the needed number of times:

$totalloops = 0
For $x = 1 To $array[0] - 1
    $totalloops += $x
Next

For a 3,000-item array I need (2,999 + 2,998 + 2,997 + ... + 1) loops, and this now takes only 13 seconds.
I can deal with that.
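As a rough check of the loop count in the post above (assuming n = 3,000 lines and the 13 seconds reported there), the sum has a closed form:

```latex
\[
\sum_{k=1}^{n-1} k = \frac{n(n-1)}{2}
\quad\Rightarrow\quad
\frac{3000 \cdot 2999}{2} = 4\,498\,500 \ \text{comparisons}
\]
\[
\frac{13\ \text{s}}{4\,498\,500} \approx 2.9\ \mu\text{s per comparison}
\]
```

So the work grows with the square of the line count: doubling the file size roughly quadruples the run time.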
jchd Posted July 25, 2015
Post your code: 13s for 3000 items seems pretty slow.
andygo (Author) Posted July 25, 2015 (edited)
I will post the code this afternoon; I'm still at work now.
Edited July 25, 2015 by andygo (wrote something wrong)
andygo (Author) Posted July 25, 2015

#include <File.au3>
#include <StaticConstants.au3>
#include <EditConstants.au3>
#include <ComboConstants.au3>
#include <GuiConstants.au3>
#include <WindowsConstants.au3>
#include <ButtonConstants.au3>
#include <IE.au3>
#include <Timers.au3>
#include <Process.au3>
#include <Misc.au3>
#include <Constants.au3>

Break(0)
Opt("TrayAutoPause", 0)
Opt("GuiOnEventMode", 1)

; Notice: "The tool may be flagged by some security programs. I expressly distance myself from malware in any form!"
If MsgBox(65, "Hinweis", "Das Tool wird ggf. von einigen Schutzprogrammen negativ erkannt." & @CRLF & @CRLF & "Ich distanziere mich hiermit ausdrücklich von Schadsoftware" & @CRLF & "in jeglicher Form!" & @CRLF & @CRLF & "©2015 andygo") = 2 Then Exit

Global $loopstop = 0, $v = 0, $what, $writecount, $aRecords, $starttime

$pre = GUICreate("EMM.Tool", 290, 150)
$laden = GUICtrlCreateButton("Datei laden", 4, 4, 90, 20) ; "Load file"
GUICtrlSetOnEvent($laden, "_verarbeiten")
$breakup = GUICtrlCreateCheckbox("STOP", 100, 4, 90, 20)
GUICtrlSetState($breakup, $GUI_DISABLE)
$save = GUICtrlCreateButton("Datei speichern", 195, 4, 90, 20) ; "Save file"
GUICtrlSetState($save, $GUI_DISABLE)
GUICtrlSetOnEvent($save, "_ausgabe")
$inf1 = GUICtrlCreateLabel("", 4, 55, 282, 20, $SS_CENTER)
$inf2 = GUICtrlCreateLabel("", 4, 75, 282, 20, $SS_CENTER)
$label = GUICtrlCreateLabel("Bereit...", 4, 110, 282, 40, $SS_CENTER) ; "Ready..."
GUICtrlSetFont($label, 16, 500)
GUISetOnEvent($GUI_EVENT_CLOSE, "_Quit2")
GUISetState(@SW_SHOW)

While 1
    Sleep(50)
WEnd

Func _Quit2()
    If MsgBox(68, "EMM.Tool", "Wirklich beenden?") = 6 Then Exit ; "Really quit?"
EndFunc

; Timer callback: refresh the progress labels and check the STOP checkbox
Func addup($a, $b, $c, $d)
    GUICtrlSetData($label, $what & $v)
    GUICtrlSetData($inf2, "In: " & $aRecords[0] & " Datensätze ** " & "Out: " & $writecount & " Datensätze") ; "Datensätze" = records
    If GUICtrlRead($breakup) = 1 Then $loopstop = 1
EndFunc

; Timer callback while saving
Func addsave($a, $b, $c, $d)
    GUICtrlSetData($label, "Schreibe... " & $writecount) ; "Writing..."
EndFunc

Func _verarbeiten() ; "process"
    Opt("GuiOnEventMode", 0)
    GUICtrlSetState($laden, $GUI_DISABLE)
    GUICtrlSetState($save, $GUI_DISABLE)
    GUICtrlSetState($breakup, $GUI_UNCHECKED)
    GUICtrlSetData($label, "Bereit...")
    GUICtrlSetData($inf1, "")
    GUICtrlSetData($inf2, "")
    $infile = FileOpenDialog("Datei", @ScriptDir & "\", "emm-log (*.*)", 1)
    If @error Then
        ;MsgBox(4096, "", "No File(s) chosen")
    Else
        If Not _FileReadToArray($infile, $aRecords) Then
            MsgBox(4096, "Fehler", "Fehlerhafte Datei") ; "Error", "Invalid file"
        Else
            $writecount = 0
            $v = StringSplit($infile, "\")
            GUICtrlSetData($inf1, $infile)
            GUICtrlSetData($inf2, $aRecords[0] & " Datensätze gelesen") ; "records read"
            $v = $aRecords[0]
            $what = "Prüfe... " ; "Checking..."
            GUICtrlSetState($breakup, $GUI_ENABLE)
            Global $dRecords[$aRecords[0] + 1], $eRecords[$aRecords[0] + 1], $i, $x, $y, $date
            _Timer_SetTimer($pre, 50, "addup")

            ; Pass 1: build the shortened output line ($dRecords) and the compare key ($eRecords)
            For $x = 1 To $aRecords[0]
                If $loopstop = 1 Then ExitLoop
                $v -= 1
                If StringLen($aRecords[$x]) > 0 Then
                    $bRecords = StringLeft($aRecords[$x], 59)
                    $bRecords = StringReplace($bRecords, StringMid($bRecords, 22, 20), " ")
                    $dRecords[$x] = StringReplace($bRecords, 33, "xxxxxx") ; numeric 2nd parameter = start position
                    $bRecords = StringSplit($aRecords[$x], "   ", 1) ; fields are separated by 3 spaces (collapsed to 1 in the forum rendering)
                    $eRecords[$x] = $bRecords[3] ; the long string
                EndIf
            Next

            ; Total number of comparisons, used only for the progress display
            $v = 0
            For $x = 1 To ($aRecords[0] - 1)
                $v += $x
            Next

            $what = "Vergleiche... " ; "Comparing..."
            $i = 1
            ; Main nested loop: every line is compared with every later line (this is the slow part)
            For $y = 1 To $aRecords[0] - 1
                If $loopstop = 1 Then ExitLoop
                For $x = $y + 1 To $aRecords[0]
                    If $loopstop = 1 Then ExitLoop 2
                    $v -= 1
                    If $eRecords[$y] = $eRecords[$x] And StringLen($eRecords[$y]) > 0 Then
                        $i += 1
                        $eRecords[$x] = ""
                        $date = StringLeft($dRecords[$x], 19)
                    EndIf
                Next
                If StringLen($eRecords[$y]) > 0 Then
                    $writecount += 1
                    If $i = 1 Then $dRecords[$y] = $dRecords[$y] & " " & $i
                    If $i > 1 Then $dRecords[$y] = $dRecords[$y] & " " & $i & " " & $date
                EndIf
                $i = 1
            Next

            _Timer_KillAllTimers($pre)
            GUICtrlSetData($inf2, "In: " & $aRecords[0] & " Datensätze ** " & "Out: " & $writecount & " Datensätze")
            GUICtrlSetState($breakup, $GUI_DISABLE)
            If $loopstop = 1 Then
                GUICtrlSetData($label, "Abbruch durch Benutzer") ; "Cancelled by user"
                GUICtrlSetState($breakup, $GUI_UNCHECKED)
                GUICtrlSetData($inf1, "")
                GUICtrlSetData($inf2, "")
                $loopstop = 0
            Else
                GUICtrlSetData($label, "Fertig :-)") ; "Done"
                GUICtrlSetState($save, $GUI_ENABLE)
            EndIf
        EndIf
    EndIf
    GUICtrlSetState($laden, $GUI_ENABLE)
    Opt("GuiOnEventMode", 1)
EndFunc

Func _ausgabe() ; "output"
    GUICtrlSetState($save, $GUI_DISABLE)
    GUICtrlSetState($laden, $GUI_DISABLE)
    Local $file = FileSaveDialog("Datei", @ScriptDir, "emm-log (*.txt)", 2 + 16)
    If @error Then
        GUICtrlSetData($label, "Fertig :-)")
    Else
        If StringInStr(StringRight($file, 4), ".") = 0 Then $file = $file & ".txt"
        Local $hFile = FileOpen($file, 2) ; keep the returned handle; the original tested the file name against -1
        If $hFile = -1 Then
            GUICtrlSetData($label, "Fertig :-)")
            MsgBox(0, "Fehler", "Speichern nicht möglich.") ; "Saving not possible."
        Else
            _Timer_SetTimer($pre, 50, "addsave")
            GUICtrlSetState($breakup, $GUI_ENABLE)
            FileWriteLine($hFile, "[code]")
            For $x = 1 To $aRecords[0]
                If StringLen($eRecords[$x]) > 0 Then
                    FileWriteLine($hFile, $dRecords[$x])
                    $writecount -= 1
                EndIf
            Next
            FileWriteLine($hFile, "[/code]")
            FileClose($hFile)
            _Timer_KillAllTimers($pre)
            GUICtrlSetState($breakup, $GUI_DISABLE)
            If $loopstop = 1 Then
                GUICtrlSetData($label, "Abbruch durch Benutzer")
                GUICtrlSetState($breakup, $GUI_UNCHECKED)
                $loopstop = 0
            Else
                GUICtrlSetData($label, "Datei gespeichert") ; "File saved"
            EndIf
        EndIf
    EndIf
    GUICtrlSetState($save, $GUI_ENABLE)
    GUICtrlSetState($laden, $GUI_ENABLE)
EndFunc

The main nested loop (between line 91 and line 108 of my script, i.e. the For $y / For $x block after "Vergleiche... ") is what takes the time.

Here is an example of 5 lines showing what a file fed to the script looks like. You can create a 3,000-line file from it.
The lines always start with a date-time stamp, always in the same format. Then come 3 spaces, then a block of 16 chars, then 3 spaces again. Up to this point every line has the same number of chars. Then follows a very long string, whose length can differ from line to line. It is always followed by 3 spaces and then the word 'written' or 'blocked'.

So, what my script does: it counts how many times each long string occurs in a file, so each line is compared with every other line. It counts the identical lines, shortens them a bit, writes the count after the shortened string and, after the count, the date-time of the last identical string. The output file then looks like this:

2015/02/15 04:48:55 8270870102xxxxxx81 1
2015/02/15 04:48:56 8270300102xxxxxx0F 1
2015/02/15 04:49:03 8270870102xxxxxx16 1
2015/02/15 04:49:07 8270870102xxxxxx70 1
2015/02/15 04:49:07 8270200102xxxxxx0F 1
2015/02/15 04:39:57 8270870102xxxxxx81 142 2015/02/20 12:01:20
2015/02/15 04:39:58 8270870102xxxxxx81 144 2015/02/20 11:51:56
2015/02/15 04:39:58 8270310102xxxxxx0F 1
2015/02/15 04:40:05 8270870102xxxxxx81 142 2015/02/20 11:51:57
2015/02/15 04:40:09 8270970102xxxxxx91 77 2015/02/20 11:57:40

inexample.txt
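The counting described above compares every line with every other line. One alternative, not used anywhere in this thread, is a single pass with a lookup table. The sketch below uses the Windows Scripting.Dictionary COM object; the short keys are stand-ins for the long strings and are invented for illustration.

```autoit
; Sketch: count how often each key occurs in one pass over the data.
; The sample keys are made up; in the script above they would be the $eRecords values.
Local $aKeys[6] = ["AAA", "BBB", "AAA", "CCC", "BBB", "AAA"]

Local $oCount = ObjCreate("Scripting.Dictionary")
If Not IsObj($oCount) Then Exit 1 ; COM object not available

For $i = 0 To UBound($aKeys) - 1
    If $oCount.Exists($aKeys[$i]) Then
        $oCount.Item($aKeys[$i]) = $oCount.Item($aKeys[$i]) + 1 ; seen before: increment
    Else
        $oCount.Add($aKeys[$i], 1) ; first occurrence
    EndIf
Next

; Print each distinct key with its count
For $sKey In $oCount.Keys()
    ConsoleWrite($sKey & " " & $oCount.Item($sKey) & @CRLF)
Next
```

Each line is looked up once, so the work grows roughly linearly with the number of lines instead of quadratically. Note that the dictionary compares keys case-sensitively by default, while AutoIt's = operator ignores case, so results can differ if the long strings vary only in letter case.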
LarsJ Posted July 26, 2015 (edited)
Sorting the data will make counting much faster. About 10 times faster.

#include <Array.au3>

Opt( "MustDeclareVars", 1 )

Example()

Func Example()
    ; 1. Read file
    Local $aArray1 = FileReadToArray( "inexample1.txt" )
    ;_ArrayDisplay( $aArray1 )

    ; 2. Create 2D-array
    Local $iRows = UBound( $aArray1 )
    Local $aArray2[$iRows][4], $aLine
    For $i = 0 To $iRows - 1
        $aLine = StringSplit ( $aArray1[$i], "   ", 3 ) ; 3 = $STR_ENTIRESPLIT + $STR_NOCOUNT; delimiter is 3 spaces (collapsed in the forum rendering)
        $aArray2[$i][0] = $aLine[0] ; Col 0 : Time
        $aArray2[$i][1] = $aLine[1] ; Col 1 : 16 chars
        $aArray2[$i][2] = $aLine[2] ; Col 2 : Long string
        $aArray2[$i][3] = $aLine[3] ; Col 3 : blocked/written
    Next
    ;_ArrayDisplay( $aArray2 )

    ; 3. Index based sort of 2D-array, $aArray2 ($aArray2 is not changed)
    Local $tIndex = DllStructCreate( "uint[" & $iRows & "]" )
    Local $pIndex = DllStructGetPtr( $tIndex )
    ; Sort 2D-array by column 2 (long string)
    ; Sort duplicates by column 0 (time)
    Local $aCmps[2][3] = [ _
        [ 2, 1, +1 ], _ ; Col 2: Compared as strings, asc
        [ 0, 1, +1 ] ]  ; Col 0: Compared as strings, asc
    SortArray( $aArray2, $pIndex, $tIndex, $aCmps )
    ; http://www.autoitscript.com/forum/index.php?showtopic=173279
    ; See post 15

    ; 4. Extract rows in sorted order
    Local $aSorted[$iRows][5], $k
    For $i = 0 To $iRows - 1
        $k = DllStructGetData($tIndex,1,$i+1)
        $aSorted[$i][0] = $aArray2[$k][0] ; Col 0 : Time
        $aSorted[$i][1] = $aArray2[$k][1] ; Col 1 : 16 chars
        $aSorted[$i][2] = $aArray2[$k][2] ; Col 2 : Long string
        $aSorted[$i][3] = $aArray2[$k][3] ; Col 3 : blocked/written
        $aSorted[$i][4] = $k ; Col 4 : Index in $aArray2 (to display results in file order instead of sort order)
    Next
    ;_ArrayDisplay( $aSorted )

    ; 5. Count equal strings in col 2
    Local $aResults[$iRows][5], $iRes = 0, $iCnt
    For $i = 0 To $iRows - 1 ; up to $iRows - 1 so that a unique last row is counted too
        $iCnt = 1
        $aResults[$iRes][0] = $aSorted[$i][0] ; First time
        $aResults[$iRes][1] = StringLeft( $aSorted[$i][2], 10 ) & "xxxxxx" & StringMid ( $aSorted[$i][2], 17, 2 )
        $aResults[$iRes][4] = $aSorted[$i][4] ; File index
        While $i < $iRows - 1 And $aSorted[$i][2] = $aSorted[$i+1][2]
            $i += 1
            $iCnt += 1
        WEnd
        $aResults[$iRes][2] = $iCnt ; Count
        $aResults[$iRes][3] = $aSorted[$i][0] ; Last time
        $iRes += 1
    Next
    ReDim $aResults[$iRes][5]
    ;_ArrayDisplay( $aResults )

    ; If the order of the results doesn't matter, you can skip step 6
    ; and create $aResults2 from $aResults without sorting.

    ; 6. Sort results in file order instead of sort order
    Local $tIndex = DllStructCreate( "uint[" & $iRes & "]" )
    Local $pIndex = DllStructGetPtr( $tIndex )
    ; Sort 2D-array by column 4 (file index)
    Local $aCmps[1][3] = [ _
        [ 4, 0, +1 ] ] ; Col 4: Compared as numbers, asc
    SortArray( $aResults, $pIndex, $tIndex, $aCmps )

    ; 7. Extract the rows in file order
    Local $aResults2[$iRes], $k
    For $i = 0 To $iRes - 1
        $k = DllStructGetData($tIndex,1,$i+1)
        $aResults2[$i] = $aResults[$k][0] & " " & _ ; First time
                         $aResults[$k][1] & " " & _ ; xxxxxx-string
                         StringFormat( "%3d", $aResults[$k][2] ) ; Count
        If $aResults[$k][2] > 1 Then $aResults2[$i] &= " " & $aResults[$k][3] ; Last time
        ConsoleWrite( $aResults2[$i] & @CRLF )
    Next
    ;_ArrayDisplay( $aResults2 )

    ; 8. Store results
    Local $s = "[code]" & @CRLF
    For $i = 0 To $iRes - 1
        $s &= $aResults2[$i] & @CRLF
    Next
    $s &= "[/code]" & @CRLF
    FileWrite( "inexample2.txt", $s )
EndFunc

Func SortArray( ByRef $aItems, $pIndex, $tIndex, $aCmps )
    Local $iCmps = UBound( $aCmps ), $c, $r, $v[$iCmps]
    Local $lo, $hi, $mi
    For $i = 0 To UBound( $aItems ) - 1
        For $j = 0 To $iCmps - 1
            $v[$j] = $aItems[$i][$aCmps[$j][0]] ; Values
        Next
        $lo = 0
        $hi = $i - 1
        While $lo <= $hi ; Binary search
            $r = 0 ; Compare result (-1,0,1)
            $j = 0 ; Index in $aCmps array
            $mi = Int( ( $lo + $hi ) / 2 )
            ; This While-loop handles sorting by multiple columns. Column values
            ; of the two rows are compared until a difference is found.
            While Not $r And $j < $iCmps
                $c = $aCmps[$j][0] ; Column
                Switch $aCmps[$j][1] ; Number/string
                    Case 0
                        ; Compare column values as numbers. The following line is an implementation of the spaceship
                        ; or three-way comparison operator for numbers, like StringCompare is for strings.
                        $r = ( $v[$j] < $aItems[DllStructGetData($tIndex,1,$mi+1)][$c] ? -1 : $v[$j] = $aItems[DllStructGetData($tIndex,1,$mi+1)][$c] ? 0 : 1 ) * $aCmps[$j][2] ; * $iCmpAsc
                    Case 1
                        ; Compare column values as strings. StringCompare is a spaceship or three-way comparison operator for strings.
                        $r = StringCompare( $v[$j], $aItems[DllStructGetData($tIndex,1,$mi+1)][$c] ) * $aCmps[$j][2] ; * $iCmpAsc
                EndSwitch
                $j += 1
            WEnd
            Switch $r
                Case -1
                    $hi = $mi - 1
                Case 1
                    $lo = $mi + 1
                Case 0 ; Equal
                    ExitLoop
            EndSwitch
        WEnd
        If $i > $mi Then _ ; Make space for the row number in index
            DllCall( "kernel32.dll", "none", "RtlMoveMemory", "struct*", $pIndex+($mi+1)*4, "struct*", $pIndex+$mi*4, "ulong_ptr", ($i-$mi)*4 )
        DllStructSetData( $tIndex, 1, $i, $mi+1+($lo=$mi+1) ) ; Insert row number $i at position $mi+1+($lo=$mi+1) in index
    Next
EndFunc

The zip contains inexample1.txt with 3000 lines and the results in inexample2.txt and inexample2-a.txt. The code is in tst00.au3 and tst00-a.au3. The a-versions are your versions.
Files.7z
Edited July 26, 2015 by LarsJ
jchd Posted July 26, 2015 (edited)
LarsJ,
Beware, you've a little glitch in grabbing the end of the hex part.
Here's a completely different approach, using SQLite:

#include <Array.au3> ; _ArrayDelete, _ArrayToString
#include <SQLite.au3>
;~ #include <SQLite.Dll.au3> ; comment this after installing the SQLite3.dll library

_SQLite_Startup()
Local $hDB = _SQLite_Open() ; memory DB
_SQLite_Exec($hDB, "begin")
_SQLite_Exec($hDB, _
        "CREATE TABLE andygo (" & _
        " Date CHAR, " & _
        " Hex CHAR, " & _
        " Count INTEGER DEFAULT 1, " & _
        " LastDate CHAR DEFAULT '');" & _
        "" & _
        "CREATE INDEX ixHex ON andygo (Hex);" & _
        "" & _
        "CREATE TRIGGER trIns " & _
        "BEFORE INSERT " & _
        "ON andygo " & _
        "WHEN exists (select 1 from andygo A where A.hex = new.hex) " & _
        "BEGIN " & _
        " update andygo set count = count + 1, lastdate = new.date where hex = new.hex;" & _
        " select raise(ignore);" & _
        "END;" _
        )

; Flat array of (date, hex) pairs: element 2k is the date, element 2k+1 the long string
Local $aIn = StringRegExp(FileRead("inexample1.txt"), "(?m)^(.{19})(?:.{22})([^ ]+)", 3)
Local $iRounds = 20 * Int(UBound($aIn) / 20)
Local $iRest = Mod(UBound($aIn), 20)

; Insert the leftover pairs one row at a time
For $i = 0 To $iRest - 1 Step 2
    _SQLite_Exec($hDB, "insert into andygo (date, hex) values ('" & $aIn[$i] & "','" & $aIn[$i + 1] & "')")
Next

; Insert the remaining pairs 20 array elements (10 rows) per statement
For $i = $iRest To UBound($aIn) - 1 Step 20
    _SQLite_Exec($hDB, "insert into andygo (date, hex) values " & _
            "('" & $aIn[$i     ] & "','" & $aIn[$i +  1] & "')," & _
            "('" & $aIn[$i +  2] & "','" & $aIn[$i +  3] & "')," & _
            "('" & $aIn[$i +  4] & "','" & $aIn[$i +  5] & "')," & _
            "('" & $aIn[$i +  6] & "','" & $aIn[$i +  7] & "')," & _
            "('" & $aIn[$i +  8] & "','" & $aIn[$i +  9] & "')," & _
            "('" & $aIn[$i + 10] & "','" & $aIn[$i + 11] & "')," & _
            "('" & $aIn[$i + 12] & "','" & $aIn[$i + 13] & "')," & _
            "('" & $aIn[$i + 14] & "','" & $aIn[$i + 15] & "')," & _
            "('" & $aIn[$i + 16] & "','" & $aIn[$i + 17] & "')," & _
            "('" & $aIn[$i + 18] & "','" & $aIn[$i + 19] & "')" _
            )
Next
_SQLite_Exec($hDB, "commit")

Local $aOut, $iRows, $iCols
; you may use the order by clause of your choice or remove the clause if you don't need any sort order
_SQLite_GetTable($hDB, "select Date || ' ' || substr(hex, 1, 10) || 'xxxxxx' || substr(hex, -2, 2) || ' '" & _
        " || count || ' ' || lastdate from andygo order by count desc, lastdate", $aOut, $iRows, $iCols)
_ArrayDelete($aOut, "0-1")
ConsoleWrite(_ArrayToString($aOut, @LF) & @LF)
_SQLite_Close($hDB)
_SQLite_Shutdown()

Here, we use a "before insert" trigger to check whether the hex part already exists. If it does, we simply increment the count in the existing row and drop the insertion; otherwise we insert the new row with the default count = 1. If the row already exists, we also store the new date in the lastdate column of the existing row.
We also use multiple VALUES per insert statement (here, 20 array elements, i.e. 10 rows) to speed up insertion. Wrapping the whole insertion block in a transaction also speeds things up.
Edited July 26, 2015 by jchd
LarsJ Posted July 26, 2015

No jchd, you have a little glitch. The two characters after "xxxxxx" are not the last two characters of the long string, but characters no. 17 and 18.

No matter what, an SQL solution is always interesting. And in this case pretty fast too. Too much overhead is often a problem with SQL and small amounts of data.

Controls, File Explorer, ROT objects, UI Automation, Windows Message Monitor
Compiled code: Accessing AutoIt variables, DotNet.au3 UDF, Using C# and VB code
Shell menus: The Context menu, The Favorites menu. Shell related: Control Panel, System Image Lists
Graphics related: Rubik's Cube, OpenGL without external libraries, Navigating in an image, Non-rectangular selections
ListView controls: Colors and fonts, Multi-line header, Multi-line items, Checkboxes and icons, Incremental search
ListView controls: Virtual ListViews, Editing cells, Data display functions
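To make the difference concrete: in SQLite, substr() with a negative start counts from the end of the string, so substr(hex, -2, 2) yields the last two characters, whereas substr(hex, 17, 2) yields characters 17 and 18. A small sketch using Python's built-in sqlite3 module; the helper name `mask_variants` and the 24-character sample string are invented for illustration.

```python
import sqlite3


def mask_variants(hex_string):
    """Return (mask ending in the last two chars, mask ending in chars 17-18)."""
    con = sqlite3.connect(":memory:")
    row = con.execute(
        "SELECT substr(:h, 1, 10) || 'xxxxxx' || substr(:h, -2, 2),"
        "       substr(:h, 1, 10) || 'xxxxxx' || substr(:h, 17, 2)",
        {"h": hex_string},
    ).fetchone()
    con.close()
    return row


sample = "0123456789ABCDEFGHIJKLMN"  # made-up 24-character string
print(mask_variants(sample))
# -> ('0123456789xxxxxxMN', '0123456789xxxxxxGH')
```

The two results only coincide when the string is exactly 18 characters long.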
jchd Posted July 26, 2015

Ha, sorry for misinterpreting. Then retrieval goes like this:

_SQLite_GetTable($hDB, "select Date || ' ' || substr(hex, 1, 10) || 'xxxxxx' || substr(hex, 17, 2) || ' '" & _
        " || count || ' ' || lastdate from andygo -- order by count desc, lastdate", $aOut, $iRows, $iCols)

SQLite triggers are a bit slow compared to some other engines (e.g. client/servers) and this shows here. It isn't named "lite" without reason. Yet the speed is still acceptable and I liked the idea of only inserting rows having unique hex strings. Another nice feature is that if the use case requires some odd formatting and/or sorting and/or selection within a given date range, then SQL is quite powerful and efficient.
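As an illustration of the date-range selection mentioned above, here is a hedged sketch. It assumes the 19-character date field is in a sortable form such as YYYY-MM-DD hh:mm:ss (the thread does not show the actual format), so that plain string comparison orders dates correctly. It uses Python's built-in sqlite3 module; `rows_in_range`, the simplified table without the trigger, and the sample data are illustrative only.

```python
import sqlite3


def rows_in_range(rows, first, last):
    """Format the rows whose first date lies in [first, last], most frequent first."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE andygo (Date CHAR, Hex CHAR, "
        "Count INTEGER DEFAULT 1, LastDate CHAR DEFAULT '')"
    )
    with con:
        con.executemany(
            "INSERT INTO andygo (Date, Hex, Count, LastDate) VALUES (?, ?, ?, ?)",
            rows,
        )
    # Assumption: dates compare correctly as strings (ISO-like layout).
    result = con.execute(
        "SELECT Date || ' ' || Hex || ' ' || Count FROM andygo "
        "WHERE Date BETWEEN ? AND ? "
        "ORDER BY Count DESC, Date",
        (first, last),
    ).fetchall()
    con.close()
    return [r[0] for r in result]


sample = [
    ("2015-07-24 10:00:00", "AA11", 2, "2015-07-25 11:00:00"),
    ("2015-07-25 09:30:00", "BB22", 5, "2015-07-26 08:00:00"),
    ("2015-07-27 12:00:00", "CC33", 1, ""),
]
for line in rows_in_range(sample, "2015-07-24 00:00:00", "2015-07-25 23:59:59"):
    print(line)
```

If the real date field uses another layout (for example DD.MM.YYYY), the BETWEEN comparison would need the dates converted to a sortable form first.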
andygo Posted July 26, 2015 Author

LarsJ, this is pretty cool. Tried it and it works, also with a 25,000-line file. Now I will try to use it with a GUI. Feedback follows in 2 days at the latest.
LarsJ Posted July 26, 2015

jchd, I like your idea too: only inserting rows having unique hex strings.

andygo, if you need even more speed because of large files, there is probably room for more optimization by using jchd's idea, especially if there is a large number of duplicates. The sorting code is just a standard binary sorting. All rows are sorted, including duplicates.
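One way to read that suggestion: collapse duplicates first, then sort only the unique hex strings. The sketch below is not LarsJ's code; it is a Python illustration in which a dictionary stands in for the lookup structure, and the name `collapse_then_sort` and the sample pairs are hypothetical.

```python
def collapse_then_sort(pairs):
    """Count duplicates per hex string, then sort only the unique entries."""
    unique = {}  # hex string -> [first date, count, last date]
    for date, hex_string in pairs:
        entry = unique.get(hex_string)
        if entry is None:
            unique[hex_string] = [date, 1, ""]
        else:
            entry[1] += 1    # one more occurrence
            entry[2] = date  # remember the most recent date
    # Sort by count (descending), then by last date, as in the SQL version.
    ordered = sorted(unique.items(), key=lambda item: (-item[1][1], item[1][2]))
    return [(first, h, count, last) for h, (first, count, last) in ordered]


sample = [("d1", "AA"), ("d2", "BB"), ("d3", "AA"), ("d4", "CC"), ("d5", "AA")]
for row in collapse_then_sort(sample):
    print(row)
```

With many duplicates, the sort then works on far fewer entries than the number of input lines, which is where the possible saving comes from.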
andygo Posted July 27, 2015 Author

Realistic input size in most cases is ~3,000 to 5,000 lines. Testing a 25,000-line file was a pal's idea, just to see the script's result and timing. But I like it, and even if a user won't notice the time difference between your solution and jchd's SQL solution, I will test it too. Feedback follows.
andygo Posted July 27, 2015 Author (edited)

Good evening, here are the results with a 25,000-line input:

LarsJ's code needs ~10 seconds.
jchd's SQL code needs ~4.5 seconds.

Special test with a 156,000-line file (34 MB input): solved in 21 seconds. Crazy fast.

Edited July 27, 2015 by andygo (additional information)