wraithdu Posted October 4, 2011 (edited)

I found I wanted to use this function today, and I wanted to check how it was written to see whether it could be optimized a bit (I was reasonably sure it could), because I knew I would be dealing with fairly large arrays (10K+). Most UDF functions are written for wide use, and narrowing the parameters can lead to performance gains.

Well, I really didn't like the way it was done. It created a temp array with _ArrayAdd (poor performance) and converted the input array to strings, losing the original data types on return. At first I rewrote it using arrays, but without the penalty of _ArrayAdd or losing the original data. I got a modest performance gain... actually a lot less than I was expecting. Then I had an idea and rewrote it using a Scripting.Dictionary object. Holy crap. It is ridiculously fast now.

I wanted to post here to get some opinions and a few more eyes on it before submitting it as a change to the library function. The only major difference (aside from retaining the original data types) is that it no longer returns the count in $arr[0]. I thought this was superfluous, and it requires additional processing if you use the array for something else after unique-ing (for example, I wanted to shuffle it). This is why we have UBound() anyway.

_ArrayUnique

; #FUNCTION# ====================================================================================================================
; Name...........: _ArrayUnique
; Description ...: Returns the Unique Elements of a 1-dimensional array.
; Syntax.........: _ArrayUnique($aArray[, $iDimension = 1[, $iIdx = 0[, $iCase = 0[, $iFlags = 1]]]])
; Parameters ....: $aArray - Input array (1D or 2D only)
;                  $iDimension - [optional] The dimension of the array to process (only valid for 2D arrays)
;                  $iIdx - [optional] Index at which to start scanning the input array
;                  $iCase - [optional] Flag to indicate if string comparisons should be case sensitive
;                  | 0 - case insensitive
;                  | 1 - case sensitive
;                  $iFlags - [optional] Set of flags, added together
;                  | 1 - Return the array count in element [0]
; Return values .: Success - Returns a 1-dimensional array containing only the unique elements of the input array / dimension
;                  Failure - Returns 0 and sets @error:
;                  | 1 - Input is not an array
;                  | 2 - Arrays greater than 2 dimensions are not supported
;                  | 3 - $iDimension is out of range
;                  | 4 - $iIdx is out of range
; Author ........: SmOke_N
; Modified.......: litlmike, Erik Pilsits
; Remarks .......:
; Related .......: _ArrayMax, _ArrayMin
; Link ..........:
; Example .......: Yes
; ===============================================================================================================================
Func __ArrayUnique(Const ByRef $aArray, $iDimension = 1, $iIdx = 0, $iCase = 0, $iFlags = 1)
    ; Check to see if it is valid array
    If Not IsArray($aArray) Then Return SetError(1, 0, 0)
    Local $iDims = UBound($aArray, 0)
    If $iDims > 2 Then Return SetError(2, 0, 0)
    ;
    ; checks the given dimension is valid
    If ($iDimension <= 0) Or (($iDims = 1) And ($iDimension > 1)) Or (($iDims = 2) And ($iDimension > UBound($aArray, 2))) Then Return SetError(3, 0, 0)
    ; make $iDimension an array index, note this is ignored for 1D arrays
    $iDimension -= 1
    ;
    ; check $iIdx
    If ($iIdx < 0) Or ($iIdx >= UBound($aArray)) Then Return SetError(4, 0, 0)
    ;
    ; create dictionary
    Local $oD = ObjCreate("Scripting.Dictionary")
    ; compare mode for strings
    ; 0 = binary, which is case sensitive
    ; 1 = text, which is case insensitive
    ; this expression forces either 1 or 0
    $oD.CompareMode = Number(Not $iCase)
    ;
    Local $vElem
    ; walk the input array
    For $i = $iIdx To UBound($aArray) - 1
        If $iDims = 1 Then
            ; 1D array
            $vElem = $aArray[$i]
        Else
            ; 2D array
            $vElem = $aArray[$i][$iDimension]
        EndIf
        ; add key to dictionary
        ; NOTE: accessing the value (.Item property) of a key that doesn't exist creates the key :)
        ; keys are guaranteed to be unique
        $oD.Item($vElem)
    Next
    ;
    ; return the array of unique keys
    If BitAND($iFlags, 1) = 1 Then
        Local $aTemp = $oD.Keys()
        _ArrayInsert($aTemp, 0, $oD.Count)
        Return $aTemp
    Else
        Return $oD.Keys()
    EndIf
EndFunc   ;==>__ArrayUnique

Example

#include <Array.au3>

; i had to cap this at 10000 because the original function is so slow
; my version can do 500000 in about 6.5 seconds
$z = 10000
Local $a[$z]
For $i = 0 To $z - 1
    $a[$i] = Random(0, Int($z / 2), 1)
Next
;
; don't forget this array has the count returned in $b[0]
; so it will have an extra member
$t = TimerInit()
$b = _ArrayUnique($a)
ConsoleWrite(TimerDiff($t) / 1000 & @CRLF)
_ArrayDisplay($b)
;
$t = TimerInit()
$b = __ArrayUnique($a)
ConsoleWrite(TimerDiff($t) / 1000 & @CRLF)
_ArrayDisplay($b)
;
; no counter in $b[0] here
$t = TimerInit()
$b = __ArrayUnique($a, 1, 0, 0, 0)
ConsoleWrite(TimerDiff($t) / 1000 & @CRLF)
_ArrayDisplay($b)

Edited October 6, 2011 by wraithdu
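The dictionary trick used in the function above can be tried in isolation. The following is only a minimal sketch, not part of the posted UDF: the sample array $aIn is made up for illustration, and it assumes Windows with the Scripting.Dictionary COM object available. It reads .Item for each element so that missing keys are created, then prints the type of every key that comes back.

```autoit
; Minimal sketch: reading .Item for a missing key creates that key.
Local $oD = ObjCreate("Scripting.Dictionary")
$oD.CompareMode = 0 ; binary compare, i.e. case sensitive; must be set while the dictionary is empty

; Made-up sample data: mixed case strings, a number and its string twin, one repeat.
Local $aIn[5] = ["a", "A", 1, "1", "a"]
For $i = 0 To UBound($aIn) - 1
    $oD.Item($aIn[$i]) ; creates the key if it does not exist yet
Next

; Keys() hands back an array; print each key with its AutoIt type.
Local $aKeys = $oD.Keys()
For $i = 0 To UBound($aKeys) - 1
    ConsoleWrite(VarGetType($aKeys[$i]) & ": " & $aKeys[$i] & @CRLF)
Next
```

With binary compare mode the repeated "a" should collapse into one key, while 1 and "1" stay separate, which is the type-preserving behaviour the post relies on.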
UEZ Posted October 4, 2011 (edited)

I had a similar idea last year; it uses code that is faster than the Scripting.Dictionary object method for 1D arrays (thanks to Yashied).

Br,
UEZ

Edited October 4, 2011 by UEZ
UEZ Posted October 4, 2011 (edited)

Here is a benchmark with your version and mine:

#include <Array.au3>

Global $aNames[10] = ["Antonia", "Anton", "Caesar", "Dora", "Emil", "Friedrich", "Gustav", "Heinrich", "Ida", "Julius"]
Global $a[1000000]
For $I = 0 To UBound($a) - 1
    $r = Random(0, 9, 1)
    $a[$I] = $aNames[$r]
Next

ConsoleWrite("Benchmark 1D array:" & @LF)
; no counter in $b[0] here
$t = TimerInit()
$b = ArrayUnique($a)
ConsoleWrite("UEZ: " & Round(TimerDiff($t) / 1000, 4) & " seconds" & @CRLF)
_ArraySort($b)
_ArrayDisplay($b)

; no counter in $b[0] here
$t = TimerInit()
$b = __ArrayUnique($a)
ConsoleWrite("wraithdu: " & Round(TimerDiff($t) / 1000, 4) & " seconds" & @CRLF)
_ArraySort($b)
_ArrayDisplay($b)

; the script stops here, so the 2D benchmark below does not run
Exit

ConsoleWrite(@LF & "Benchmark 2D array:" & @LF)
Global $aNames[10][2] = [["Antonia", ""], ["Anton", ""], ["Caesar", 300], ["Dora", 24], ["Emil", 33], ["Friedrich", 57], ["Gustav", 53], ["Heinrich", 34], ["Ida", 13], ["Julius", 77]]
Global $a[1000000][2]
For $I = 0 To UBound($a) - 1
    $r = Random(0, 9, 1)
    $a[$I][0] = $aNames[$r][0]
    $a[$I][1] = $aNames[$r][1]
Next

; no counter in $b[0] here
$t = TimerInit()
$b = ArrayUnique($a)
ConsoleWrite("UEZ: " & Round(TimerDiff($t) / 1000, 4) & " seconds" & @CRLF)
_ArraySort($b)
_ArrayDisplay($b)
;
; no counter in $b[0] here
$t = TimerInit()
$b = __ArrayUnique($a, 2)
ConsoleWrite("wraithdu: " & Round(TimerDiff($t) / 1000, 4) & " seconds" & @CRLF)
_ArraySort($b)
_ArrayDisplay($b)

; #FUNCTION# ====================================================================================================================
; Name...........: _ArrayUnique
; Description ...: Returns the Unique Elements of a 1-dimensional array.
; Syntax.........: _ArrayUnique($aArray[, $iDimension = 1[, $iIdx = 0[, $iCase = 0]]])
; Parameters ....: $aArray - Input array (1D or 2D only)
;                  $iDimension - [optional] The dimension of the array to process (only valid for 2D arrays)
;                  $iIdx - [optional] Index at which to start scanning the input array
;                  $iCase - [optional] Flag to indicate if string comparisons should be case sensitive
;                  | 0 - case insensitive
;                  | 1 - case sensitive
; Return values .: Success - Returns a 1-dimensional array containing only the unique elements of the input array / dimension
;                  Failure - Returns 0 and sets @error:
;                  | 1 - Input is not an array
;                  | 2 - Arrays greater than 2 dimensions are not supported
;                  | 3 - $iDimension is out of range
;                  | 4 - $iIdx is out of range
; Author ........: SmOke_N
; Modified.......: litlmike, Erik Pilsits
; Remarks .......:
; Related .......: _ArrayMax, _ArrayMin
; Link ..........:
; Example .......: Yes
; ===============================================================================================================================
Func __ArrayUnique($aArray, $iDimension = 1, $iIdx = 0, $iCase = 0)
    ; Check to see if it is valid array
    If Not IsArray($aArray) Then Return SetError(1, 0, 0)
    Local $iDims = UBound($aArray, 0)
    If $iDims > 2 Then Return SetError(2, 0, 0)
    ;
    ; checks the given dimension is valid
    If ($iDimension <= 0) Or (($iDims = 1) And ($iDimension > 1)) Or (($iDims = 2) And ($iDimension > UBound($aArray, 2))) Then Return SetError(3, 0, 0)
    ; make $iDimension an array index, note this is ignored for 1D arrays
    $iDimension -= 1
    ;
    ; check $iIdx
    If ($iIdx < 0) Or ($iIdx >= UBound($aArray)) Then Return SetError(4, 0, 0)
    ;
    ; create dictionary
    Local $oD = ObjCreate("Scripting.Dictionary")
    ; compare mode for strings
    ; 0 = binary, which is case sensitive
    ; 1 = text, which is case insensitive
    ; this expression forces either 1 or 0
    $oD.CompareMode = Number(Not $iCase)
    ;
    Local $vElem
    ; walk the input array
    For $i = $iIdx To UBound($aArray) - 1
        If $iDims = 1 Then
            ; 1D array
            $vElem = $aArray[$i]
        Else
            ; 2D array
            $vElem = $aArray[$i][$iDimension]
        EndIf
        ; add key to dictionary
        ; NOTE: accessing the value (.Item property) of a key that doesn't exist creates the key :)
        ; keys are guaranteed to be unique
        $oD.Item($vElem)
    Next
    ;
    ; return the array of unique keys
    Return $oD.Keys()
EndFunc   ;==>__ArrayUnique

; #FUNCTION# ============================================================================
; Name.............: ArrayUnique
; Description ...: Returns the Unique Elements of a 1-dimensional or 2-dimensional array.
; Syntax...........: ArrayUnique($aArray[, $iBase = 0[, $oBase = 0]])
; Parameters ...: $aArray - The Array to use
;                 $iBase - [optional] Is the input Array 0-base or 1-base index. 0-base by default
;                 $oBase - [optional] Is the output Array 0-base or 1-base index. 0-base by default
; Return values: Success - Returns a 1-dimensional or 2-dimensional array containing only the unique elements
;                Failure - Returns 0 and Sets @Error:
;                0 - No error.
;                1 - Returns 0 if parameter is not an array.
;                2 - Array has more than 2 dimensions
;                3 - Array is already unique
;                4 - when source array is selected as one base but UBound(array) - 1 <> array[0] / array[0][0]
;                5 - Scripting.Dictionary cannot be created for 2D array unique code
; Author .........: UEZ 2010 for 2D-array, Yashied for 1D-array (modified by UEZ)
; Version ........: 0.96 Build 2010-11-20 Beta
; =======================================================================================
Func ArrayUnique($aArray, $iBase = 0, $oBase = 0)
    If Not IsArray($aArray) Then Return SetError(1, 0, 0) ;not an array
    If UBound($aArray, 0) > 2 Then Return SetError(2, 0, 0) ;array is greater than a 2D array
    If UBound($aArray) = $iBase + 1 Then Return SetError(3, 0, $aArray) ;array is already unique because of only 1 element
    Local $dim = UBound($aArray, 2), $i
    If $dim Then ;2D array
        If $iBase And UBound($aArray) - 1 <> $aArray[0][0] Then Return SetError(4, 0, 0)
        Local $oD = ObjCreate('Scripting.Dictionary')
        If @error Then Return SetError(5, 0, 0)
        Local $i, $j, $k = $oBase, $l, $s, $aTmp, $flag, $sSep = Chr(01)
        Local $aUnique[UBound($aArray)][$dim]
        If Not $oBase Then $flag = 2
        For $i = $iBase To UBound($aArray) - 1
            ; join the whole row into one key string
            For $j = 0 To $dim - 1
                $s &= $aArray[$i][$j] & $sSep
            Next
            If Not $oD.Exists($s) And StringLen($s) > 3 Then
                $oD.Add($s, $i)
                $aTmp = StringSplit(StringTrimRight($s, 1), $sSep, 2)
                For $l = 0 To $dim - 1
                    $aUnique[$k][$l] = $aTmp[$l]
                Next
                $k += 1
            EndIf
            $s = ""
        Next
        $oD.RemoveAll
        $oD = ""
        If $k > 0 Then
            If $oBase Then $aUnique[0][0] = $k - 1
            ReDim $aUnique[$k][$dim]
        Else
            ReDim $aUnique[1][$dim]
        EndIf
    Else ;1D array
        If $iBase And UBound($aArray) - 1 <> $aArray[0] Then Return SetError(4, 0, 0)
        Local $sData = '', $sSep = ChrW(160), $flag
        For $i = $iBase To UBound($aArray) - 1
            ; a variable named after the element marks it as already seen
            If Not IsDeclared($aArray[$i] & '$') Then
                Assign($aArray[$i] & '$', 0, 1)
                $sData &= $aArray[$i] & $sSep
            EndIf
        Next
        If Not $oBase Then $flag = 2
        Local $aUnique = StringSplit(StringTrimRight($sData, 1), $sSep, $flag)
    EndIf
    Return SetError(0, 0, $aUnique)
EndFunc   ;==>ArrayUnique

1D array:
UEZ: 2.3242 seconds
wraithdu: 6.3211 seconds

What about a 2D array? As far as I can see, your version considers only the rows in a 2D array, or am I wrong?

Br,
UEZ

Edited October 4, 2011 by UEZ
guinness Posted October 4, 2011

Thanks wraithdu, I rarely use the Array UDF, opting for custom functions I have in my arsenal.

Benchmark 1D array:
UEZ: 3.2961 seconds
wraithdu: 16.2358 seconds
JScript Posted October 4, 2011

@wraithdu
I also make little use of the array UDFs, but thanks for sharing!

@UEZ
I had seen your UDF, very good too!

João Carlos.
AZJIO Posted October 4, 2011

UEZ

Try to use "["

Global $aNames[10] = ["Antonia", "Ant[on", "Cae[sar", "Dor[a", "Emil", "Frie[drich", "Gus[tav", "Heinrich", "Ida", "Julius"]
Global $a[1000]

#include <Array.au3>

Dim $arr1[5] = [1,2,3,4,2]
$a=_ArrayUnique2($arr1)
_ArrayDisplay($a, 'Array')

Dim $arr1[5] = [4,2,3,4,2]
$a=_ArrayUnique2($arr1, 1)
_ArrayDisplay($a, 'Array')

$a=_ArrayUnique2('er|df|er')
_ArrayDisplay($a, 'Array')

$a=_ArrayUnique2('er,df,er', ',')
_ArrayDisplay($a, 'Array')

; ===============================================================================================================================
; Description ...: Finds and removes duplicates in the data
; Syntax.........: _ArrayUnique2($data[, $flag=-1])
; Parameter 1 ...: $data - the data: an array or a delimited string
; Parameter 2 ...: $flag
;                  For an array, $flag is the array index at which to start the search
;                  For a string, $flag is the delimiter, "|" by default
; Returns .......: Success - array without duplicates
;                  Failure - 0 and @error=1
; Author ........: AZJIO
; Remarks .......: The data must not contain the "[" character; such items are excluded from the array even
;                  if they are not duplicates. Other special characters and letters do not cause errors.
; ===============================================================================================================================
Func _ArrayUnique2($data, $flag=-1)
    Local $k, $i, $tmp
    Assign('/', 1, 1) ; to exclude empty strings and avoid clashes with local variables
    If IsArray($data) Then
        If $flag=-1 Then $flag=0
        $tmp=UBound($data) -1
        If $flag>$tmp Then Return SetError(1, 0, 0)
        $k=0
        For $i = $flag To $tmp
            ; count occurrences in a variable named after the element
            Assign($data[$i]&'/', Eval($data[$i]&'/')+1, 1)
            If Eval($data[$i]&'/') = 1 Then
                $data[$k]=$data[$i]
                $k+=1
            EndIf
        Next
        If $k = 0 Then Return SetError(1, 0, 0)
        ReDim $data[$k]
        Return $data
    Else
        If $flag=-1 Then $flag='|'
        $data=StringSplit($data, $flag)
        If Not @error Then
            $k=0
            For $i = 1 To $data[0]
                Assign($data[$i]&'/', Eval($data[$i]&'/')+1, 1)
                If Eval($data[$i]&'/') = 1 Then
                    $data[$k]=$data[$i]
                    $k+=1
                EndIf
            Next
            If $k = 0 Then Return SetError(1, 0, 0)
            ReDim $data[$k]
            Return $data
        Else
            Return SetError(1, 0, 0)
        EndIf
    EndIf
EndFunc
wraithdu Posted October 4, 2011 (edited)

Quote (UEZ): "Here is a benchmark with your version and mine:"

So the problem I see with this, as with the current UDF version, is that the data type is now lost in the return array, since you are converting everything (implicitly) to a string. Now, I can't 100% defend my implementation either, since it only works with strings and numbers, not pointers, HWNDs, or binary types. However, your Assign method does give me an idea to modify my first try at a rewrite.

Quote (UEZ): "What about a 2D array? As far as I can see, your version considers only the rows in a 2D array, or am I wrong?"

I was simply copying the existing functionality for the 2D arrays. I don't care for it per se, but I didn't have a reason or suggestion to modify it to operate differently.

Edited October 4, 2011 by wraithdu
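The data type loss described in this post is easy to reproduce. The sketch below is an illustration only: the helper name _JoinSplitRoundTrip and the sample values are made up, and it mimics just the join-then-StringSplit step of the 1D code, without any duplicate removal.

```autoit
; Hypothetical helper: joins the elements into one string and splits them again,
; the same way the 1D branch of ArrayUnique builds its result.
Func _JoinSplitRoundTrip(Const ByRef $aIn)
    Local $sSep = ChrW(160), $sData = ""
    For $i = 0 To UBound($aIn) - 1
        $sData &= $aIn[$i] & $sSep ; every element is converted to a string here
    Next
    ; flag 2: do not return the count in element [0]
    Return StringSplit(StringTrimRight($sData, 1), $sSep, 2)
EndFunc   ;==>_JoinSplitRoundTrip

; Made-up sample data with three different types.
Local $aIn[3] = [1, 2.5, "x"]
Local $aOut = _JoinSplitRoundTrip($aIn)
For $i = 0 To UBound($aIn) - 1
    ConsoleWrite(VarGetType($aIn[$i]) & " -> " & VarGetType($aOut[$i]) & @CRLF)
Next
```

Every element comes back as a String regardless of what went in, which matters as soon as the caller compares or sorts the result numerically.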
wraithdu Posted October 4, 2011

Quote (UEZ): "I had a similar idea last year; it uses code that is faster than the Scripting.Dictionary object method for 1D arrays (thanks to Yashied)."

Is there a reason you're using Assign for 1D arrays and not 2D?
wraithdu Posted October 4, 2011 (edited)

Assign is interesting... and very fast. Here's an adaptation of my first idea, which keeps the original data types.

Func __ArrayUnique3($aArray, $iDimension = 1, $iIdx = 0, $iCase = 0)
    ; Check to see if it is valid array
    If Not IsArray($aArray) Then Return SetError(1, 0, 0)
    Local $iDims = UBound($aArray, 0)
    If $iDims > 2 Then Return SetError(2, 0, 0)
    ;
    ; checks the given dimension is valid
    If ($iDimension <= 0) Or (($iDims = 2) And ($iDimension > UBound($aArray, 2))) Then Return SetError(3, 0, 0)
    ; make $iDimension an array index, note this is ignored for 1D arrays
    $iDimension -= 1
    ; create return array and element counter
    Local $aReturn[UBound($aArray)], $iUnique = 0, $vElem
    ; walk the input array
    For $i = $iIdx To UBound($aArray) - 1
        If $iDims = 1 Then
            ; 1D array
            $vElem = $aArray[$i]
        ElseIf $iDims = 2 Then
            ; 2D array
            $vElem = $aArray[$i][$iDimension]
        EndIf
        ; check whether a marker variable exists for this value, add the value if not
        If Not IsDeclared($vElem & "$") Then
            Assign($vElem & "$", 0, 1)
            $aReturn[$iUnique] = $vElem
            $iUnique += 1
        EndIf
    Next
    ;
    ; redim the output array
    ReDim $aReturn[$iUnique]
    Return $aReturn
EndFunc   ;==>__ArrayUnique3

The question here is whether a string representation of any data type is guaranteed to be unique. The answer, unfortunately, is no. Both numeric 0 and string "0" are the same when compared as strings, thus the Assign call will not see the difference, whereas the scripting dictionary does make that distinction. Another wrinkle is case sensitivity. Assign / IsDeclared seem to be case insensitive.

This method is great though when your entire array is guaranteed to be the same data type (which, let's face it, should really always be the case) and you don't care about case sensitivity. It also works for some of the other AutoIt data types that have conversions to strings, such as Ptr, Hwnd, and Binary.

Edited October 4, 2011 by wraithdu
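Both wrinkles mentioned in this post can be shown without running a full unique pass. This is a small sketch under stated assumptions: the function name _DemoAssignPitfalls and the variable name demo_abc are invented for the demonstration, and the first check only compares the name strings that would be handed to Assign rather than calling Assign with them.

```autoit
; Hypothetical demo: returns a 2-element array of booleans,
; [0] = the names for 0 and "0" collide, [1] = IsDeclared ignores case.
Func _DemoAssignPitfalls()
    Local $aResult[2]
    ; Pitfall 1: numeric 0 and string "0" both yield the name string "0$".
    ; == is a case sensitive string comparison.
    $aResult[0] = ((0 & "$") == ("0" & "$"))
    ; Pitfall 2: variable names are case insensitive, so a marker created
    ; for "demo_abc" is also found when asking for "DEMO_ABC".
    Assign("demo_abc", 0, 1)
    $aResult[1] = (IsDeclared("DEMO_ABC") <> 0)
    Return $aResult
EndFunc   ;==>_DemoAssignPitfalls

Local $aPitfalls = _DemoAssignPitfalls()
ConsoleWrite("0 and ""0"" share a name: " & $aPitfalls[0] & @CRLF)
ConsoleWrite("Case is ignored: " & $aPitfalls[1] & @CRLF)
```

Either collision causes the second value to be treated as a duplicate of the first and dropped from the result.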
money Posted October 4, 2011 (edited)

Workaround:

Local $sVar
$sVar = Hex(StringToBinary($aArray[$i]))
If Not IsDeclared($sVar & '$') Then Assign($sVar & '$', 0, 1)

Func _ArrayUniqueFast(ByRef Const $aArray, ByRef $aUnique, $bCaseSensitive = True)
    ;author: Yashied taken from http://www.autoitscript.com/forum/topic/122192-arraysort-and-eliminate-duplicates/page__p__848191#entry848191
    ; fixes invalid characters/case sensitivity issues
    Local $sData = '', $sSep = ChrW(160), $sVar
    For $i = 0 To UBound($aArray) - 1
        If $bCaseSensitive Then
            ; the hex of the raw bytes already distinguishes "a" from "A"
            $sVar = Hex(StringToBinary($aArray[$i]))
        Else
            ; fold case first so "a" and "A" map to the same variable name
            $sVar = Hex(StringToBinary(StringLower($aArray[$i])))
        EndIf
        If Not IsDeclared($sVar) Then
            Assign($sVar, 0, 1)
            $sData &= $aArray[$i] & $sSep
        EndIf
    Next
    ; note: StringSplit returns the element count in $aUnique[0]
    $aUnique = StringSplit(StringTrimRight($sData, 1), $sSep)
EndFunc   ;==>_ArrayUniqueFast

Edit: Add case sensitivity flag

Edited October 4, 2011 by money
UEZ Posted October 4, 2011 (edited)

@wraithdu: you are right. I forgot to mention the limitations of my (Yashied's) version (it was a long time ago that I played with ArrayUnique())!

Br,
UEZ

Edited October 4, 2011 by UEZ
UEZ Posted October 4, 2011

Quote (AZJIO): Try to use "["
Global $aNames[10] = ["Antonia", "Ant[on", "Cae[sar", "Dor[a", "Emil", "Frie[drich", "Gus[tav", "Heinrich", "Ida", "Julius"]

Yes, it will not find any duplicates when I use the strings above. Thanks for the hint!

Br,
UEZ
UEZ Posted October 4, 2011

Quote (wraithdu): "Is there a reason you're using Assign for 1D arrays and not 2D?"

No. My first version also used the Scripting.Dictionary object method, and I extended it to a 2D array version. Later Yashied showed an even faster version, and I adopted it only for the 1D case. When I find some time I will add the Assign version for 2D arrays as well.

Br,
UEZ
Spiff59 Posted October 5, 2011 (edited)

Quote (wraithdu): "The question here is whether a string representation of any data type is guaranteed to be unique. The answer, unfortunately, is no. Both numeric 0 and string "0" are the same when compared as strings, thus the Assign call will not see the difference."

One could test the data type and tack a variable suffix onto the end of the Assign() variable. So a numeric 12 would assign a variable named "12$i", a string "12" would generate "12$s", h for handle, etc. Without testing, I can't say how much the extra processing would detract from the speed benefit of the Assign() method.

Edit: This discriminates between numeric and string, leaves the output unmodified, and seems to enjoy the speed of Yashied's Assign() method:

#include <Array.au3>

Global $array[13] = ["44", "Messerschmidt", 22, "Dornier", 33, "Heinkel", "Focke-Wulf", "Junkers", "22", "Arado", 33, "Henschel", "44"]
$array = __ArrayUnique4($array)
_ArrayDisplay($array)

Global $array[8][2] = [["Paul", 22],["Mike", 33],["Dave", 44],["Bill", 22],["Fred", 66],["Carl", 77],["Luke", 33],["John", 22]]
$array = __ArrayUnique4($array, 2)
_ArrayDisplay($array)

Func __ArrayUnique4($aArray, $iTargetDim = 1, $iBase = 0, $iCase = 0)
    If Not IsArray($aArray) Then Return SetError(1, 0, 0)
    Local $iDims = UBound($aArray, 0)
    If $iDims > 2 Then Return SetError(2, 0, 0)
    If ($iTargetDim < 1) Or ($iTargetDim > $iDims) Then Return SetError(3, 0, 0)
    Local $iDim1 = UBound($aArray, 1), $iUnique = 0, $vElem
    If $iDims = 2 Then
        Local $iDim2 = UBound($aArray, 2), $aReturn[$iDim1][$iDim2], $j
        $iTargetDim -= 1
        For $i = $iBase To $iDim1 - 1
            ; variable name = value, "$", then 1 for numbers or 0 for everything else
            $vElem = $aArray[$i][$iTargetDim] & "$" & IsNumber($aArray[$i][$iTargetDim])
            If Not IsDeclared($vElem) Then
                Assign($vElem, 0, 1)
                ; copy the whole row of the first occurrence
                For $j = 0 To $iDim2 - 1
                    $aReturn[$iUnique][$j] = $aArray[$i][$j]
                Next
                $iUnique += 1
            EndIf
        Next
        ReDim $aReturn[$iUnique][$iDim2]
    Else
        Local $aReturn[$iDim1]
        For $i = $iBase To $iDim1 - 1
            $vElem = $aArray[$i] & "$" & IsNumber($aArray[$i])
            If Not IsDeclared($vElem) Then
                Assign($vElem, 0, 1)
                $aReturn[$iUnique] = $aArray[$i]
                $iUnique += 1
            EndIf
        Next
        ReDim $aReturn[$iUnique]
    EndIf
    Return $aReturn
EndFunc   ;==>__ArrayUnique4

It could probably still stand something like money's mod to handle special chars and case.

Edited October 5, 2011 by Spiff59
Spiff59 Posted October 5, 2011 (edited)

The Assign/IsDeclared method does continue to have a bit of a foul smell to me, as I'm sure it's something any instructor or manager I've ever had would shoot down as non-standard trickery that has no place in production.

To comment directly on the first post: your compound edit for error condition 3 seems odd... "And $iDimension > UBound($aArray, 2)"? Wouldn't "If ($iDimension < 1) Or ($iDimension > $iDims) Then Return SetError(3, 0, 0)" be sufficient?

Is no Scripting.Dictionary/object clean-up necessary? Like the 2 lines in the examples in UEZ's thread from last year?

$oD.RemoveAll
$oD = ""

And I think whether you include the array count in the first element or not makes little difference to anyone, except that changing it to not do so makes your version a script-breaker, which I would think you would want to avoid.

Edit: remove garbage inside the codetags...

Edited October 5, 2011 by Spiff59
wraithdu Posted October 5, 2011 Author Share Posted October 5, 2011 (edited) To comment directly on the first post. Your compound edit for error condition 3 seems odd... "And $iDimension > UBound($aArray, 2)"? Wouldn't "If ($iDimension < 1) Or ($iDimension > $iDims) Then Return SetError(3, 0, 0)" be sufficient?$iDimension is I guess a poor choice of variable name (again, copied from the original). It really refers to the index in the 2nd dimension of a 2D array, from which you want to pull data. So while $iDims (the number of array dimensions, 1D, 2D, 3D...) cannot be greater than 2, $iDimension can be anything <= the number of 'columns' in the 2nd dimension. Is no Scripting.Dictionary/object clean-up necessary? Like in the 2 lines in the examples in UEZ's thread from last year?Nope. Objects are automatically cleaned up when they go out of scope. And, I think whether you include the array count in the first element or not makes little difference to anyone, except that in changing it to not do so makes your version a script-breaker, which I would think you would want to avoid.While I would usually agree, I really hate this counter return. I wrote this update originally because I wanted to unique the array then shuffle it. Having to get rid of the counter before shuffling is just annoying. Plus, if I pass an array to a function that returns a modified version of my array, I expect no extraneous data. StringSplit is a little different (and I'm glad you have the option of the counter return) since you submit a string and get an array back. If this goes so far as to be submitted, we can revisit this as an option I suppose... but there would be an expensive ReDim somewhere to get rid of or add that [0] array element. Maybe to support backwards compatibility have the counter returned as the default option, but in the code it requires a call to _ArrayInsert (so sensible people like me aren't penalized ). 

Edited October 5, 2011 by wraithdu
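
To make the $iDims versus $iDimension distinction above concrete, here is a minimal sketch of the same check pulled out into a helper. The name _IsValidColumn is hypothetical and does not appear in the posted function; the logic is only meant to mirror the error-3 condition as wraithdu describes it.

```autoit
; Hypothetical helper (not part of the posted function): returns True when
; $iDimension is a valid 1-based column selector for $aArray.
Func _IsValidColumn(Const ByRef $aArray, $iDimension)
    Local $iDims = UBound($aArray, 0) ; number of dimensions (1, 2, ...)
    ; a 1D array has only one "column" to choose from
    If $iDims = 1 Then Return ($iDimension = 1)
    ; a 2D array accepts any column from 1 up to the size of the 2nd dimension
    If $iDims = 2 Then Return ($iDimension >= 1) And ($iDimension <= UBound($aArray, 2))
    ; anything else is unsupported
    Return False
EndFunc   ;==>_IsValidColumn

Local $a2D[3][5] ; 2 dimensions, 5 columns
; column 4 is a valid selector here even though 4 > $iDims (which is 2)
ConsoleWrite(_IsValidColumn($a2D, 4) & @CRLF)
```

Comparing $iDimension against $iDims alone would reject column 4 in this example, which is why the original condition tests against UBound($aArray, 2) instead.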

Spiff59 Posted October 5, 2011 (edited)

> $iDimension is...

Oops, got my own wires crossed on the parm edit.

Looks like you'd have to revert to UEZ's use of the Exists method in order to keep a counter and return the array count in element 0:

    Func __ArrayUnique($aArray, $iDimension = 1, $iIdx = 0, $iCase = 0)
        ; Check to see if it is valid array
        If Not IsArray($aArray) Then Return SetError(1, 0, 0)
        Local $iDims = UBound($aArray, 0)
        If $iDims > 2 Then Return SetError(2, 0, 0)
        ;
        ; checks the given dimension is valid
        If ($iDimension <= 0) Or (($iDims = 1) And ($iDimension > 1)) Or (($iDims = 2) And ($iDimension > UBound($aArray, 2))) Then Return SetError(3, 0, 0)
        ; make $iDimension an array index, note this is ignored for 1D arrays
        $iDimension -= 1
        ;
        ; check $iIdx
        If ($iIdx < 0) Or ($iIdx >= UBound($aArray)) Then Return SetError(4, 0, 0)
        ;
        ; create dictionary
        Local $oD = ObjCreate("Scripting.Dictionary")
        ; compare mode for strings
        ; 0 = binary, which is case sensitive
        ; 1 = text, which is case insensitive
        ; this expression forces either 1 or 0
        $oD.CompareMode = Number(Not $iCase)
        ;
        Local $vElem, $iUnique
        $oD.Item(Chr(0))
        ; walk the input array
        For $i = $iIdx To UBound($aArray) - 1
            If $iDims = 1 Then
                ; 1D array
                $vElem = $aArray[$i]
            Else
                ; 2D array
                $vElem = $aArray[$i][$iDimension]
            EndIf
            If Not $oD.Exists($vElem) Then
                $oD.Item($vElem)
                $iUnique += 1
            EndIf
        Next
        $oD.Key(Chr(0)) = $iUnique
        ;
        ; return the array of unique keys
        Return $oD.Keys()
    EndFunc   ;==>__ArrayUnique

Edit: I do have to say that Assign() trick IS blazingly fast and it looks like the kinks could be worked out of it (post #14 does handle data types). But would the Dev's consider it misuse of the language...

And I still think that edit looks fishy lol

Edited October 5, 2011 by Spiff59

wraithdu Posted October 5, 2011 Author (edited)

> Looks like you'd have to revert to UEZ's use of the Exists method in order to keep a counter and return the array count in element 0:

Nope, just assign the return array from the dictionary ($oD.Keys) to a temp variable, do a UBound() to get the size, insert it as element [0], and return the temp var.

I agree about the Assign method... it is fast but definitely an abuse of the language. The dictionary method is not all that much slower considering the prodigious array size you have to use to see it (2s vs 6s @ 1mil elements in UEZ's test). The only shortcoming to the dictionary method I see is that it only works for strings and numbers. There are surely some ugly workarounds for Ptr/Hwnd and Binary types involving VarGetType() and Number() / String(), but there's a performance penalty there as well and I don't like the potential data mangling.

Edit: What's this all about?

    $oD.Key(Chr(0)) = $iUnique

Can that really guarantee placement at the top of the return array?

Edited October 5, 2011 by wraithdu

Spiff59 Posted October 5, 2011

> Can that really guarantee placement at the top of the return array?

Well, it certainly appears to be FIFO, so I think it ought to work fine. Whether a single nul character is unique enough to avoid conflicts with actual data passed in an array could be debated.

wraithdu Posted October 6, 2011 Author

That would be LIFO, actually. But regardless, it's irrelevant. After looking at it again, you're storing the count as a value, and my function returns .Keys(). There's obviously no way to guarantee that the count is a unique key, so we're left with storing the array as a temp var and inserting the count before return.
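
A minimal sketch of that last approach, for anyone who wants the count in element [0]: take the keys into a temp array, measure it with UBound(), and build the returned array with the count in front. The helper name _KeysWithCount is hypothetical, and the sketch uses a plain copy loop rather than the _ArrayInsert call mentioned earlier in the thread so that it does not depend on Array.au3.

```autoit
; Sketch only: _KeysWithCount is a hypothetical helper name, not from the thread.
Func _KeysWithCount($oD)
    Local $aKeys = $oD.Keys() ; temp copy of the unique keys
    Local $iCount = UBound($aKeys)
    Local $aRet[$iCount + 1]
    $aRet[0] = $iCount ; the count goes in element [0]
    ; copy the keys after the count, preserving their order and data types
    For $i = 0 To $iCount - 1
        $aRet[$i + 1] = $aKeys[$i]
    Next
    Return $aRet
EndFunc   ;==>_KeysWithCount

Local $oD = ObjCreate("Scripting.Dictionary")
$oD.Item("a")
$oD.Item("b")
$oD.Item("a") ; duplicate key, not added again
Local $aResult = _KeysWithCount($oD)
ConsoleWrite($aResult[0] & @CRLF) ; number of unique keys
```

Because the count never passes through the dictionary, it cannot collide with a real key, at the cost of one extra array copy when the counter is requested.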