AndyG, posted November 12, 2023:

@Andreik SSE/SIMD version?

Andreik, posted November 12, 2023:

@AndyG I was thinking about it, but we don't have to move the moon for such a simple task. After all, 1 ms is already a good speed.

AndyG, posted November 12, 2023:

On 11/10/2023 at 6:58 PM, RAMzor said: "Detect tolerance between two files and do this many times in a loop. (Actually, compare a golden pattern file to 300 others in a folder.)"

@Andreik, yes, x-thousand times faster than "native" AutoIt code is not that bad! But, independent of the language/compiler you use, there are many more (possible) improvements.

Depending on the size of the files to be compared and the available RAM, you could load the "golden pattern" file (the file to compare with) and then, in a loop, 15 files to compare. Why 15? There are 16 SSE/AVX registers available (YMM0-YMM15 in x64 mode), each, as you know, 32 bytes wide. So you have to load the "golden pattern" only once, and then you are able to compare 32 bytes of the "source" to 32 bytes in each of the 15 loaded files.

The loop count is shortened by a factor of 32, and the number of passes over the golden pattern by a factor of 15 (that is, by the number of loaded files). Loop unrolling is included too.

The result (let's forget the overhead, so it is easier to calculate in my old brain) is ~30 * ~10 ≈ 300 times faster compared to your 27-byte-long (I love it^^) code! Not to mention all the file-cache and processor-cache "goodies" you would benefit from.

I'm sure there are many more, and faster, ways to accomplish such a "simple" task.
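
The batching idea in the post above (read the golden pattern once and compare it against several loaded files in the same pass) can be sketched in portable C. This is only an illustration under assumptions: the function name `max_abs_diff_batch` and its parameters are hypothetical and not from the thread, "tolerance" is taken to mean the largest absolute byte difference, and the loop is scalar rather than SSE/AVX.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not from the thread: scans the golden buffer once
   and records, per candidate, the largest absolute byte difference seen.
   All buffers are assumed to be `length` bytes long. */
void max_abs_diff_batch(const uint8_t *golden,
                        const uint8_t *const *candidates,
                        size_t candidate_count,
                        size_t length,
                        uint8_t *max_out)
{
    for (size_t k = 0; k < candidate_count; k++) {
        max_out[k] = 0;
    }
    for (size_t i = 0; i < length; i++) {
        const uint8_t g = golden[i];  /* golden byte is read once per position */
        for (size_t k = 0; k < candidate_count; k++) {
            const uint8_t c = candidates[k][i];
            const uint8_t d = (uint8_t)(g > c ? g - c : c - g);
            if (d > max_out[k]) {
                max_out[k] = d;       /* running maximum for candidate k */
            }
        }
    }
}
```

A real AVX2 version would presumably do the same work on 32 bytes per step, for example by combining saturating subtractions (`_mm256_subs_epu8` in both directions, OR-ed together) with `_mm256_max_epu8`; that variant is not shown here and has not been measured.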

argumentum, posted November 12, 2023:

...my 486DX has a math coprocessor. I wonder what CPU to emulate in VirtualBox. 🤪

It'd be good to have an "ASM in AutoIt" tutorial post. It'd be a fun way to learn ASM, I think.

Andreik, posted November 12, 2023:

@AndyG there is no doubt it would be way faster with SSE/AVX, but my point is that it wouldn't make much of a difference overall: it takes about 5 seconds for AutoIt to create a data structure and copy in the content of a file of around 500 MB, and less than a second for the actual code to deliver the result.

@argumentum there were times when coprocessors were a physically separate chip on the board, but now most CPUs can do their job with x87 FPU instructions.

RAMzor (Author), posted November 12, 2023:

@RTFC After adding _Eigen_ResetLogicalBit0only() the example works as expected! Thanks again for the help and for sharing such a wonderful UDF!

@Andreik Thanks, it works perfectly now! Minimalistic, speedy code that does exactly what was required!

Andreik, posted November 12, 2023:

For larger files it would be faster to use WinAPI functions to create and fill the data structures that hold the content of the files.

argumentum, posted November 12, 2023:

1 hour ago, Andreik said: "when coprocessors were a physical different chip on the board but now most CPUs can do their job with x87 FPU instructions."

...I should have said built in, not like the SX (if memory serves). But my points were that if it is fast enough with older CPU instructions, all the better, and that I'd like to be able to do what you programmers do in ASM. A tutorial of sorts would be welcome (by me, anyway) to gain some understanding of how it is that ASM can run in AutoIt.

Andreik, posted November 12, 2023 (edited December 7, 2023 by Andreik):

@RAMzor Here is an example of what I said above.

    #AutoIt3Wrapper_UseX64=y

    #include <Memory.au3>
    #include <WinAPIFiles.au3>
    #include <WinAPIHObj.au3>

    $iMaxDiff = CompareData('TestPtrn_1.raw', 'TestPtrn_2_DEh.raw')
    ConsoleWrite("MAX DIFF: " & $iMaxDiff & " digital level" & @CRLF)

    Func CompareData($sFile1, $sFile2)
        Local $iTimer1 = TimerInit(), $iTimer2

        ; Declaring local variables
        Local $aCall, $dCode, $iCode, $sCode, $pCode, $tCode, $iRead
        Local $hFile1, $hFile2, $iSize1, $iSize2, $hMemory1, $hMemory2, $pMemory1, $pMemory2

        ; Read the files
        $hFile1 = _WinAPI_CreateFile($sFile1, 2, 2)
        If @error Then Return SetError(1, 0, Null)
        $hFile2 = _WinAPI_CreateFile($sFile2, 2, 2)
        If @error Then
            _WinAPI_CloseHandle($hFile1)
            Return SetError(2, 0, Null)
        EndIf
        $iSize1 = _WinAPI_GetFileSizeEx($hFile1)
        $iSize2 = _WinAPI_GetFileSizeEx($hFile2)
        If $iSize1 <> $iSize2 Then
            _WinAPI_CloseHandle($hFile1)
            _WinAPI_CloseHandle($hFile2)
            Return SetError(3, 0, Null)
        EndIf
        $hMemory1 = _MemGlobalAlloc($iSize1, $GMEM_MOVEABLE)
        $hMemory2 = _MemGlobalAlloc($iSize2, $GMEM_MOVEABLE)
        $pMemory1 = _MemGlobalLock($hMemory1)
        $pMemory2 = _MemGlobalLock($hMemory2)
        _WinAPI_ReadFile($hFile1, $pMemory1, $iSize1, $iRead)
        _WinAPI_ReadFile($hFile2, $pMemory2, $iSize2, $iRead)
        _WinAPI_CloseHandle($hFile1)
        _WinAPI_CloseHandle($hFile2)

        ; Prepare the code
        If @AutoItX64 Then
            $sCode = '0x31C08A440AFF412A4408FF7702F6D838E0720288C4E2EB0FB6C4C3'
        Else
            $sCode = '0x8B4C24048B7424088B7C240C31C08A440EFF2A440FFF7702F6D838E0720288C4E2EC0FB6C4C20C00'
        EndIf
        $dCode = Binary($sCode)
        $iCode = BinaryLen($dCode)
        $pCode = _MemVirtualAlloc(0, $iCode, $MEM_COMMIT, $PAGE_EXECUTE_READWRITE)
        $tCode = DllStructCreate('byte Code[' & $iCode & ']', $pCode)
        $tCode.Code = $dCode

        $iTimer2 = TimerInit()
        $aCall = DllCallAddress( _
                'int', $pCode, _        ; Pointer to compare function
                'int', $iSize1, _       ; Number of bytes
                'ptr', $pMemory1, _     ; Pointer to first structure
                'ptr', $pMemory2 _      ; Pointer to second structure
                )
        ConsoleWrite('Comparing function run time: ' & TimerDiff($iTimer2) & ' ms' & @CRLF)

        ; Cleanup
        _MemVirtualFree($pCode, $iCode, $MEM_DECOMMIT)
        _MemGlobalUnlock($hMemory1)
        _MemGlobalUnlock($hMemory2)
        _MemGlobalFree($hMemory1)
        _MemGlobalFree($hMemory2)

        ConsoleWrite('Entire function run time: ' & TimerDiff($iTimer1) & ' ms' & @CRLF)
        Return $aCall[0]
    EndFunc

The compare function gets through two files of around 500 MB each (~1 GB in total) per second, and the final function comes with some checks.

@argumentum we can have such a thread, but there are many things to cover, so it would be helpful to be more specific about what you are looking for.
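
For readers who do not read opcodes: the x64 byte string in the post above appears to decode to a short loop that walks both buffers from the last byte to the first, takes the absolute difference of each byte pair, and keeps the largest value in AH before returning it. A minimal C sketch of that computation follows, assuming this reading of the bytes is right; the name `max_abs_diff` is hypothetical and this is not the author's source.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference version of the posted routine:
   returns the largest absolute difference between corresponding bytes. */
uint8_t max_abs_diff(const uint8_t *a, const uint8_t *b, size_t n)
{
    uint8_t max = 0;
    for (size_t i = 0; i < n; i++) {
        /* unsigned absolute difference, always in the range 0..255 */
        const uint8_t d = (uint8_t)(a[i] > b[i] ? a[i] - b[i] : b[i] - a[i]);
        if (d > max) {
            max = d;
        }
    }
    return max;
}
```

One difference worth noting: this sketch returns 0 for an empty buffer, whereas the machine code seems to use the byte count directly as its loop counter, so it probably should not be called with a size of zero.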