
jvanegmond

MVPs
  • Posts

    10,648
  • Joined

  • Last visited

  • Days Won

    13

jvanegmond last won the day on October 9 2015

jvanegmond had the most liked content!

4 Followers

About jvanegmond

Profile Information

  • WWW
    http://www.josvanegmond.nl


jvanegmond's Achievements

  1. Your gist understanding is correct, except when the array is too large to fit in cache: if it spans, for example, one contiguous block of 100MB, the array will be read into the CPU cache one block at a time, based on the subscripts you are accessing. This is why sequentially accessing an array is faster than randomly accessing it: the data for the next element will already be in cache. The test which spudw2k did unfortunately has some problems which make it difficult to interpret any meaningful result from it. The array is reinitialized and then accessed immediately afterwards, which can leave the array in the cache, though this depends on CPU make and model. Random numbers are generated in a non-deterministic way, which can affect the outcome of each new test. Ten samples are not nearly enough to eliminate random noise from other programs and the operating system. TimerInit/TimerDiff should probably be replaced in all such performance measurement scenarios with a straight call to the QueryPerformanceCounter function (a sketch follows below), but then DllCall is somewhat slow too, so it's a lesser-of-evils decision. There are more problems, and none of them are easy to properly compensate for. AutoIt is, in this sense, not designed with such small performance considerations in mind. It's better to focus on things we know will have an effect on performance. Having the data already in cache is referred to as locality of reference, and it can help in a big way in your initial example. It shouldn't really matter whether you copied the value into a temporary variable or are accessing it directly; what matters is whether it is in cache or not.
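A minimal sketch of timing via QueryPerformanceCounter directly through DllCall, as suggested above. The two kernel32 functions are real Windows APIs; the harness around them is illustrative only, and the DllCall overhead itself still pollutes the measurement:

    ; Counts per second of the high-resolution timer.
    Local $aFreq = DllCall("kernel32.dll", "bool", "QueryPerformanceFrequency", "int64*", 0)
    Local $aT0 = DllCall("kernel32.dll", "bool", "QueryPerformanceCounter", "int64*", 0)
    ; ... code under test goes here ...
    Local $aT1 = DllCall("kernel32.dll", "bool", "QueryPerformanceCounter", "int64*", 0)
    ; DllCall returns an array; element [1] is the value written to the out-parameter.
    ConsoleWrite("Elapsed: " & ($aT1[1] - $aT0[1]) / $aFreq[1] * 1000 & " ms" & @CRLF)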
  2. Triblade, what an interesting question. Under the hood, any array is just an address of where to start finding the values in memory. Say you're storing 32-bit integers in an array; these will be stored in a contiguous block of memory. If we were to look at that memory, it would read: base address (start), then 4 bytes of integer, 4 bytes of integer, 4 bytes of integer, and so on. If the base address of this array is 50, array index 2 is going to be at 50 + (2 x 4), where 4 is the size in bytes of the integer. In statically typed languages, the number of bytes for your value is determined at compile time, so this can be 'baked' into the program to avoid any calculations. In AutoIt, a dynamically typed language, there exists something called a variant (based on the 2005 source, but probably unchanged) which can store any type of data: int, long, double, window handle, and other small data types easily fit in the variant structure. These are stored sequentially for optimization reasons, with a minor overhead. Large values such as strings are stored in another location and the variant simply holds the address of that string memory, so these types of data are the exception and should be considered carefully when thinking about performance.
Multidimensional arrays get turned into flat arrays, where the row index is multiplied by the width of a row. So in an array with, say, 10 columns per row, $closed[3][4] is just the $closed base address + ((3 * 10 + 4) * 4), again taking 4 bytes per integer. You can consider $closed[3][4] to be another syntax for $closed[3 * 10 + 4]; this is really not a special case for the underlying language. (A small sketch of this index arithmetic follows after this post.)
The CPU is incredibly fast. It always copies data from RAM to cache in units called blocks, or cache lines; these are typically 64 bytes copied from RAM at once, which takes about 120+ CPU cycles to get into cache. Modern CPUs have 3 levels of cache, called L1, L2 and L3. Depending on the manufacturer, these caches have different sizes, and this is a large contributor to what makes some modern CPUs feel fast and others slow, apart from just raw clock speed. Accessing arrays requires the CPU to compute the flat index and then get the whole block of memory into cache. Computing the index for the array in your script is trivial; it takes only 1 or 2 CPU cycles. Actually retrieving the memory from RAM takes 120+ cycles. Once this memory is in cache, it will take fewer than 20 CPU cycles to retrieve it again.
How do we know if memory is still in cache? This depends on a lot of factors, but the main factor here is your operating system. The operating system slices the total CPU time up into what are called time slices. Each thread on the system gets a time slice according to some prioritization mechanism. You may know this from opening Task Manager, seeing the total number of threads on the system, and changing the priority of a process to High or Realtime. Do not be alarmed at the 2000+ threads running on your system: the vast majority of them are asleep, and the scheduler simply skips over them. Our program will be executing within one of these time slices. This means that at the start of the time slice, our CPU will get RAM into cache and will execute our program as quickly as possible. The realistic expected time frame for accesses to the same point in memory is around 120 cycles for the first access and 20 cycles afterwards.
As long as we write our code in tight loops, where we are not voluntarily relinquishing CPU time back to the operating system while waiting for something, our data should still be in cache. Realistically, whether data is still in cache is determined by a cache eviction policy/algorithm, and there is really nothing you can do about this other than not submit to thread switches, and perhaps write a nice letter to Intel/AMD in which you offer a couple of beers and to sacrifice a goat. Each vendor has its own cache eviction policy; I believe these are not standardized in x86 or x86-64 and therefore make for a source of competition between these companies. (Citation needed.)
Strings and large memory structures are special. In their case, the variant does not hold the data directly but holds a reference to where the data is really stored. This means the CPU must get the array from memory, which points to other blocks of memory which must also be fetched from RAM. That means not just one of those terrible 120+ cycle accesses, but several, just to access one item. There is really no good solution for this with arrays; other data types, which allow elements of non-uniform length, might be better suited for the application. This is more or less the same in statically typed languages.
THIS COVERS ARRAY ACCESS. Now keep in mind that code is just another form of data: the code must also be read into CPU cache before it can be executed. If the AutoIt language makes bad choices regarding caches, this will severely impact performance and may invalidate some or all of the above. For example, code pointing to other code, to other code, which must all be fetched into cache in sequence, is going to make a practical program, even one which follows all the correct performance guidelines, very, very slow. This is why the community prefers to do benchmarks, but there's nothing wrong with a little theory from time to time. I believe AutoIt does this mostly OK in practice; the AutoIt source from 2005 seems to confirm it, and that's really the best I can do on this part. Disclaimer: I've had to dumb this down, and it only somewhat reflects the intricacies of modern computing.
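A minimal sketch of the row-major index arithmetic described above, with hypothetical dimensions; it only mirrors the math, not AutoIt's actual internals:

    Local $iCols = 10        ; assumed number of columns in a hypothetical $closed[10][10]
    Local $iRow = 3, $iCol = 4
    Local $iElemSize = 4     ; assumed size of one element in bytes
    Local $iFlatIndex = $iRow * $iCols + $iCol      ; $closed[3][4] -> flat index 34
    Local $iByteOffset = $iFlatIndex * $iElemSize   ; offset from the base address
    ConsoleWrite("Flat index: " & $iFlatIndex & ", byte offset: " & $iByteOffset & @CRLF)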
  3. If you need to pick the absolutely most performant technology for the job, the answer is neither. Weigh the cost of the additional development time against the gains of a few seconds of improved runtime of the program, and draw your conclusion from that. As a rough estimate: they're both dynamically typed, run-time interpreted languages, so you're going to be measuring differences within the same order of magnitude. Why not PowerShell?
  4. I feel like that should be part of the library, but it currently is not. You should be able to create a new log file based on size by running something like this occasionally (FileGetSize is the actual AutoIt function; the log close/open calls are placeholders for whatever the logging library provides):

    If FileGetSize($sLogFile) > $iSomeLimit Then
        LogClose()                 ; placeholder: close the current log file
        ; Zip, FileMove -or- FileDelete the old file here
        LogFile($sLogFile)         ; placeholder: start a fresh log file
    EndIf

You might not want to immediately delete your last log, so you can make a simple log rotation like this:

    FileDelete("log3.log")
    FileMove("log2.log", "log3.log")
    FileMove("log1.log", "log2.log")
    FileMove("log.log", "log1.log")

Or go a step further and put the date in the file name (see the sketch below).
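A minimal sketch of the date-in-the-file-name variant, using AutoIt's @YEAR/@MON/@MDAY macros; the file name and the 1 MB limit are arbitrary examples:

    ; Rename the current log to e.g. log_20151009.log once it grows past the limit.
    If FileGetSize("log.log") > 1024 * 1024 Then
        FileMove("log.log", "log_" & @YEAR & @MON & @MDAY & ".log")
    EndIf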
  5. I can't stomach going through three pages of "Can not"s and "Must not"s, so I will skip that bit. @OP, the AutoIt developers are not mean and evil people who will never release source code. If you can build a working compiler for AutoIt3 with some example functions working (for example MsgBox, ConsoleWrite, Call, Eval, etc.) and good unit tests (run .au3 scripts and check the output) for all the weird syntax, that is 90% of the work required to compile AutoIt3. The real work is not getting the C++ source code of the AutoIt functions, but getting AutoIt3 scripts compiled. Not to send you home empty-handed: building compilers is hard. If you are going to build a compiler, don't start with AutoIt. First get good at writing compilers; only then pick one of the weirdest dynamic languages in the world to write a compiler for. PHP is very much like AutoIt in its dynamic capabilities. Facebook took a subset of PHP and built a compiler for it (HipHop). It took hundreds of man-years of work, and ultimately the project was superseded by a VM (as in, an abstraction of a microprocessor).
  6. Maybe nitpicking, but INC is not atomic on x86 (across cores it needs a LOCK prefix; a sketch of an atomic increment follows below), and you missed bad branch prediction as a leading cause of performance woes, which is arguably the most important one next to caching behavior. @JohnOne you may enjoy http://www.nand2tetris.org/ . I can highly recommend it. Skip the stuff you already know and get programming. You can even do it in AutoIt if you are uncomfortable following the instructions as suggested.
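A minimal sketch of an atomic increment from AutoIt via the real kernel32 export InterlockedIncrement, just to illustrate the contrast with a plain (non-atomic) increment; it is illustrative only, not a pattern you normally need in single-threaded AutoIt:

    ; In native code a plain increment is a read-modify-write that another core
    ; can interleave with; InterlockedIncrement performs the same update atomically.
    Local $tCounter = DllStructCreate("long")
    DllStructSetData($tCounter, 1, 0)
    DllCall("kernel32.dll", "long", "InterlockedIncrement", "struct*", $tCounter)
    ConsoleWrite("Counter is now: " & DllStructGetData($tCounter, 1) & @CRLF)   ; -> 1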
  7. This part is null:

    (Get-WmiObject Win32_Product | Where Name -eq "*PowerDirector*")

so the call to the uninstall method will not work. (As an aside: -eq does a literal comparison, not wildcard matching, so "*PowerDirector*" will never match a product name; -like is the wildcard operator.) It's like trying to do this:

    ($null).uninstall()

PowerShell is telling you it is literally impossible to call a member function of something that does not exist. It doesn't even get to the variable assignment part, because evaluating the expression on the right-hand side failed, so what value should it assign? I think the reason it's not "clicking" for you is that it is all written on a single line, while you're used to AutoIt, where everything is written out over many lines, so you're confused about the order in which it runs. Try writing the same thing as you would write it in AutoIt.
  8. I can confirm that the reproducer provided by ferbfletcher does indeed have a memory leak, but only when used with $STDERR_MERGED. We probably should treat this as a bug from here on out, so I have created ticket https://www.autoitscript.com/trac/autoit/ticket/3135 . You can run the following simplified code and observe the memory usage:

    #include <Constants.au3>
    While 1
        $pid = Run("ping localhost", "", @SW_HIDE, $STDERR_MERGED)
        ProcessWait($pid)
        StdioClose($pid)
        ProcessClose($pid)
        ProcessWaitClose($pid)
    WEnd

@ferbfletcher Many thanks for taking the time to provide a reproduction script to get to the bottom of this. It is very much appreciated, and we'll take it from here, assuming your problem is solved adequately.
  9. Don't use lazy quantifiers; use a negated class. For example, to get stuff between quotes use:

    "[^"]+"

This means: match a quote, followed by any number of characters that are not a quote, followed by a quote. (A small usage example follows below.)
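A minimal sketch of that pattern in AutoIt with StringRegExp; flag 3 is the real "return all matches as an array" option, and the sample text is made up:

    Local $sText = 'He said "hello" and later "goodbye" to everyone.'
    Local $aMatches = StringRegExp($sText, '"[^"]+"', 3)
    For $i = 0 To UBound($aMatches) - 1
        ConsoleWrite($aMatches[$i] & @CRLF)   ; -> "hello", then "goodbye"
    Next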
  10. I forgot to escape the +, but yeah, numbers above 1000 do not start with 0.
  11. StdoutRead has a limit on the maximum number of characters returned per call. Suppose this limit is hypothetically 2048 (I don't know what it really is, but 2048 seems reasonable and is found throughout AutoIt), and we only run StdoutRead at a maximum rate of 100x per second because of the 10ms sleep in between. That gives us a maximum of 2048 * 100 bytes per second, which is ~205 KB/s. At 2 billion characters per hour in the input stream, that's roughly 555 KB/s. So, taking that into account, your standard stream is probably filling up around 2.5-3 times faster than you are reading from it, which explains the insane memory usage. What happens if you trigger the scenario you are trying to prevent, where the process does not produce output? Don't just force-close the process; that might cause AutoIt to discard some of the buffer and cause StdoutRead to return empty. What happens to memory in that time? Or does memory usage gradually decrease, and only then does the forced close trigger? You probably should have written something like this in the first place (note the timer is only reset when data actually arrived):

    $pid = Run("mycommand.exe", "", @SW_HIDE, $STDERR_MERGED) ; Read from both stdout and stderr
    $lastData = TimerInit()
    Sleep(100) ; Wait for the process to start producing output
    AdlibRegister("CheckLastStreamOutput", 10)
    While 1
        If StringLen(StdoutRead($pid)) = 0 Then
            Sleep(1)
        Else
            $lastData = TimerInit()
        EndIf
    WEnd
    Func CheckLastStreamOutput()
        If TimerDiff($lastData) > 10 Then ProcessClose($pid)
    EndFunc

But in any case, your change to reduce the input by 99% probably already fixes the issue.
  12. I made a little reproducer but couldn't get the AutoIt (3.3.14.2) process memory above 3 megabytes after a run of 30 minutes, so a short test says it is unlikely to be the problem. However, you never know what happens when you run something for a longer period of time. 25 megabytes total for an AutoIt process seems a bit much in the first place, though. What else is that script doing? The code, as you posted it, kills the process if it has not produced any output for a period of 10 milliseconds. If you remove $STDERR_MERGED, the process you are running produces less data on STDOUT because it no longer contains the error stream. This can result in the process being killed more often. If the memory leak is in the part that cleans up the process or restarts it (do you have such a mechanism?), that can influence the results (though you'd expect the opposite behavior). I would focus my efforts there: first upgrade AutoIt (just in case!) and then, for example, make a test that continuously stops and starts the process (see the sketch below), since a short test says that StdoutRead is probably not leaking memory.
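A minimal sketch of such a stop/start stress test, with "mycommand.exe" as a placeholder for the actual process; watch the AutoIt process's memory in Task Manager while it loops:

    #include <Constants.au3>
    While 1
        $pid = Run("mycommand.exe", "", @SW_HIDE, $STDERR_MERGED)
        Sleep(50)                  ; let it produce a little output
        StdoutRead($pid)           ; drain whatever is buffered
        StdioClose($pid)
        ProcessClose($pid)
        ProcessWaitClose($pid)     ; make sure it is fully gone before the next round
    WEnd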
  13. If you want, you can include the .csproj and .sln files too, but those are not strictly necessary. We should be able to get it to run from the .cs files alone. Maybe some .resx or something too.
  14. Since no one else can reproduce this, can you provide a small reproducer? I think a lot of people are very curiously reading this thread and are dying for one. "OP please deliver"