Leaderboard

Popular Content

Showing content with the highest reputation on 04/13/2018 in all areas

  1. Ok, let's go. First, notice the difference between three distinct things: a character set (the assignment of a numeric value [a codepoint] to a character name), the representation of those characters in memory or in transit on a network (the encoding), and the rendering of a given character (its visual aspect, or final rendered glyph) in a given font.

AutoIt currently uses the UCS2 character set, a subset of the full Unicode standard, where every character is represented by a single 16-bit coding unit. This encoding gives access to what is called the Unicode BMP (Basic Multilingual Plane) of 64K characters, enough for most purposes in common practice.

Full Unicode can use several encodings:
    UTF8 (multibyte encoding, coding unit = one byte, some characters need 4 bytes)
    UTF16-LE (multi-word encoding, coding unit = one 16-bit word in little endian representation, some characters need 2 encoding units)
    UTF16-BE (multi-word encoding, coding unit = one 16-bit word in big endian representation, some characters need 2 encoding units)
    UTF32-LE (fixed-width encoding, coding unit = one 32-bit word in little endian representation, every character needs exactly 1 encoding unit)
    UTF32-BE (fixed-width encoding, coding unit = one 32-bit word in big endian representation, every character needs exactly 1 encoding unit)

Native AutoIt strings use UCS2 in UTF16-LE encoding (limited to one single 16-bit word per character). Windows has been using the full Unicode range in UTF16-LE encoding natively for a long time. Yet characters above the BMP (whose codepoints are greater than 0xFFFF) are rather rare and difficult to display, as very few fonts are capable of doing so and no known font covers the full Unicode range (far from it). Of course most users don't need to display Egyptian hieroglyphs, dominoes, mahjong tiles, antique musical symbols and the like on a daily basis.
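To make the "2 encoding units" case concrete, here is a minimal sketch of how a codepoint above the BMP splits into two 16-bit units (a surrogate pair) in UTF16. The helper name _SurrogatePair is made up for illustration and is not an AutoIt built-in; the arithmetic is the standard UTF16 rule.

```autoit
; Hypothetical helper: build the UTF16 surrogate pair for a codepoint above 0xFFFF.
Func _SurrogatePair($iCodepoint)
    Local $iOffset = $iCodepoint - 0x10000            ; 20 significant bits remain
    Local $iHigh = 0xD800 + BitShift($iOffset, 10)    ; top 10 bits (positive shift = shift right)
    Local $iLow = 0xDC00 + BitAND($iOffset, 0x3FF)    ; low 10 bits
    Return ChrW($iHigh) & ChrW($iLow)
EndFunc   ;==>_SurrogatePair

Local $s = _SurrogatePair(0x2F834)
; AutoIt counts 16-bit units, so this reports 2 for a single Unicode character
ConsoleWrite("Length: " & StringLen($s) & @LF)
; Prints the two units: D87E DC34
ConsoleWrite(Hex(AscW(StringMid($s, 1, 1)), 4) & " " & Hex(AscW(StringMid($s, 2, 1)), 4) & @LF)
```

The reported length of 2 is exactly the UCS2 limitation described above: the string functions see two 16-bit words, not one character.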
Refer to https://en.wikipedia.org/wiki/Plane_(Unicode) for a helicopter view of what is in the various Unicode planes. You'll see that the characters in the BMP alone already cover a large range of use cases worldwide. That's why, even with the limitation introduced by UCS2 (limited to the BMP), you can process a wide variety of texts. This is why I said "well, almost". That "almost" is good enough for 99% of users.

A very good site centered on Unicode is https://r12a.github.io/scripts/tutorial/index by one of the early Unicode design engineers. Don't miss anything: there are lots of things to learn and discover about human languages and cultures. The apps tab is very useful too.

Sidenote: even if UCS2 limits us to the BMP, we can still create strings in AutoIt that contain characters outside the BMP. We can for instance represent the Unicode character U+2F834 ("CJK COMPATIBILITY IDEOGRAPH-2F834"), which encodes as 0xD87E 0xDC34 in UTF16-LE, by coding it ChrW(0xD87E) & ChrW(0xDC34), but string functions will AFAIK consider that as a string of two (completely unrelated) characters.

Still there? Now what happens with $s = "Joe" & Chr(0x92) & " garage" is a good question. Frankly ... I don't know exactly! Run this code:

    #include <Array.au3>

    Local $c = Chr(0x92)
    ConsoleWrite(VarGetType($c) & @TAB & Binary($c) & @LF)
    $c = "A" & $c & "B"
    Local $a = StringRegExp($c, ".", 3)
    _ArrayDisplay($a)
    _ArrayDisplay(StringToASCIIArray($c))

You'll see that the last _ArrayDisplay tells us that Chr(0x92) is now the Unicode codepoint 8217 (decimal) or 0x2019. That's because the ANSI character 0x92 in my Latin1 extended ASCII codepage is mapped to the Unicode character U+2019. Unicode codepoint 0x92 is a control character named "PRIVATE USE TWO". So in fact AutoIt does a good job of not changing the character in the string. It's only that extended ANSI codes do not always map to the same Unicode codepoint.
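As a further check, the same character can be dumped under each encoding with StringToBinary. This is only a sketch: the values printed depend on the ANSI codepage in use, and the mapping of 0x92 to U+2019 is what I would expect on a Windows-1252 (western) system, not something guaranteed elsewhere.

```autoit
; Sketch: show how one character is stored under different encodings.
; Assumption: the active ANSI codepage decides what Chr(0x92) becomes.
Local $c = Chr(0x92)
ConsoleWrite("Codepoint: U+" & Hex(AscW($c), 4) & @LF)
ConsoleWrite("ANSI:      " & StringToBinary($c, 1) & @LF) ; flag 1 = ANSI
ConsoleWrite("UTF16-LE:  " & StringToBinary($c, 2) & @LF) ; flag 2 = UTF16 little endian
ConsoleWrite("UTF16-BE:  " & StringToBinary($c, 3) & @LF) ; flag 3 = UTF16 big endian
ConsoleWrite("UTF8:      " & StringToBinary($c, 4) & @LF) ; flag 4 = UTF8
```

Comparing the ANSI line with the UTF16 lines makes the codepage mapping visible: one byte on the ANSI side, a different 16-bit value on the Unicode side.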
Insiders (Devs, @trancexx, mediums) can jump in here to explain in gory detail how strings work in AutoIt, but let me offer an "out of thin air" explanation: the string $c is kept in non-concatenated form (part Unicode, part ANSI) until it's passed to some functions (which?), at which time the non-Unicode part(s) are converted into their Unicode mapping.

Conclusion: you see that while Unicode has solved the portability issue of texts in all human scripts ever used (and more: there is even a Klingon group), it also introduces some difficulties, and we've only scratched a very thin layer off a large surface here.

To see the differences between Unicode and the specific ANSI codepage you use, go to https://r12a.github.io/uniview/ and select the Latin, Basic & Latin1 supplement block, then compare the content to the Windows Charset applet in your codepage mode. There you can click on a given character to view its Unicode properties and definitions. Latin1 ANSI ("western ANSI") differs only in the range 0x80-0x9F, but your codepage may differ more.

I hope this answers some of your questions and I apologize for having lost dozens of readers. Let them RIP.
    2 points
  2. Documentation for SafeArrayDisplay added to post 5 above. Code for SafeArrayDisplay added to zip-file in bottom of first post. Thanks for all the feedback. Always nice with some feedback.
    1 point
  3. ISI360

    ISN AutoIt Studio

    Currently...no. But you can completely disable the window (in the program settings).
    1 point
  4. If the subject string actually contains extended ASCII characters of some ANSI codepage that are not mapped identically in Unicode, then they can only be filtered in/out using their Unicode codepoint value or directly using the literal character itself. Re-using the OP code, both examples below work:

    #include <Array.au3>

    $buf = "First title" & @CRLF & "Tom" & Chr(0x92) & "s sleepwalking" & @CRLF & "Last | line" & @CRLF
    ; Codepoints above 0xFF need the braced form \x{hhhh}
    $items = StringRegExp($buf, '([\x20-\xff\x{2019}]+)\x0d\x0a', 3)
    _ArrayDisplay($items, '')
    $items = StringRegExp($buf, '([\x20-\xff’]+)\x0d\x0a', 3)
    _ArrayDisplay($items, '')

Of course, if the goal is to split the input on line breaks, using a regexp like this is both overkill and prone to failure. StringSplit would do the job fine, as would

    $items = StringRegExp($buf, '(.*)\R', 3)

Last note: given that many distinct single or double Unicode quotes, apostrophes and primes (and accents and diacritics !!!) look very similar despite having different codepoints, I recommend being very careful not to confuse them. To wit (and I didn't put them all): '´ʹʼʽʻʾʿ̀`ՙ՝᾿῀´῾‘’‛“”‟′″‴‵‶‷ Have fun telling which is which just by looking!
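Since the eye cannot tell these characters apart, one way to find out which is which is to print their codepoints. A minimal sketch follows; _DumpCodepoints is a made-up helper name, and the four sample characters are built with ChrW so the result does not depend on how the script file is saved.

```autoit
; Hypothetical helper: print the codepoint of every 16-bit unit in a string.
Func _DumpCodepoints($sText)
    Local $aCodes = StringToASCIIArray($sText) ; one array element per 16-bit unit
    For $i = 0 To UBound($aCodes) - 1
        ConsoleWrite(($i + 1) & @TAB & "U+" & Hex($aCodes[$i], 4) & @LF)
    Next
EndFunc   ;==>_DumpCodepoints

; Sample input: ASCII apostrophe, right single quotation mark,
; modifier letter apostrophe, prime
_DumpCodepoints("'" & ChrW(0x2019) & ChrW(0x02BC) & ChrW(0x2032))
```

The same helper can be pointed at any suspicious string pasted from a web page or a document before writing a regexp against it.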
    1 point
  5. _WinAPI_WindowFromPoint _WinAPI_GetClassName ControlCommand ControlGetText etc..
    1 point
  6. So you want to automate an automation tool... How about a detailed explanation of what you're trying to do, rather than the "kinda like this, something like a dll explanation"? The more we know the better we can assist.
    1 point
  7. Does this actually work? You seem to overwrite your $MasterArray each time you loop through the file list. Also, you need to identify which dimension to change: currently you're using 1d rather than 2d when expanding the columns. So it should be something like:

    #include <Array.au3>
    #include <File.au3>

    Global $aMasterArray
    RefineData()

    Func RefineData()
        Local $aCsvFile, $sFilePath = @ScriptDir
        ; Create an array of all .csv files within folder
        Local $aFileList = _FileListToArrayRec($sFilePath, "*.csv", $FLTA_FILES, $FLTAR_NORECUR, $FLTAR_NOSORT, $FLTAR_FULLPATH)
        ;=====Loop through the .csv files within the folder======
        For $i = 1 To $aFileList[0]
            ;=====Create array based on csv file=====
            _FileReadToArray($aFileList[$i], $aCsvFile, $FRTA_NOCOUNT, ",")
            If $i = 1 Then
                $aMasterArray = $aCsvFile
                _ArrayDisplay($aMasterArray, "Master")
            Else
                _ArrayColInsert($aMasterArray, UBound($aMasterArray, 2)) ; want column added at end
                For $j = 0 To UBound($aMasterArray) - 1
                    $aMasterArray[$j][UBound($aMasterArray, 2) - 1] = $aCsvFile[$j][4]
                Next
                _ArrayDisplay($aMasterArray, "Master")
            EndIf
        Next
    EndFunc   ;==>RefineData
    1 point
  8. See what $sFilePath is after this line on each iteration of the loop. That's your problem:

    $sFilePath = $sFilePath & "\" & $file
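A minimal sketch of the usual way around this, assuming the loop walks a list of file names: leave the base folder untouched and build each full path in a separate variable. The names $sFolder, $aFiles and $sFullPath and the sample file names are placeholders, not taken from the original script.

```autoit
; Assumed context: a base folder and a list of file names to visit.
Local $sFolder = @ScriptDir
Local $aFiles[3] = ["a.csv", "b.csv", "c.csv"]

For $i = 0 To UBound($aFiles) - 1
    ; Build the path fresh each time; $sFolder itself is never modified,
    ; so earlier file names do not pile up in later paths
    Local $sFullPath = $sFolder & "\" & $aFiles[$i]
    ConsoleWrite($sFullPath & @LF)
Next
```

Printing the path inside the loop, as suggested above, is also the quickest way to confirm the fix.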
    1 point
  9. This is basically two functions (plus supporting infrastructure):

    _Xbase_ReadToArray($filename, Byref $array, [...])
    _Xbase_WriteFromArray($filename, Byref $array, [...])

that transfer all data from one container to the other. Various optional formatting parameters are detailed in the Remarks section of the script; a small test script plus dbf test file (the latter from the (free) Harbour distribution, see here) are provided for your entertainment. Note that the $array variable has to exist already, as it's passed ByRef, but it does not have to have the correct dimensions.

This is pure AutoIt (no SQL, no ADO, no dlls, no external dependencies). The Xbase specification was gleaned from here. There is no support (either currently or planned) for a GUI or additional functionality; it's just a simple data interface I needed for my MatrixFileConverter for Eigen4Autoit (link in signature) that might be of use to others (see MatrixFileConverter.au3 for an implementation example).

One thing to keep in mind is AutoIt's array size limitation (about 16 million elements); Xbase files can be considerably larger. On the other hand, AutoIt arrays are less restricted in their number of columns than (certain versions of) Xbase (see script for details). Appropriate error messages will inform you when you've hit these limits.

Xbase.v0.8.7z third beta release
    1 point
  10. water

    Active Directory UDF

    I have converted and extended the adfunctions.au3 written by Jonathan Clelland to a full AutoIt UDF including help file, examples, SciTE integration etc. The example scripts should run fine without changes.
    2016-08-18: Version: 1.4.6.0
    As always: Please test before using in production!
    KNOWN BUGS: (Last changed: ) None
    AD 1.4.6.0.zip For AutoIt >= 3.3.12.0
    AD 1.4.0.0.zip other versions of AutoIt
    1 point