Thopaga Posted August 18, 2019

Hi. I cannot get the functions _HexToString and _StringToHex to work anymore. Does anyone know if there has been a change lately? Is there any problem with the extended character set? Thanks.

Here is an example of my code:

    #include <MsgBoxConstants.au3>
    #include <String.au3>

    Local $sline01 = "", $sline02 = "", $sline03 = ""
    $sline01 = _HexToString("C6D8C5E6F8E5")
    $sline02 = _StringToHex("ÆØÅæøå")
    $sline03 = _StringToHex("ABCabc")
    MsgBox(0, "Convert Hex and string", $sline01 & @CRLF & $sline02 & @CRLF & $sline03)

Jury Posted August 18, 2019 (edited)

    $sline01 = _HexToString("C386C398C385C3A6C3B8C3A5")

"C6D8C5E6F8E5" is Windows 1252 encoding.

jchd Posted August 18, 2019

"C6D8C5E6F8E5" is a string of hex values of the extended ASCII chars in the (non-native) string "ÆØÅæøå".

"ÆØÅæøå" is a Unicode string whose UTF8 representation is "C386C398C385C3A6C3B8C3A5".

"ABCabc" is a Unicode string whose UTF8 representation is "414243616263".

What do you actually have to process and what do you expect to obtain?

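The difference between the two hex strings above can be checked with the built-in StringToBinary and BinaryToString functions. The following is only a minimal sketch: it assumes that flag 1 selects the ANSI code page currently active on the machine and flag 4 selects UTF-8 (the same numeric flags used elsewhere in this thread), and it builds the test string from code points so that the encoding of the script file itself does not matter. The ANSI line prints C6D8C5E6F8E5 only on a system whose 8-bit code page is Windows-1252.

```autoit
; Build "ÆØÅæøå" from Unicode code points instead of typing the literal.
Local $sText = ChrW(0xC6) & ChrW(0xD8) & ChrW(0xC5) & ChrW(0xE6) & ChrW(0xF8) & ChrW(0xE5)

; Flag 1 = ANSI (active Windows code page), flag 4 = UTF-8.
ConsoleWrite("ANSI : " & Hex(StringToBinary($sText, 1)) & @CRLF)
ConsoleWrite("UTF-8: " & Hex(StringToBinary($sText, 4)) & @CRLF)

; Decode the hex string from the first post as ANSI rather than UTF-8.
Local $sDecoded = BinaryToString(Binary("0xC6D8C5E6F8E5"), 1)
ConsoleWrite("Matches the original text: " & ($sDecoded == $sText) & @CRLF)
```

Decoding with flag 1 ties the result to the local code page, so the same script can give different text on a machine configured for, say, Cyrillic.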
Nine Posted August 18, 2019

You would need to use this:

    #include <MsgBoxConstants.au3>

    MsgBox($MB_SYSTEMMODAL, "", Chr(Dec("C6")) & Chr(Dec("D8")) & Chr(Dec("C5")) & Chr(Dec("E6")) & Chr(Dec("F8")) & Chr(Dec("E5")))

Thopaga (Author) Posted August 18, 2019 (edited)

Thanks @Jury, @jchd and @Nine for your kind help - GENIUS PEOPLE.

@Jury thanks, that made it very clear for me. I thought _HexToString and _StringToHex would work with Windows 1252 encoding.

@jchd thanks, Unicode strings are interesting to know about. So that means only those who use the English alphabet will have any use for these functions.

@Nine thanks, Chr(Dec("C6")) and Hex(Asc("Æ"), 2) seem to work perfectly.

jchd Posted August 19, 2019 (edited)

AutoIt strings are Unicode. In fact, not exactly: more precisely, AutoIt currently uses the UCS2 character set, the subset of Unicode limited to the BMP (Basic Multilingual Plane). That means that every character in a native AutoIt string is represented by one 16-bit encoding unit.

Windows uses the Unicode charset. Both AutoIt and Windows use UTF16-LE (AutoIt being limited to codepoints 0x0000..0xFFFF).

Asc() and Chr() use extended ASCII with the locale setting (the currently active Windows 8-bit charset), and this can be Latin1, Cyrillic, whatever. This is non-portable.

AscW() and ChrW() use Unicode (or rather UCS2).

Also look up StringToBinary and BinaryToString in the help.

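As a small illustration of the Asc()/AscW() distinction described above, here is a sketch that prints both values for the same character. AscW() returns the Unicode code point, so the first line prints 00C6 everywhere; the Asc() value depends on the active 8-bit code page and is therefore left without an expected result.

```autoit
; Æ, built from its Unicode code point so the script file's encoding is irrelevant.
Local $sChar = ChrW(0x00C6)

; AscW() reports the UCS2 code point, independent of locale.
ConsoleWrite("AscW: " & Hex(AscW($sChar), 4) & @CRLF)

; Asc() reports the value in the currently active 8-bit code page (non-portable).
ConsoleWrite("Asc : " & Hex(Asc($sChar), 2) & @CRLF)
```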
Nine Posted August 19, 2019

@jchd I am trying to match the right character set with the actual result of:

    MsgBox($MB_SYSTEMMODAL, "", _StringToHex("ÆØÅæøå"))

Looking at http://www.columbia.edu/kermit/ucs2.html, the result of the MsgBox does not correspond to the values found in that chart. This site, https://unicode.org/roadmaps/bmp/, gives the same chart as the previous one. The value returned by _StringToHex points to some Asian syllables, not the Scandinavian letters. What am I missing here?

Thopaga (Author) Posted August 19, 2019 (edited)

Hi @Nine. @jchd will be the one to clarify everything. Here are a few links to get you started:

http://www.ltg.ed.ac.uk/~richard/utf-8.cgi?input=Æ&mode=char
This tells us that the character Æ has UTF-8 = C386 and hex code point = 00C6.

http://www.columbia.edu/kermit/ucs2.html
Here, the code point 00C6 from above corresponds to Æ.

My understanding is that the UTF-8 chart and the UCS-2 chart are two different things but in some way related.

http://www.fileformat.info/info/charset/UTF-8/list.htm

Let us wait for @jchd to explain everything.

Jos (Developers) Posted August 19, 2019 (edited)

Some technical details to explain the difference between UCS-2/UTF16 and UTF8. To convert between them, this table is used:

Quote:

    Bytes  Bits  Hex Minimum  Hex Maximum  Byte Sequence in Binary
    1      7     00000000     0000007F     0xxxxxxx
    2      11    00000080     000007FF     110xxxxx 10xxxxxx
    3      16    00000800     0000FFFF     1110xxxx 10xxxxxx 10xxxxxx

So when you take Æ => 00C6 => 0000 0000 1100 0110, you need to use the 2-byte conversion: 110xxxxx 10xxxxxx

Just replace the 6 x's of byte 2 in that mask with the last 6 bits of the Æ code point -> 000110, and replace the 5 x's of byte 1 in that mask with the 5 bits in front of those 6 -> 00011.

Giving: 11000011 10000110, which is equal to hex C386.

Is this making any sense?

Some code to show the converted hex values (flag 1 = ANSI, 2 = UTF16 little endian, 3 = UTF16 big endian, 4 = UTF8):

    $sString = "Æ"
    ConsoleWrite('StringToBinary($sString, 1) = ' & StringToBinary($sString, 1) & @CRLF)
    ConsoleWrite('StringToBinary($sString, 2) = ' & StringToBinary($sString, 2) & @CRLF)
    ConsoleWrite('StringToBinary($sString, 3) = ' & StringToBinary($sString, 3) & @CRLF)
    ConsoleWrite('StringToBinary($sString, 4) = ' & StringToBinary($sString, 4) & @CRLF)

Jos

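The two-byte row of the table above can be written out with AutoIt's bit functions. This is only a sketch of that single row: Utf8TwoByte is a hypothetical helper name, not part of AutoIt or its UDFs, and it assumes the code point lies in the range 0x80..0x7FF. The last line calls the built-in StringToBinary with flag 4 (UTF-8) for comparison.

```autoit
; Hypothetical helper: encode a code point in 0x80..0x7FF as two UTF-8 bytes,
; following the mask 110xxxxx 10xxxxxx.
Func Utf8TwoByte($iCodePoint)
    ; Byte 1: marker bits 110 followed by the upper 5 bits (shift right by 6).
    Local $iByte1 = BitOR(0xC0, BitShift($iCodePoint, 6))
    ; Byte 2: marker bits 10 followed by the lower 6 bits.
    Local $iByte2 = BitOR(0x80, BitAND($iCodePoint, 0x3F))
    Return Hex($iByte1, 2) & Hex($iByte2, 2)
EndFunc

ConsoleWrite(Utf8TwoByte(0x00C6) & @CRLF) ; Æ -> C386
ConsoleWrite(Hex(StringToBinary(ChrW(0x00C6), 4)) & @CRLF) ; built-in UTF-8 conversion
```

Code points below 0x80 or above 0x7FF need the one-byte or three-byte rows instead, which this helper does not handle.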
jchd Posted August 19, 2019

UCS2 is just Unicode limited to codepoints 0x0000..0xFFFF.

UTF8 isn't a character set, but one of the possible encodings of Unicode codepoints. Other encodings in use: UTF16-LE (used by Windows), UTF16-BE, UTF32-LE and UTF32-BE. UTF16 uses 16-bit encoding units, UTF32 uses 32-bit encoding units and of course UTF8 uses bytes. -LE and -BE refer to the endianness of the encoding units. UTF7 is long deprecated.

Clearly, UCS2 uses 16-bit encoding units and needs only one per codepoint. UTF16 uses 1 to 2 16-bit words per codepoint, while UTF8 needs from 1 to 4 bytes per codepoint.

Nine Posted August 19, 2019

3 hours ago, Jos said:
    Is this making any sense?

Yep, I wouldn't have found it even if my life was depending on it.