#3896 closed Bug (No Bug)
Single non-ANSI character failure in BinaryToString
Reported by: | anonymous | Owned by: | |
---|---|---|---|
Milestone: | | Component: | AutoIt |
Version: | 3.3.16.0 | Severity: | None |
Keywords: | | Cc: | |
Description
Single character failure in BinaryToString.
BinaryToString appends an extra 0x0000 when converting a single non-ANSI character code to a UTF-8 string.
BinaryToString converts a longer character code correctly.
Demo script included.
Attachments (3)
Change History (7)
Changed 3 years ago by anonymous
comment:1 Changed 3 years ago by anonymous
Changed 3 years ago by anonymous
Changed 3 years ago by anonymous
comment:2 Changed 3 years ago by Jpm
- Resolution set to No Bug
- Status changed from new to closed
Hi,
Your script is wrong, because $single_r_hacek is not a "Binary" when it is used in BinaryToString().
At the least it must be StringToBinary(ChrW(0x99C5)).
After that, the write to file has to be changed too.
If you try a triple with your script, you get a wrong result too.
comment:3 Changed 3 years ago by jchd18
You were trying to feed UTF8 hex to wrong function(s).
; CW() is a Unicode-aware ConsoleWrite()
Local $single_r_hacek = ChrW(0x159) ; UTF16-LE code ř U+0159 LATIN SMALL LETTER R WITH CARON, NOT in ANSI
CW($single_r_hacek & @TAB & StringLen($single_r_hacek))
MsgBox(0, "", $single_r_hacek)
; making it simpler
$single_r_hacek = 'ř'
CW($single_r_hacek & @TAB & StringLen($single_r_hacek))
MsgBox(0, "", $single_r_hacek)
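For readers who do want the UTF-8 bytes, the round trip hinted at in the comments above might look like the following. This is a minimal sketch, not code from the ticket: the variable names are illustrative, and it avoids the CW() helper (which is not shown in the ticket) by printing only ASCII-safe values.

```autoit
; Sketch only: variable names are illustrative, not taken from the ticket.
Local $sChar = ChrW(0x159)                 ; ř as a native AutoIt string
Local $dUtf8 = StringToBinary($sChar, 4)   ; flag 4 = encode as UTF-8
ConsoleWrite(VarGetType($dUtf8) & @CRLF)   ; Binary
ConsoleWrite(String($dUtf8) & @CRLF)       ; hex form of the UTF-8 bytes (C5 99 for ř)
Local $sBack = BinaryToString($dUtf8, 4)   ; decode the UTF-8 bytes again
ConsoleWrite(StringLen($sBack) & @CRLF)    ; 1
ConsoleWrite(($sBack == $sChar) & @CRLF)   ; True
```

The point of the sketch is that BinaryToString() is only the inverse of StringToBinary() when its input really is Binary data encoded with the same flag.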
comment:4 Changed 3 years ago by anonymous
Thanks,
I expected
$single_r_hacek = 0x99C5
to return a Binary, but Binary() is required to do so:
$single_r_hacek = 0x99C5
ConsoleWrite(VarGetType($single_r_hacek)) ; shows Int32, 4 bytes, final 00 marking EOT,
hence StringLen returned 3.
$single_r_hacek = Binary("0xC599")
ConsoleWrite(VarGetType($single_r_hacek)) ; shows Binary
$UTF8_single_r_hacek = BinaryToString($single_r_hacek, 4)
ConsoleWrite(StringLen($UTF8_single_r_hacek)) ; correct output 1
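The difference between the integer literal and the Binary value can be made visible by inspecting the bytes directly. The sketch below is an illustration under the assumption, consistent with the 0xC5990000 reported in the demo script, that an Int32 is converted to four little-endian bytes; the variable names are hypothetical.

```autoit
; Sketch only: shows why the integer literal decodes to 3 characters.
Local $iValue = 0x99C5                              ; integer literal, not Binary data
ConsoleWrite(VarGetType($iValue) & @CRLF)           ; Int32
ConsoleWrite(String(Binary($iValue)) & @CRLF)       ; 0xC5990000, as reported in the demo script
ConsoleWrite(BinaryLen(Binary($iValue)) & @CRLF)    ; 4 bytes: C5 99 plus two zero bytes

; Decoding all 4 bytes as UTF-8 yields ř followed by two NUL characters.
ConsoleWrite(StringLen(BinaryToString($iValue, 4)) & @CRLF)  ; 3, as reported in the ticket

; Exactly two bytes of Binary data decode to a single character.
Local $dBytes = Binary("0xC599")
ConsoleWrite(BinaryLen($dBytes) & @CRLF)                     ; 2
ConsoleWrite(StringLen(BinaryToString($dBytes, 4)) & @CRLF)  ; 1
```

This would also explain why the double case appeared to work: 0x99C599C5 fills all four bytes of the Int32, so no padding zero bytes are left over to be decoded.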
#cs ----------------------------------------------------------------------------
#ce ----------------------------------------------------------------------------
; Script Start - Add your code below here
; Bug in BinaryToString with single non-ANSI character
$single_r_hacek = 0x99C5 ; LE UTF-code C599 for ř U+0159 LATIN SMALL LETTER R WITH CARON, NOT in ANSI
$UTF8_single_r_hacek = BinaryToString($single_r_hacek, 4) ; convert to UTF-8 string
ConsoleWrite(StringLen($UTF8_single_r_hacek)) ; gives WRONG output 3 instead of 1
MsgBox(0, "", $UTF8_single_r_hacek) ; correct ř output
FileWrite("single_test.txt", $UTF8_single_r_hacek) ; WRONG 0xC5990000 in file
$binfile = FileOpen("single_test_bin.txt", 16 + 2)
FileWrite($binfile, $UTF8_single_r_hacek) ; WRONG 0x720000 in binary file (ANSI/ASCII r)
FileClose($binfile)
$double_r_hacek = 0x99C599C5
$UTF8_double_r_hacek = BinaryToString($double_r_hacek, 4) ; convert to UTF-8 string
ConsoleWrite(StringLen($UTF8_double_r_hacek)) ; gives correct output 2
MsgBox(0, "", $UTF8_double_r_hacek) ; correct řř output
FileWrite("double_test.txt", $UTF8_double_r_hacek) ; correct 0xC599C599 in file
$binfile = FileOpen("double_test_bin.txt", 16 + 2)
FileWrite($binfile, $UTF8_double_r_hacek) ; WRONG 0x7272 in binary file
FileClose($binfile)