#3896 closed Bug (No Bug)
Single non-ANSI character failure in BinaryToString
| Reported by: | anonymous | Owned by: | |
|---|---|---|---|
| Milestone: | | Component: | AutoIt |
| Version: | 3.3.16.0 | Severity: | None |
| Keywords: | | Cc: | |
Description
Single character failure in BinaryToString.
BinaryToString appends an extra 0x0000 when converting a single non-ANSI character code to a UTF-8 string.
BinaryToString converts a longer character code correctly.
Demo script included.
Attachments (3)
Change History (7)
by , 4 years ago
| Attachment: | Bug in BinaryToString for single non-ANSI character.au3 added |
|---|
comment:1 by , 4 years ago
by , 4 years ago
| Attachment: | Bug in BinaryToString for single non-ANSI character.2.au3 added |
|---|
by , 4 years ago
| Attachment: | Bug in BinaryToString for single non-ANSI character (correct file).au3 added |
|---|
comment:2 by , 4 years ago
| Resolution: | → No Bug |
|---|---|
| Status: | new → closed |
Hi,
Your script is wrong:
$single_r_hacek is not a "Binary" when it is used in BinaryToString().
At the least it must be StringToBinary(ChrW(0x99C5)).
After that, the write to file has to be changed too.
If you try a triple with your script, you get a wrong result too.
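The point of this comment can be illustrated with a minimal round-trip sketch. It assumes the intended character is ř (U+0159, as named in the demo script) rather than the code point written in the comment, and it assumes flag 4 selects UTF-8 in both StringToBinary() and BinaryToString(). The variable names are illustrative only.

```autoit
; Sketch: produce real Binary data before calling BinaryToString().
; Assumption: the target character is U+0159 and flag 4 means UTF-8 in both calls.
Local $sChar = ChrW(0x159)                 ; string holding the single character
Local $dUtf8 = StringToBinary($sChar, 4)   ; encode the string as UTF-8 bytes
ConsoleWrite(VarGetType($dUtf8) & @CRLF)   ; type of the encoded data
Local $sBack = BinaryToString($dUtf8, 4)   ; decode the UTF-8 bytes back to a string
ConsoleWrite(StringLen($sBack) & @CRLF)    ; length of the round-tripped string
ConsoleWrite(($sBack == $sChar) & @CRLF)   ; case-sensitive comparison with the original
```

Because the input to BinaryToString() is built by StringToBinary(), it contains only the bytes of the character itself and no padding bytes from an integer.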
comment:3 by , 4 years ago
You were trying to feed UTF-8 hex to the wrong function(s).
; CW() is a Unicode-aware ConsoleWrite()
Local $single_r_hacek = ChrW(0x159) ; UTF16-LE code ř U+0159 LATIN SMALL LETTER R WITH CARON, NOT in ANSI
CW($single_r_hacek & @TAB & StringLen($single_r_hacek))
MsgBox(0, "", $single_r_hacek)
; making it simpler
$single_r_hacek = 'ř'
CW($single_r_hacek & @TAB & StringLen($single_r_hacek))
MsgBox(0, "", $single_r_hacek)
comment:4 by , 4 years ago
Thanks,
I expected
$single_r_hacek = 0x99C5
to return a Binary, but Binary() is required to do so:
$single_r_hacek = 0x99C5
ConsoleWrite(VarGetType($single_r_hacek)) ; shows Int32: 4 bytes (C5 99 00 00), the two trailing 00 bytes are converted as well, hence StringLen returned 3
$single_r_hacek = Binary("0xC599")
ConsoleWrite(VarGetType($single_r_hacek)) ; shows Binary
$UTF8_single_r_hacek = BinaryToString($single_r_hacek, 4)
ConsoleWrite(StringLen($UTF8_single_r_hacek)) ; correct output 1
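The difference between the two literals can be inspected directly. The following is a minimal sketch, not part of the original attachments; the variable names are illustrative. It assumes that a 32-bit integer is laid out in little-endian byte order when viewed as Binary, which is consistent with the 0xC5990000 file content reported in the demo script.

```autoit
; Sketch: compare how the two literals are typed and which bytes they carry.
Local $iValue = 0x99C5             ; a hex literal is an integer, not Binary
Local $dValue = Binary("0xC599")   ; a hex string passed to Binary() yields two bytes
ConsoleWrite(VarGetType($iValue) & @CRLF)  ; Int32, as reported in this comment
ConsoleWrite(VarGetType($dValue) & @CRLF)  ; Binary, as reported in this comment
ConsoleWrite(BinaryLen($dValue) & @CRLF)   ; number of bytes in the Binary variant
ConsoleWrite(Binary($iValue) & @CRLF)      ; byte view of the integer, padding bytes included
```

Under that assumption the integer carries four bytes, of which only the first two belong to ř; decoding all four as UTF-8 produces ř followed by two NUL characters, which would explain the string length of 3.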

; Bug in BinaryToString with single non-ANSI character
$single_r_hacek = 0x99C5 ; LE UTF-code C599 for ř U+0159 LATIN SMALL LETTER R WITH CARON, NOT in ANSI
$UTF8_single_r_hacek = BinaryToString($single_r_hacek, 4) ; convert to UTF-8 string
ConsoleWrite(StringLen($UTF8_single_r_hacek)) ; gives WRONG output 3 instead of 1
MsgBox(0, "", $UTF8_single_r_hacek) ; correct ř output
FileWrite("single_test.txt", $UTF8_single_r_hacek) ; WRONG 0xC5990000 in file
$binfile = FileOpen("single_test_bin.txt", 16 + 2)
FileWrite($binfile, $UTF8_single_r_hacek) ; WRONG 0x720000 in binary file (ANSI/ASCII r)
FileClose($binfile)
$double_r_hacek = 0x99C599C5
$UTF8_double_r_hacek = BinaryToString($double_r_hacek, 4) ; convert to UTF-8 string
ConsoleWrite(StringLen($UTF8_double_r_hacek)) ; gives correct output 2
MsgBox(0, "", $UTF8_double_r_hacek) ; correct řř output
FileWrite("double_test.txt", $UTF8_double_r_hacek) ; correct 0xC599C599 in file
$binfile = FileOpen("double_test_bin.txt", 16 + 2)
FileWrite($binfile, $UTF8_double_r_hacek) ; WRONG 0x7272 in binary file
FileClose($binfile)
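Comment:2 notes that the file write has to change once the input is real Binary data. A hedged sketch of one way to do that for the binary-mode file is shown below. It reuses the file name from the demo script, assumes FileOpen() mode 16 + 2 means binary plus overwrite as in the original, and uses illustrative variable names.

```autoit
; Sketch only: write the UTF-8 bytes of ř to a file opened in binary mode.
Local $dUtf8 = Binary("0xC599")                        ; UTF-8 bytes of U+0159, per comment:4
Local $hFile = FileOpen("single_test_bin.txt", 16 + 2) ; 16 = binary, 2 = overwrite
If $hFile = -1 Then Exit 1                             ; stop if the file could not be opened
FileWrite($hFile, $dUtf8)                              ; pass the Binary data, not the decoded string
FileClose($hFile)
```

Passing the Binary variable rather than the decoded string is intended to avoid the string being converted to ANSI before it is written, which is how the demo script ends up with a plain r in the file.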