Opened 4 years ago

Closed 4 years ago

Last modified 4 years ago

#3896 closed Bug (No Bug)

Single non-ANSI character failure in BinaryToString

Reported by: anonymous
Owned by:
Milestone:
Component: AutoIt
Version: 3.3.16.0
Severity: None
Keywords:
Cc:

Description

BinaryToString fails on a single character.
When given the UTF-8 code of a single non-ANSI character (flag 4), BinaryToString appends an extra 0x0000 to the result.
With a longer character code, BinaryToString converts correctly.

Demo script included.

Change History (7)

comment:1 by anonymous, 4 years ago

#cs ----------------------------------------------------------------------------

AutoIt Version: 3.3.16.0
Author: myName

Script Function:

Template AutoIt script.

#ce ----------------------------------------------------------------------------

; Script Start - Add your code below here

; Bug in BinaryToString with single non-ANSI character
$single_r_hacek = 0x99C5 ; LE UTF-code C599 for ř U+0159 LATIN SMALL LETTER R WITH CARON, NOT in ANSI
$UTF8_single_r_hacek = BinaryToString($single_r_hacek, 4) ; convert to UTF-8 string
ConsoleWrite(StringLen($UTF8_single_r_hacek)) ; gives WRONG output 3 instead of 1
MsgBox(0, "", $UTF8_single_r_hacek) ; correct ř output
FileWrite("single_test.txt", $UTF8_single_r_hacek) ; WRONG 0xC5990000 in file
$binfile = FileOpen("single_test_bin.txt", 16 + 2)
FileWrite($binfile, $UTF8_single_r_hacek) ; WRONG 0x720000 in binary file (ANSI/ASCII r)
FileClose($binfile)

$double_r_hacek = 0x99C599C5
$UTF8_double_r_hacek = BinaryToString($double_r_hacek, 4) ; convert to UTF-8 string
ConsoleWrite(StringLen($UTF8_double_r_hacek)) ; gives correct output 2
MsgBox(0, "", $UTF8_double_r_hacek) ; correct řř output
FileWrite("double_test.txt", $UTF8_double_r_hacek) ; correct 0xC599C599 in file
$binfile = FileOpen("double_test_bin.txt", 16 + 2)
FileWrite($binfile, $UTF8_double_r_hacek) ; WRONG 0x7272 in binary file
FileClose($binfile)

comment:2 by J-Paul Mesnage, 4 years ago

Resolution: → No Bug
Status: new → closed

Hi,
your script is wrong,
as $single_r_hacek is not a "Binary" when used in BinaryToString().
At the least it must be StringToBinary(ChrW(0x99C5)).

After that, the writes to file have to be changed too.

If you try a triple with your script, you get a wrong result too.
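
As a hedged illustration of the file-writing point: a handle opened in binary mode is probably best given Binary data rather than a string, so the string is encoded back to UTF-8 with StringToBinary() before the write. This is a minimal sketch, not necessarily what the commenter had in mind; it assumes the Binary("0xC599") form as input and reuses the ticket's file name.

```autoit
; Sketch only: write the UTF-8 bytes of "ř" through a binary-mode handle.
Local $sText = BinaryToString(Binary("0xC599"), 4)      ; decode the two UTF-8 bytes into a string

Local $hFile = FileOpen("single_test_bin.txt", 16 + 2)  ; 16 = binary mode, 2 = overwrite
If $hFile = -1 Then Exit 1                               ; FileOpen failed

FileWrite($hFile, StringToBinary($sText, 4))             ; encode the string to UTF-8 bytes, then write them
FileClose($hFile)
```

For a text file, opening with FileOpen("single_test.txt", 2 + 256) (overwrite, UTF-8 without BOM) and writing the string directly should be the equivalent route.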

comment:3 by jchd18, 4 years ago

You were trying to feed UTF-8 hex to the wrong function(s).

; CW() is a Unicode-aware ConsoleWrite()
Local $single_r_hacek = ChrW(0x159) ; UTF16-LE code ř U+0159 LATIN SMALL LETTER R WITH CARON, NOT in ANSI
CW($single_r_hacek & @TAB & StringLen($single_r_hacek))
MsgBox(0, "", $single_r_hacek)

; making it simpler
$single_r_hacek = 'ř'
CW($single_r_hacek & @TAB & StringLen($single_r_hacek))
MsgBox(0, "", $single_r_hacek)

comment:4 by anonymous, 4 years ago

Thanks,

I expected

$single_r_hacek = 0x99C5

to return a Binary, but Binary() is required to do so:

$single_r_hacek = 0x99C5
ConsoleWrite(VarGetType($single_r_hacek)) ; shows Int32: 4 bytes (C5 99 00 00), the last two being zero padding,

hence StringLen returned 3.

$single_r_hacek = Binary("0xC599")
ConsoleWrite(VarGetType($single_r_hacek)) ; shows Binary

$UTF8_single_r_hacek = BinaryToString($single_r_hacek, 4)
ConsoleWrite(StringLen($UTF8_single_r_hacek)) ; correct output 1
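
The difference between the two forms can be made visible by printing the bytes behind each value. The following is a minimal sketch using only built-in functions; the variable names are illustrative, and the lengths 3 and 1 noted in the comments are the values reported in this ticket.

```autoit
; Sketch only: compare an integer literal with a true Binary value.
Local $iLiteral = 0x99C5               ; a hex literal is an integer, not Binary
Local $dFromInt = Binary($iLiteral)    ; the integer's 4 bytes, low byte first
Local $dUtf8 = Binary("0xC599")        ; exactly the two UTF-8 bytes of U+0159

ConsoleWrite(VarGetType($iLiteral) & @CRLF)   ; Int32, as shown above
ConsoleWrite(String($dFromInt) & " has " & BinaryLen($dFromInt) & " bytes" & @CRLF)
ConsoleWrite(String($dUtf8) & " has " & BinaryLen($dUtf8) & " bytes" & @CRLF)

; Decoding as UTF-8 (flag 4): the padded form keeps its two trailing zero bytes.
ConsoleWrite(StringLen(BinaryToString($dFromInt, 4)) & @CRLF)   ; 3 in the ticket
ConsoleWrite(StringLen(BinaryToString($dUtf8, 4)) & @CRLF)      ; 1 in the ticket
```

The same padding explains why the double code 0x99C599C5 appeared to work: it fills all four bytes of the Int32, so no zero bytes are left over.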
