Opened 3 years ago

Closed 3 years ago

Last modified 3 years ago

#3896 closed Bug (No Bug)

Single non-ANSI character failure in BinaryToString

Reported by: anonymous
Owned by:
Milestone:
Component: AutoIt
Version: 3.3.16.0
Severity: None
Keywords:
Cc:

Description

Single character failure in BinaryToString.
BinaryToString appends an extra 0x0000 when converting a single non-ANSI character code as UTF-8.
A longer character code is converted properly.

Demo script included.

Change History (7)

comment:1 Changed 3 years ago by anonymous

#cs ----------------------------------------------------------------------------

AutoIt Version: 3.3.16.0
Author: myName

Script Function:

Template AutoIt script.

#ce ----------------------------------------------------------------------------

; Script Start - Add your code below here

; Bug in BinaryToString with single non-ANSI character
$single_r_hacek = 0x99C5 ; LE UTF-code C599 for ř U+0159 LATIN SMALL LETTER R WITH CARON, NOT in ANSI
$UTF8_single_r_hacek = BinaryToString($single_r_hacek, 4) ; convert to UTF-8 string
ConsoleWrite(StringLen($UTF8_single_r_hacek)) ; gives WRONG output 3 instead of 1
MsgBox(0, "", $UTF8_single_r_hacek) ; correct ř output
FileWrite("single_test.txt", $UTF8_single_r_hacek) ; WRONG 0xC5990000 in file
$binfile = FileOpen("single_test_bin.txt", 16 + 2)
FileWrite($binfile, $UTF8_single_r_hacek) ; WRONG 0x720000 in binary file (ANSI/ASCII r)
FileClose($binfile)

$double_r_hacek = 0x99C599C5
$UTF8_double_r_hacek = BinaryToString($double_r_hacek, 4) ; convert to UTF-8 string
ConsoleWrite(StringLen($UTF8_double_r_hacek)) ; gives correct output 2
MsgBox(0, "", $UTF8_double_r_hacek) ; correct řř output
FileWrite("double_test.txt", $UTF8_double_r_hacek) ; correct 0xC599C599 in file
$binfile = FileOpen("double_test_bin.txt", 16 + 2)
FileWrite($binfile, $UTF8_double_r_hacek) ; WRONG 0x7272 in binary file
FileClose($binfile)

comment:2 Changed 3 years ago by Jpm

  • Resolution set to No Bug
  • Status changed from new to closed

Hi,
your script is wrong:
$single_r_hacek is not a "Binary" when used in BinaryToString().
At the least it must be StringToBinary(ChrW(0x99C5)).

After that, the writes to file have to be changed too.

If you try a triple with your script, you get a wrong result too.

comment:3 Changed 3 years ago by jchd18

You were trying to feed UTF-8 hex to the wrong function(s).

; CW() is a Unicode-aware ConsoleWrite()
Local $single_r_hacek = ChrW(0x159) ; UTF16-LE code ř U+0159 LATIN SMALL LETTER R WITH CARON, NOT in ANSI
CW($single_r_hacek & @TAB & StringLen($single_r_hacek))
MsgBox(0, "", $single_r_hacek)

; making it simpler
$single_r_hacek = 'ř'
CW($single_r_hacek & @TAB & StringLen($single_r_hacek))
MsgBox(0, "", $single_r_hacek)
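
The snippet above calls CW() without showing its definition. One possible definition is sketched here; it is an assumption, not the commenter's actual helper, and it presumes the console that receives the output (for example an editor's output pane) is set to interpret UTF-8.

```autoit
; Hypothetical helper: the ticket does not show the real CW().
Func CW($sText)
    ; Encode the string as UTF-8 bytes (flag 4), then reinterpret those
    ; bytes as ANSI text (flag 1) so ConsoleWrite emits them byte for byte.
    ConsoleWrite(BinaryToString(StringToBinary($sText & @LF, 4), 1))
EndFunc   ;==>CW
```

With a definition like this in the script, the two CW() calls above would run as written.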

comment:4 Changed 3 years ago by anonymous

Thanks,

I expected

$single_r_hacek = 0x99C5

to return a Binary, but Binary() is required to do so:

$single_r_hacek = 0x99C5
ConsoleWrite(VarGetType($single_r_hacek)) ; shows Int32: 4 bytes, i.e. C5 99 followed by two 00 bytes,

hence StringLen returned 3.

$single_r_hacek = Binary("0xC599")
ConsoleWrite(VarGetType($single_r_hacek)) ; shows Binary

$UTF8_single_r_hacek = BinaryToString($single_r_hacek, 4)
ConsoleWrite(StringLen($UTF8_single_r_hacek)) ; correct output 1
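
The difference between the two assignments can be inspected directly. The following is a minimal sketch, assuming that Binary() applied to an integer exposes the integer's in-memory bytes, which is consistent with the 0xC5990000 file content reported in comment:1. The variable names are illustrative only.

```autoit
; Sketch only: compares the integer literal with an explicit Binary value.
Local $iLiteral = 0x99C5              ; hex literal: an integer, not a Binary
Local $dExplicit = Binary("0xC599")   ; explicit Binary: the two bytes C5 99

; Report the variant type of each value.
ConsoleWrite(VarGetType($iLiteral) & @CRLF)
ConsoleWrite(VarGetType($dExplicit) & @CRLF)

; Convert the integer to Binary to see how many bytes it carries.
Local $dFromInt = Binary($iLiteral)
ConsoleWrite(BinaryLen($dFromInt) & " bytes: " & String($dFromInt) & @CRLF)
ConsoleWrite(BinaryLen($dExplicit) & " bytes: " & String($dExplicit) & @CRLF)

; Decode both as UTF-8 (flag 4) and compare the resulting string lengths.
ConsoleWrite(StringLen(BinaryToString($dFromInt, 4)) & @CRLF)
ConsoleWrite(StringLen(BinaryToString($dExplicit, 4)) & @CRLF)
```

If the assumption holds, the extra length in the first case comes from the padding bytes of the integer rather than from BinaryToString itself.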
