Arilvv Posted February 17, 2006

There are four functions:

    Asc2Unicode($AscString)
    Unicode2Asc($UnicodeString)
    Unicode2Utf8($UnicodeString)
    Utf82Unicode($Utf8String)

Note: $AscString and $Utf8String are normal strings, and $UnicodeString should be a binary string.

Tested under a Traditional Chinese (Taiwan) environment, where it worked fine. I hope it supports all language systems.

    Func Asc2Unicode($AscString)
        Local $BufferSize = StringLen($AscString) * 2
        Local $Buffer = DllStructCreate("byte[" & $BufferSize & "]")
        Local $Return = DllCall("Kernel32.dll", "int", "MultiByteToWideChar", _
            "int", 0, _
            "int", 0, _
            "str", $AscString, _
            "int", StringLen($AscString), _
            "ptr", DllStructGetPtr($Buffer), _
            "int", $BufferSize)
        Local $UnicodeString = StringLeft(DllStructGetData($Buffer, 1), $Return[0] * 2)
        $Buffer = 0
        Return $UnicodeString
    EndFunc

    Func Unicode2Asc($UniString)
        If Not IsBinaryString($UniString) Then
            SetError(1)
            Return $UniString
        EndIf
        Local $BufferLen = StringLen($UniString)
        Local $Input = DllStructCreate("byte[" & $BufferLen & "]")
        Local $Output = DllStructCreate("char[" & $BufferLen & "]")
        DllStructSetData($Input, 1, $UniString)
        Local $Return = DllCall("kernel32.dll", "int", "WideCharToMultiByte", _
            "int", 0, _
            "int", 0, _
            "ptr", DllStructGetPtr($Input), _
            "int", $BufferLen / 2, _
            "ptr", DllStructGetPtr($Output), _
            "int", $BufferLen, _
            "int", 0, _
            "int", 0)
        Local $AscString = DllStructGetData($Output, 1)
        $Output = 0
        $Input = 0
        Return $AscString
    EndFunc

    Func Unicode2Utf8($UniString)
        If Not IsBinaryString($UniString) Then
            SetError(1)
            Return $UniString
        EndIf
        Local $UniStringLen = StringLen($UniString)
        Local $BufferLen = $UniStringLen * 2
        Local $Input = DllStructCreate("byte[" & $BufferLen & "]")
        Local $Output = DllStructCreate("char[" & $BufferLen & "]")
        DllStructSetData($Input, 1, $UniString)
        Local $Return = DllCall("kernel32.dll", "int", "WideCharToMultiByte", _
            "int", 65001, _
            "int", 0, _
            "ptr", DllStructGetPtr($Input), _
            "int", $UniStringLen / 2, _
            "ptr", DllStructGetPtr($Output), _
            "int", $BufferLen, _
            "int", 0, _
            "int", 0)
        Local $Utf8String = DllStructGetData($Output, 1)
        $Output = 0
        $Input = 0
        Return $Utf8String
    EndFunc

    Func Utf82Unicode($Utf8String)
        Local $BufferSize = StringLen($Utf8String) * 2
        Local $Buffer = DllStructCreate("byte[" & $BufferSize & "]")
        Local $Return = DllCall("Kernel32.dll", "int", "MultiByteToWideChar", _
            "int", 65001, _
            "int", 0, _
            "str", $Utf8String, _
            "int", StringLen($Utf8String), _
            "ptr", DllStructGetPtr($Buffer), _
            "int", $BufferSize)
        Local $UnicodeString = StringLeft(DllStructGetData($Buffer, 1), $Return[0] * 2)
        $Buffer = 0
        Return $UnicodeString
    EndFunc

SmOke_N (Moderator) Posted February 17, 2006

I can see this being totally handy... Nice work!!

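For readers on a current AutoIt release, the four conversions from the first post can probably be sketched with the built-in StringToBinary and BinaryToString functions instead of DllCall. This is only an illustration, not the code posted above: it assumes an AutoIt version whose strings are Unicode internally, and the variable names are made up for the example.

```autoit
; Encoding flags for StringToBinary / BinaryToString:
; 1 = ANSI, 2 = UTF-16 LE, 3 = UTF-16 BE, 4 = UTF-8
Local $sText = "Hello"

; String -> binary in a chosen encoding (roughly Asc2Unicode / Unicode2Utf8)
Local $bUtf16 = StringToBinary($sText, 2)
Local $bUtf8 = StringToBinary($sText, 4)

ConsoleWrite(String($bUtf16) & @CRLF) ; 0x480065006C006C006F00
ConsoleWrite(String($bUtf8) & @CRLF)  ; 0x48656C6C6F

; Binary -> string (roughly Unicode2Asc / Utf82Unicode)
ConsoleWrite(BinaryToString($bUtf16, 2) & @CRLF) ; Hello
ConsoleWrite(BinaryToString($bUtf8, 4) & @CRLF)  ; Hello
```

String() is used only to print the binary values as hex text; the binaries themselves can be written to a file opened in binary mode.
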
Fossil Rock Posted April 15, 2006

Can you post a simple example that reads a line of Unicode from a text file and outputs the text as ASCII? I have a feeling this won't accomplish what I would like, because AU3 doesn't read Unicode.

Lazycat Posted April 17, 2006

Good work! Almost all of the functions work fine for me except Asc2Unicode: it gives a Unicode stream, but without a BOM, and some programs do not understand such a stream. I changed this function a bit and added the ability to put a BOM at the beginning of the stream.

    Func Asc2Unicode($AscString, $addBOM = false)
        Local $BufferSize = StringLen($AscString) * 2
        Local $FullUniStr = DllStructCreate("byte[" & $BufferSize + 2 & "]")
        Local $Buffer = DllStructCreate("byte[" & $BufferSize & "]", DllStructGetPtr($FullUniStr) + 2)
        Local $Return = DllCall("Kernel32.dll", "int", "MultiByteToWideChar", _
            "int", 0, _
            "int", 0, _
            "str", $AscString, _
            "int", StringLen($AscString), _
            "ptr", DllStructGetPtr($Buffer, 1), _
            "int", $BufferSize)
        DllStructSetData($FullUniStr, 1, 0xFF, 1)
        DllStructSetData($FullUniStr, 1, 0xFE, 2)
        If $addBOM then
            Return DllStructGetData($FullUniStr, 1)
        Else
            Return DllStructGetData($Buffer, 1)
        Endif
    EndFunc

Don't dig too deep into it; maybe this can be done more simply.

NightGaunt Posted April 17, 2006

I JUST had a problem that this is the solution to. I also have Eastern language support and occasional use of double-byte characters (some Japanese customers). I'll let you know if I run into any issues after I rework my code. Nice work.

Fossil Rock Posted April 18, 2006

Is this right? I get nothing back.

    $File1 = FileOpen("C:\Unicode.txt", 0)
    $Unicode = FileReadLine($File1, 1)
    Unicode2Asc($Unicode)
    FileClose ($File1)
    MsgBox(0,"",$AscString)

File used is attached: Unicode.txt

Lazycat Posted April 18, 2006

"Is this right? I get nothing back."

Yes, this is right. This is Unicode text, which AutoIt can't read natively, so you should read it as binary.

    $File1 = FileOpen("C:\Unicode.txt", 4) ; 4 - raw read mode
    $Unicode = FileRead($File1, FileGetSize("C:\Unicode.txt"))
    $AscString = Unicode2Asc($Unicode)
    FileClose ($File1)
    MsgBox(0,"", $AscString)

Fossil Rock Posted April 18, 2006

Cool, that works for me, with two small problems: the first character shown is not part of the text file, and I need it to read only one line at a time, but it reads the whole file. Neither problem should be too difficult to correct.

Anyone needing to convert UNICODE to ASCII / ANSI: THIS WORKS!!!

I don't have a need for the other functions at this time, but I'm sure they work just as well.

Thanks for your efforts, they're greatly appreciated.

Fossil Rock Posted April 18, 2006

Well, I guess it's harder than I first thought. Is there a way to read a line at a time when using the binary mode?

Lazycat Posted April 19, 2006

"Well I guess it's harder than I first thought. Is there a way to read a line at a time when using the binary mode?"

Not with a simple read, of course. But it's possible to write a UDF that scans the file for line ends (0D 00 0A 00 in Unicode) and reads only the data before them.

If the file is not too big, though, it's simpler IMO to convert it to ANSI before use.

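The "convert first, then split" idea can be sketched as follows. This is a minimal illustration that assumes a recent AutoIt release with BinaryToString, a UTF-16 LE input file, and CR LF line ends; the path is the one used in the posts above and the variable names are invented for the example. It also drops a leading byte order mark, which is most likely the stray first character mentioned earlier.

```autoit
; Path taken from the earlier posts; any UTF-16 LE text file will do.
Local $sPath = "C:\Unicode.txt"

Local $hFile = FileOpen($sPath, 16) ; 16 = binary (raw) read mode
If $hFile = -1 Then
    MsgBox(0, "Error", "Unable to open " & $sPath)
    Exit
EndIf
Local $bData = FileRead($hFile) ; no count given: read the whole file
FileClose($hFile)

; Decode the raw bytes as UTF-16 LE (flag 2).
Local $sText = BinaryToString($bData, 2)

; A leading U+FEFF is the byte order mark, not part of the text.
If AscW($sText) = 0xFEFF Then $sText = StringTrimLeft($sText, 1)

; Split on CR LF; flag 1 treats @CRLF as one whole delimiter.
Local $aLines = StringSplit($sText, @CRLF, 1)
For $i = 1 To $aLines[0]
    ConsoleWrite($i & ": " & $aLines[$i] & @CRLF)
Next
```

Splitting after decoding avoids having to search the byte stream for 0D 00 0A 00 by hand, at the cost of holding the whole file in memory.
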
dabus Posted August 23, 2006

This is awesome. I have to do a data migration between two domains, and things should run automatically. subinacl, your (close-to-)UDF and some tweaks made things a lot easier.

SunDog Posted January 18, 2007

Well done

zhao Posted May 19, 2007

Is there any way to convert UTF-8 to GB2312?

AngelSL Posted June 7, 2007

Chinese text gives me '0xB2'. I'm using a Chinese IME, Chinese (PRC). Here's the code:

    #Compiler_Res_Fileversion = 1.00
    #Compiler_Res_LegalCopyright = NOT FOR RELEASE
    #Compiler_Res_Comment = NOT FOR RELEASE
    #Compiler_Res_Description = NOT FOR RELEASE
    #Compiler_Allow_Decompile = n
    #include <GUIConstants.au3>
    #include <_Encoding.au3>
    #include <String.au3>

    $Do = GUICreate("What To Do?", 335, 66, 193, 115)
    $To = GUICtrlCreateLabel("Now, do:", 8, 10, 55, 17)
    $Thing = GUICtrlCreateInput("", 64, 8, 265, 21, 0x800)
    GUICtrlSetBkColor($Thing, 0xFFFFFF)
    $Another = GUICtrlCreateButton("Another thing to do, please!", 8, 32, 321, 31, 0)
    _WhatToDoToday()
    GUISetState(@SW_SHOW)

    While 1
        $nMsg = GUIGetMsg()
        Switch $nMsg
            Case $GUI_EVENT_CLOSE
                Exit
            Case $Another
                _WhatToDoToday()
        EndSwitch
    WEnd

    Func _WhatToDoToday()
        $ran = Random(1, 500, 1)
        $1 = Utf82Unicode("ÎÒ²»ÖªµÀ")
        Switch $ran
            Case 1 to 9
                $TTDT = $1
            Case 10 to 49
                $TTDT = $1
            Case 50 to 99
                $TTDT = $1
            Case 100 to 200
                $TTDT = $1
            Case 201 to 300
                $TTDT = "Orbis PQ, foruming, chatting and AutoIt!"
            Case 301 to 400
                $TTDT = "PQ"
            Case 401 to 499
                $TTDT = "Kerning PQ"
            Case Else
                $TTDT = "Whatever you like!"
        EndSwitch
        GUICtrlSetData($Thing, $TTDT)
    EndFunc

lsakizada Posted August 6, 2007

Why is this little command-line tool not working? The UTF-8 file does not display the characters correctly. Any idea what's wrong?

To test it, run it from the command line with the following syntax:

    U2UTF8.exe 'Path to Source Unicode File' 'Path to Destination UTF-8 File'

You need to compile it first.

    if $CmdLine[0] <> 2 then
        MsgBox(0,0,"Uses: U2UTF8 'Path to Source Unicode File' 'Path to Destination UTF-8 File' ")
        exit
    EndIf

    Dim $UnicodeFile = $CmdLine[1]
    Dim $UTF8FILE = $CmdLine[2]

    $File1 = FileOpen($UnicodeFile, 16) ; 16 - binary (raw) read mode
    $Unicode = FileRead($File1, FileGetSize($UnicodeFile))
    $UTF8String = Unicode2Utf8($Unicode)
    FileClose ($File1)
    ;MsgBox(0,"", $UTF8FILE)
    ;MsgBox(0,"", $UTF8FILE)

    $file = FileOpen($UTF8FILE,128+2)
    If $file = -1 Then
        MsgBox(0, "Error", "Unable to open file.")
        ; Exit
    EndIf
    FileWrite($file, $UTF8String)
    FileClose ($file)

    Func Unicode2Utf8($UniString)
        If Not IsBinary($UniString) Then
            SetError(1)
            MsgBox(0,0,"not binary")
            Return $UniString
        EndIf
        Local $UniStringLen = StringLen($UniString)
        Local $BufferLen = $UniStringLen * 2
        Local $Input = DllStructCreate("byte[" & $BufferLen & "]")
        Local $Output = DllStructCreate("char[" & $BufferLen & "]")
        DllStructSetData($Input, 1, $UniString)
        Local $Return = DllCall("kernel32.dll", "int", "WideCharToMultiByte", _
            "int", 65001, _
            "int", 0, _
            "ptr", DllStructGetPtr($Input), _
            "int", $UniStringLen / 2, _
            "ptr", DllStructGetPtr($Output), _
            "int", $BufferLen, _
            "int", 0, _
            "int", 0)
        Local $Utf8String = DllStructGetData($Output, 1)
        $Output = 0
        $Input = 0
        Return $Utf8String
    EndFunc

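One possible cause, offered only as a guess: the script converts the text to UTF-8 bytes itself and then writes them to a file opened with the UTF-8 flag (128), so the data may be encoded a second time on the way out. A sketch that leaves the encoding to FileWrite could look like this. It assumes a recent AutoIt release, a UTF-16 LE source file, and a hypothetical helper name, _Utf16FileToUtf8File, that is not part of the original UDF.

```autoit
; Hypothetical helper, not part of the UDF from the first post.
Func _Utf16FileToUtf8File($sSrc, $sDst)
    Local $hIn = FileOpen($sSrc, 16) ; 16 = binary (raw) read mode
    If $hIn = -1 Then Return SetError(1, 0, False)
    Local $bData = FileRead($hIn)
    FileClose($hIn)

    ; Decode the bytes as UTF-16 LE and drop a leading byte order mark.
    Local $sText = BinaryToString($bData, 2)
    If AscW($sText) = 0xFEFF Then $sText = StringTrimLeft($sText, 1)

    Local $hOut = FileOpen($sDst, 2 + 128) ; 2 = overwrite, 128 = UTF-8 with BOM
    If $hOut = -1 Then Return SetError(2, 0, False)
    FileWrite($hOut, $sText) ; the string is encoded as UTF-8 when written
    FileClose($hOut)
    Return True
EndFunc

; Usage from the command line: U2UTF8.exe source.txt destination.txt
If $CmdLine[0] = 2 Then _Utf16FileToUtf8File($CmdLine[1], $CmdLine[2])
```

The point of the design is that the text is decoded exactly once and encoded exactly once, with the file mode deciding the output encoding.
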
Dhilip89 Posted October 11, 2007

Sorry for bumping an old topic, but I think something is wrong in the code.

    Func Utf82Unicode($Utf8String)
        Local $BufferSize = StringLen($Utf8String) * 2
        Local $Buffer = DllStructCreate("byte[" & $BufferSize & "]")
        Local $Return = DllCall("Kernel32.dll", "int", "MultiByteToWideChar", _
            "int", 65001, _
            "int", 0, _
            "str", $Utf8String, _
            "int", StringLen($Utf8String), _
            "ptr", DllStructGetPtr($Buffer), _
            "int", $BufferSize)
        Local $UnicodeString = StringLeft(DllStructGetData($Buffer, 1), $Return[0] * 2)
        $Buffer = 0
        Return $UnicodeString
    EndFunc

    Func Utf82Unicode($Utf8String)
        Local $BufferSize = StringLen($Utf8String) * 2
        Local $Buffer = DllStructCreate("byte[" & $BufferSize & "]")
        Local $Return = DllCall("Kernel32.dll", "int", "MultiByteToWideChar", _
            "int", 65001, _
            "int", 0, _
            "str", $Utf8String, _
            "int", StringLen($Utf8String), _
            "ptr", DllStructGetPtr($Buffer), _
            "int", $BufferSize)
        Local $UnicodeString = DllStructGetData($Buffer, 1)
        $Buffer = 0
        Return $UnicodeString
    EndFunc

Luigi Posted August 28, 2013

Hi, I'm looking for code to convert a word or a character to Unicode in the \u00xx format, as JSON uses. This topic helped me with this, and I rewrote the function as shown below.

Question: is this the best way to do it? I can't understand why the original script returns 0x6100 and not 0x0061.

What I want is code that takes any character as input and returns the Unicode format, like \u0001, \u0002, \u003f and so on.

Thanks

    ; http://www.autoitscript.com/forum/topic/21815-functions-for-ascii-unicode-and-utf8-encoding/#entry174115
    Local $character = 'a'
    ConsoleWrite(Asc2Unicode($character) & @LF) ; print \u0061

    Func Asc2Unicode($input)
        If StringLen($input) <> 1 Then Return SetError(-1, -1, -1)
        ; 2 bytes for the BOM + 2 bytes for one UTF-16 code unit
        Local $FullUniStr = DllStructCreate('byte[4]')
        Local $Buffer = DllStructCreate('byte[2]', DllStructGetPtr($FullUniStr) + 2)
        Local $Return = DllCall('Kernel32.dll', 'int', 'MultiByteToWideChar', _
            'int', 0, _
            'int', 0, _
            'str', $input, _
            'int', StringLen($input), _
            'ptr', DllStructGetPtr($Buffer, 1), _
            'int', 2)
        DllStructSetData($FullUniStr, 1, 0xFF, 1)
        DllStructSetData($FullUniStr, 1, 0xFE, 2)
        Local $temp = DllStructGetData($Buffer, 1)
        Return '\u' & StringMid($temp, 5, 2) & StringMid($temp, 3, 2)
    EndFunc ;==>Asc2Unicode

jchd Posted August 28, 2013

Necroing a 6-7 year old thread in Examples is not the best way to get help. The Help forum exists for a reason.

Furthermore, you rely on an old version and on code that shouldn't have remained here, as it offers an ugly workaround to non-existent issues.

AutoIt strings use UCS-2 encoding, a subset of UTF-16 LE Unicode limited to one 16-bit word per character. AscW("x") returns the value of the Unicode code point passed. Just concatenate "\u" with the Hex representation of that value and you're done. Search the help file if needed to complete the homework.

Luigi Posted August 29, 2013

Sorry for the mistake, this will not be repeated.

ptrex Posted August 29, 2013

@detefon

The Unicode representation of the letter "a" is U+0061, or \u0061 in JSON style.

http://www.utf8-chartable.de/

    Unicode code point   character   UTF-8 (hex.)   name
    U+0061               a           61             LATIN SMALL LETTER A

@jchd

Did you mean?

    ConsoleWrite("--- \u" & Hex(AscW("a"),4) & @CRLF)

Rgds
ptrex

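Two points from the last few posts can be put together in a short sketch. First, the 0x6100 result seen earlier is consistent with UTF-16 LE byte order, where the low byte of each 16-bit unit is stored first. Second, the AscW and Hex approach extends naturally to a whole string. The helper name _ToUnicodeEscapes is hypothetical, and the sketch assumes a recent AutoIt release with StringToBinary.

```autoit
; UTF-16 LE stores the low byte first, so "a" (U+0061) becomes the bytes 61 00.
ConsoleWrite(String(StringToBinary("a", 2)) & @CRLF) ; 0x6100 (flag 2 = UTF-16 LE)
ConsoleWrite(String(StringToBinary("a", 3)) & @CRLF) ; 0x0061 (flag 3 = UTF-16 BE)

; Hypothetical helper: escape every character of a string as \uXXXX.
Func _ToUnicodeEscapes($sText)
    Local $sOut = ""
    For $i = 1 To StringLen($sText)
        ; AscW gives the code unit value, Hex(..., 4) pads it to four digits.
        $sOut &= "\u" & Hex(AscW(StringMid($sText, $i, 1)), 4)
    Next
    Return $sOut
EndFunc

ConsoleWrite(_ToUnicodeEscapes("ab") & @CRLF) ; \u0061\u0062
```

Because AscW works on the string itself, no byte swapping or DllCall is needed for this task.
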