Myicq (posted October 27, 2011, edited):

I need to develop an app that talks to a non-Unicode application. This (graphics) application can set the font encoding on different objects, so my task is basically:

1. Read a string in Excel 2003 or 2007. This string will be Unicode.
2. Convert each character in the string to ANSI (Win12xx).
3. Either save the converted string to a text file or send the string by Ethernet socket.

To give an example of point 2: take the string [Москва]. In Unicode this would be 041C 043E 0441 043A 0432 0430 (PROBABLY encoded as 1C 04 3E 04 etc., since Excel uses little-endian). This can be converted to Win1251 according to http://en.wikipedia.org/wiki/Windows-1251. The end result would be CC EE F1 EA E2 E0.

My question is: has someone written, or does someone know of, a UDF to do this job (feed a string of LE bytes in, get a string of ANSI bytes out), or is there a native Windows function? If someone could post a small example starting from $instring = Москва I would be really happy. Thanks for any help.

Attached is a ZIP with two text files: Unicode BOM is the source format, Win1251 the destination format. moskva.zip

I am just a hobby programmer, and nothing great to publish right now.
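Editor's sketch: the guess about the byte order in point 2 can be checked directly in AutoIt. The snippet below is only an illustration, not part of the original post. It builds the string with ChrW() so the result does not depend on how the script file is saved, and the variable names are made up for the example.

```autoit
; "Москва" built from its code units, so the script file's own encoding does not matter.
Local $sText = ChrW(0x041C) & ChrW(0x043E) & ChrW(0x0441) & ChrW(0x043A) & ChrW(0x0432) & ChrW(0x0430)

; Flag 2 = UTF-16 LE: the bytes of the string as AutoIt holds it natively.
ConsoleWrite("UTF-16 LE bytes: " & String(StringToBinary($sText, 2)) & @CRLF) ; 0x1C043E0441043A0432043004

; One code unit per character, printed as U+xxxx.
Local $aCodes = StringToASCIIArray($sText)
For $i = 0 To UBound($aCodes) - 1
    ConsoleWrite("U+" & Hex($aCodes[$i], 4) & " ")
Next
ConsoleWrite(@CRLF)
```

The low byte of each code unit comes first (1C 04, 3E 04, ...), which matches the little-endian layout assumed in the question.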
guinness (posted October 27, 2011, edited):

    Func _UnicodeToANSI($sString)
        #cs
            Local Const $SF_ANSI = 1
            Local Const $SF_UTF16_LE = 2
            Local Const $SF_UTF16_BE = 3
            Local Const $SF_UTF8 = 4
        #ce
        Local Const $SF_ANSI = 1, $SF_UTF8 = 4
        Return BinaryToString(StringToBinary($sString, $SF_UTF8), $SF_ANSI)
    EndFunc   ;==>_UnicodeToANSI

UDF List: _AdapterConnections() • _AlwaysRun() • _AppMon() • _AppMonEx() • _ArrayFilter/_ArrayReduce • _BinaryBin() • _CheckMsgBox() • _CmdLineRaw() • _ContextMenu() • _ConvertLHWebColor()/_ConvertSHWebColor() • _DesktopDimensions() • _DisplayPassword() • _DotNet_Load()/_DotNet_Unload() • _Fibonacci() • _FileCompare() • _FileCompareContents() • _FileNameByHandle() • _FilePrefix/SRE() • _FindInFile() • _GetBackgroundColor()/_SetBackgroundColor() • _GetConrolID() • _GetCtrlClass() • _GetDirectoryFormat() • _GetDriveMediaType() • _GetFilename()/_GetFilenameExt() • _GetHardwareID() • _GetIP() • _GetIP_Country() • _GetOSLanguage() • _GetSavedSource() • _GetStringSize() • _GetSystemPaths() • _GetURLImage() • _GIFImage() • _GoogleWeather() • _GUICtrlCreateGroup() • _GUICtrlListBox_CreateArray() • _GUICtrlListView_CreateArray() • _GUICtrlListView_SaveCSV() • _GUICtrlListView_SaveHTML() • _GUICtrlListView_SaveTxt() • _GUICtrlListView_SaveXML() • _GUICtrlMenu_Recent() • _GUICtrlMenu_SetItemImage() • _GUICtrlTreeView_CreateArray() • _GUIDisable() • _GUIImageList_SetIconFromHandle() • _GUIRegisterMsg() • _GUISetIcon() • _Icon_Clear()/_Icon_Set() • _IdleTime() • _InetGet() • _InetGetGUI() • _InetGetProgress() • _IPDetails() • _IsFileOlder() • _IsGUID() • _IsHex() • _IsPalindrome() • _IsRegKey() • _IsStringRegExp() • _IsSystemDrive() • _IsUPX() • _IsValidType() • _IsWebColor() • _Language() • _Log() • _MicrosoftInternetConnectivity() • _MSDNDataType() • _PathFull/GetRelative/Split() • _PathSplitEx() • _PrintFromArray() • _ProgressSetMarquee() • _ReDim() • _RockPaperScissors()/_RockPaperScissorsLizardSpock() • _ScrollingCredits • _SelfDelete() • _SelfRename() • _SelfUpdate() • _SendTo() • _ShellAll() • _ShellFile() • _ShellFolder() • _SingletonHWID() • _SingletonPID() • _Startup() • _StringCompact() • _StringIsValid() • _StringRegExpMetaCharacters() • _StringReplaceWholeWord() • _StringStripChars() • _Temperature() • _TrialPeriod() • _UKToUSDate()/_USToUKDate() • _WinAPI_Create_CTL_CODE() • _WinAPI_CreateGUID() • _WMIDateStringToDate()/_DateToWMIDateString() • Au3 script parsing • AutoIt Search • AutoIt3 Portable • AutoIt3WrapperToPragma • AutoItWinGetTitle()/AutoItWinSetTitle() • Coding • DirToHTML5 • FileInstallr • FileReadLastChars() • GeoIP database • GUI - Only Close Button • GUI Examples • GUICtrlDeleteImage() • GUICtrlGetBkColor() • GUICtrlGetStyle() • GUIEvents • GUIGetBkColor() • Int_Parse() & Int_TryParse() • IsISBN() • LockFile() • Mapping CtrlIDs • OOP in AutoIt • ParseHeadersToSciTE() • PasswordValid • PasteBin • Posts Per Day • PreExpand • Protect Globals • Queue() • Resource Update • ResourcesEx • SciTE Jump • Settings INI • SHELLHOOK • Shunting-Yard • Signature Creator • Stack() • Stopwatch() • StringAddLF()/StringStripLF() • StringEOLToCRLF() • VSCROLL • WM_COPYDATA • More Examples... Updated: 22/04/2018
jchd (posted October 27, 2011):

Presumably, the Excel string was obtained via COM, so it's a native (~UTF-16 LE) string, not UTF-8. Even if it comes from a file, it's probably UTF-8 in the file, but by the time it's read by AutoIt it should become a native AutoIt (~UTF-16 LE) string.

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here.
RegExp tutorial: enough to get started.
PCRE v8.33 regexp documentation: latest available release and currently implemented in AutoIt beta.
SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well).
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt).
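Editor's sketch: the point about UTF-8 versus the native string can be made visible by looking at what the two steps inside _UnicodeToANSI() do to the bytes. This is an illustration under the assumption that the input is the native AutoIt string "Москва"; the variable names are invented for the example.

```autoit
; "Москва" built from its code units, so the script file's own encoding does not matter.
Local $sText = ChrW(0x041C) & ChrW(0x043E) & ChrW(0x0441) & ChrW(0x043A) & ChrW(0x0432) & ChrW(0x0430)

; Flag 4 = UTF-8 (the $SF_UTF8 value used in _UnicodeToANSI): two bytes per Cyrillic letter.
Local $dUtf8 = StringToBinary($sText, 4)
ConsoleWrite("UTF-8 bytes: " & String($dUtf8) & @CRLF) ; 0xD09CD0BED181D0BAD0B2D0B0

; Flag 1 = ANSI: the UTF-8 bytes are read back using the system codepage,
; so no Unicode-to-Win1251 mapping is ever applied.
Local $sResult = BinaryToString($dUtf8, 1)
ConsoleWrite("Characters after reinterpretation: " & StringLen($sResult) & @CRLF)
```

None of these bytes match the CC EE F1 EA E2 E0 sequence asked for in the first post, which suggests that a real codepage conversion is needed rather than a UTF-8 round trip.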
Myicq (author, posted October 28, 2011):

Thank you both for the replies. The solution was right there and so simple. I will try to post sample code once I have it.
Myicq (author, posted October 28, 2011, edited):

An update: I tried the code and it does not give the expected result. Can either of you provide a correction to this code?

    $o = FileOpen(@ScriptDir & "\out.txt", 2) ; ANSI, write and replace
    $s = "русский язык" ; read from Excel normally...
    FileWriteLine($o, _UnicodeToANSI($s))
    FileClose($o)

The resulting file contains the bytes of the original source string, in this case starting with 0x40 0x04 0x43 0x04 0x41 0x04 ..., which are exactly the bytes of the Unicode characters encoded in UTF-16 LE. It seems the string was not converted at all. I would have expected to see 0xF0 0xF3 ... instead.

So could you provide a more complete example that shows how to get from UTF-16 to a file in ANSI encoding? Thanks.
jchd (posted October 28, 2011):

You'll find that the code below does what you're after. Instead of relying on your machine/user having the expected locale setting (which determines which ANSI codepage is used by default), I forced ANSI 1251, as you can see. Change that according to your needs.

    $o = FileOpen(@ScriptDir & "\out.txt", 2) ; ANSI, write and replace
    $s = "русский язык" ; read from Excel normally...
    FileWriteLine($o, _UTF16toANSI1251($s))
    FileClose($o)

    Func _UTF16toANSI1251($sString)
        ; First call: no output buffer is passed, so the return value is the required size in bytes
        Local $aResult = DllCall("kernel32.dll", "int", "WideCharToMultiByte", "uint", 1251, "dword", 0, "wstr", $sString, "int", -1, _
                "ptr", 0, "int", 0, "ptr", 0, "ptr", 0)
        Local $tText = DllStructCreate("char[" & $aResult[0] & "]")
        ; Second call: convert into the buffer allocated above
        $aResult = DllCall("kernel32.dll", "int", "WideCharToMultiByte", "uint", 1251, "dword", 0, "wstr", $sString, "int", -1, _
                "ptr", DllStructGetPtr($tText), "int", $aResult[0], "ptr", 0, "ptr", 0)
        Return DllStructGetData($tText, 1)
    EndFunc
Myicq (author, posted October 28, 2011):

Thanks for the response, jchd. I can't really see why your solution works (the DLL calls are quite complex), but since I need to convert between Unicode and several different codepages, I chose a different approach.

This code gives the correct result: a simple file containing just the ANSI characters. Depending on the codepage, these characters look like gibberish, but the byte values are correct. For the record, the machine that uses the file can set the codepage on a per-line basis, so no worry there. This solution may be a lot slower because of the database calls, but speed is not an issue and it's just a few lines.

    ; ===============
    ; Search for the data in XLS file
    ; ===============
    Func _do_data()
        ; _ExcelReadSheetToArray($oExcel [, $iStartRow = 1 [, $iStartColumn = 1 [, $iRowCnt = 0 [, $iColCnt = 0 [, $iColShift = False]]]]])
        $o = FileOpen(@ScriptDir & "\out.txt", 2) ; ANSI, write and replace
        $rowstart = 9
        $colstart = 1
        $arr = _ExcelReadSheetToArray($oExcel, $rowstart, $colstart, 1) ; read 1 row only
        FileWriteLine($o, _u2a($arr[1][4]))
        FileWriteLine($o, _u2a($arr[1][5]))
        FileClose($o)
    EndFunc

    ;=============================================
    ; This function makes the conversion using an SQLite database
    ; Format of database:
    ; unicode(decimal), ansi(hexvalue), codepage, name of char, unicode(hex)
    ;=============================================
    Func _u2a($u)
        ; $u = Unicode string, UTF-16 LE
        Local $aRow
        Local $b ; will contain lookup numbers
        Local $ansi = "" ; will contain converted result
        $b = StringToASCIIArray($u)
        If IsArray($b) Then ; check for empty string here..
            For $i = 0 To UBound($b) - 1
                _SQLite_QuerySingleRow(-1, "SELECT ansi FROM [u2a] where unicode='" & $b[$i] & "';", $aRow)
                $ansi = $ansi & Chr(Number(($aRow[0])))
            Next
        EndIf
        Return $ansi
    EndFunc

Database (.sqlite format) attached. Unicode_2_ansi.zip
jchd (posted October 29, 2011):

Ahem ... in your script you don't SELECT on cp at all, so expect random results! If you actually select on char code and cp, then make a compound index on these.

But why don't you use the much simpler and faster function based on WideCharToMultiByte and pass the codepage as well? There is no magic in this function: it is what Windows itself uses to convert to/from any string representation. The first call is needed to get the output length; the second call performs the actual conversion. You can use all the codepages listed here.
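Editor's sketch: one way the "pass the codepage as well" suggestion might look. This is not code from the thread; _UTF16ToCodepage() is a hypothetical name, and returning binary instead of a string is a design choice made here so that the bytes are not re-interpreted through the system codepage on the way to the file.

```autoit
; Hypothetical helper (not from the thread): converts a native AutoIt string
; to raw bytes in the given Windows codepage and returns them as binary.
Func _UTF16ToCodepage($sString, $iCodepage = 1251)
    If StringLen($sString) = 0 Then Return Binary("")

    ; First call: buffer pointer 0 and size 0 ask for the required byte count.
    ; With a length of -1 that count includes the terminating null.
    Local $aResult = DllCall("kernel32.dll", "int", "WideCharToMultiByte", _
            "uint", $iCodepage, "dword", 0, "wstr", $sString, "int", -1, _
            "ptr", 0, "int", 0, "ptr", 0, "ptr", 0)
    If @error Or $aResult[0] = 0 Then Return SetError(1, 0, Binary(""))

    Local $iSize = $aResult[0]
    Local $tBytes = DllStructCreate("byte[" & $iSize & "]")

    ; Second call: performs the conversion into the buffer.
    $aResult = DllCall("kernel32.dll", "int", "WideCharToMultiByte", _
            "uint", $iCodepage, "dword", 0, "wstr", $sString, "int", -1, _
            "ptr", DllStructGetPtr($tBytes), "int", $iSize, "ptr", 0, "ptr", 0)
    If @error Or $aResult[0] = 0 Then Return SetError(2, 0, Binary(""))

    ; Drop the terminating null before returning the bytes.
    Return BinaryMid(DllStructGetData($tBytes, 1), 1, $iSize - 1)
EndFunc   ;==>_UTF16ToCodepage

; Usage: "Москва" built with ChrW(), written as raw bytes (2 = overwrite, 16 = binary mode).
Local $sText = ChrW(0x041C) & ChrW(0x043E) & ChrW(0x0441) & ChrW(0x043A) & ChrW(0x0432) & ChrW(0x0430)
Local $hFile = FileOpen(@ScriptDir & "\out.txt", 2 + 16)
FileWrite($hFile, _UTF16ToCodepage($sText, 1251))
FileClose($hFile)
```

With codepage 1251 the file should contain the bytes CC EE F1 EA E2 E0 given in the first post. Passing another codepage number (1250, 1252, 1253, ...) selects a different Win12xx target without needing a lookup table.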