daywalkereg Posted January 14, 2010
I'm working on a program that reads some files and transforms them into a specific format, but I found that the files encode Arabic letters in a Unicode format that I can't convert back to the original form (Arabic) with AutoIt, and I cannot find any solution. Something like this: u0628, u0633, u0625 or u064a. The only thing I found is this page u0628. So please, can anyone help me with this?
1 £0\\/3 |-|3® $0 |\\/|µ(|-|
jchd Posted January 14, 2010
I bet you are interpreting data captured from JSON format. Did you forget to mention there is a backslash before the u (\u1111\u2222\u3333), or is your data really like u1111u2222u3333?
You can use this to get your Unicode characters correctly; it should work (I leave the \\ to account for a single backslash in your stream).
    $s = StringRegexpReplace($s, "\\u([[:xdigit:]]{2,4})", '" & chrw(0x$1) & "')
    $s = Execute('"' & $s & '"')
This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.
SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)
daywalkereg Posted January 14, 2010 (edited)
Thanks for your reply, but I have the characters in this format: "\u0628\u0633 \u0625\u064a\u0647", so I just need to translate each code into its original form.
Edited January 14, 2010 by daywalkereg
jchd Posted January 15, 2010
Then this will surely work:
    Local $str = "Here is your example codes:\u0634\u0628\u0633where you can find other text as well\u0625\u064a\u0647\u0638"
    ; Turn every \uXXXX into a ChrW() call spliced between string literals.
    Local $s = StringRegexpReplace($str, "\\u([[:xdigit:]]{2,4})", '"&chrw(0x$1)&"')
    $s = '"' & $s & '"'
    ; Evaluate the generated expression to obtain the decoded string.
    $s = Execute($s)
    _ConsoleWrite($s & @LF)
Notice that I use a special version of ConsoleWrite able to display Unicode. This also requires switching the codepage of SciTE to Unicode rather than User.
    Func _ConsoleWrite($sString)
        ; First call: ask for the size of the UTF-8 buffer (codepage 65001).
        Local $aResult = DllCall("kernel32.dll", "int", "WideCharToMultiByte", "uint", 65001, "dword", 0, "wstr", $sString, "int", -1, _
                "ptr", 0, "int", 0, "ptr", 0, "ptr", 0)
        If @error Then Return SetError(1, @error, 0)
        Local $tText = DllStructCreate("char[" & $aResult[0] & "]")
        ; Second call: perform the conversion into the buffer.
        $aResult = DllCall("Kernel32.dll", "int", "WideCharToMultiByte", "uint", 65001, "dword", 0, "wstr", $sString, "int", -1, _
                "ptr", DllStructGetPtr($tText), "int", $aResult[0], "ptr", 0, "ptr", 0)
        If @error Then Return SetError(2, @error, 0)
        ConsoleWrite(DllStructGetData($tText, 1))
    EndFunc
Tell me if this works for you.
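For later readers, here is a minimal sketch of how the Execute() trick above works, with one hardening step added. The function name _DecodeViaExecute is hypothetical, and doubling the quotes before the replacement is an assumption about input that may contain literal double quotes; it is not part of the original answer. Exactly 4 hex digits are required here, which is also a choice made for the sketch.

```autoit
; Hypothetical helper (not from the thread): decode \uXXXX escapes via Execute().
Func _DecodeViaExecute($sText)
    ; Assumption: the input may contain literal double quotes. Doubling them
    ; keeps them inside the string literals that are generated below.
    Local $sExpr = StringReplace($sText, '"', '""')
    ; Each \uXXXX closes the current literal, calls ChrW(), and opens a new literal.
    $sExpr = StringRegExpReplace($sExpr, "\\u([[:xdigit:]]{4})", '" & ChrW(0x$1) & "')
    ; Wrap the whole thing so it becomes one valid AutoIt expression.
    $sExpr = '"' & $sExpr & '"'
    ; For the input   He said "\u0628\u0633"   the expression is now:
    ;   "He said """ & ChrW(0x0628) & "" & ChrW(0x0633) & """"
    Return Execute($sExpr)
EndFunc

; Example call: MsgBox(0, "Decoded", _DecodeViaExecute('He said "\u0628\u0633"'))
```

MsgBox displays Unicode directly, so the custom _ConsoleWrite is only needed when the output goes to the SciTE console.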
daywalkereg Posted January 15, 2010
OMG, that worked very well, thank you very much.
jchd Posted January 15, 2010
Glad it helped. BTW say hello to your local tetrahedras.
For your information, as well as for subsequent readers interested in this thread, there is another pseudo-standard for embedding Unicode characters in a JSON or JSON-like stream, where characters are coded as in "uuuuuu\xabvvvvv", with \xab the hexadecimal representation of the character. I never encountered a 16-bit version of this (e.g. \xabcd), but it might be in use somewhere. The regexp to use is very easy to deduce:
    $s = StringRegexpReplace($s, "\\(x[[:xdigit:]]{2})", '"&chr(0$1)&"')
If you wish to handle both cases, \u and \x with 2 or 4 hex digits, simply use:
    $s = StringRegexpReplace($s, "\\[xu]([[:xdigit:]]{2,4})", '"&chrw(0x$1)&"')
In both cases the result still has to be wrapped in double quotes and passed to Execute, as before. It's a bit imprecise in that it will decode \uabc with only 3 hex digits without warning (I bet this is very unlikely to be used anywhere), but the \u or \x guard should make it relatively difficult to confuse with an unexpected occurrence in random text, especially given that \u or \x sequences are expected in the said text.
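If the 3-digit imprecision mentioned above matters, a stricter variant is possible without Execute(). This is only a sketch: _DecodeEscapes is a hypothetical helper, it accepts exactly 4 hex digits after \u and exactly 2 after \x, and it swaps the regexp-plus-Execute technique for a plain loop over the matches.

```autoit
; Hypothetical helper (not from the thread): strict \uXXXX and \xXX decoding.
Func _DecodeEscapes($sText)
    ; The pattern has no capture groups, so flag 3 returns each whole escape,
    ; for example "\u0628" or "\x41".
    Local $aEsc = StringRegExp($sText, "\\(?:u[[:xdigit:]]{4}|x[[:xdigit:]]{2})", 3)
    If @error Then Return $sText ; nothing to decode
    For $i = 0 To UBound($aEsc) - 1
        ; Drop the leading "\u" or "\x", convert the hex digits, build the character.
        Local $sChar = ChrW(Dec(StringTrimLeft($aEsc[$i], 2)))
        ; Case-sensitive replace of every occurrence of this exact escape.
        $sText = StringReplace($sText, $aEsc[$i], $sChar, 0, 1)
    Next
    Return $sText
EndFunc

; Example call: MsgBox(0, "Decoded", _DecodeEscapes("\x41 \u0628\u0633 \u0625\u064a\u0647"))
```

Because the replacements are applied one after another, a decoded backslash followed by text that looks like an escape could in principle be decoded a second time; for typical JSON-like input this should not arise, but it is an untested edge case.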