mLipok Posted January 8, 2021 Share Posted January 8, 2021 (edited) I have an issue which I describe in this following story: I have a function which was using: Local $dStream = StringToBinary($sContent) instead of this I should use Local $dStream = StringToBinary($sContent, $iEncoding) And preset desire $iEncoding as default was $SB_ANSI but I should use $SB_UTF8 This function was using $dStream to put that content to SQL DB as BLOB. Unfortunately was stored as $SB_ANSI not as $SB_UTF8 When I retrive data from SQL DB I always use: $sTXT_Document_content = BinaryToString($dStream, $SB_UTF8) ......some time later........ I had to processing this content Retrive date: $sTXT_Document_content = BinaryToString($dStream, $SB_UTF8) procesing (simple @CR > @CRLF) ........ and store data: $dStream = StringToBinary($sTXT_Document_content, $SB_UTF8) ISSUE short description: Data was saved as $SB_ANSI .... Was retrived as $SB_UTF8 ... processed (simple @CR > @CRLF) .... stored as $SB_UTF8 Data which should be stored like: Quote tytułu: are displayed as: Quote tytuďż˝u: Request for help: I know how to get data and store data from/to SQL DB, but I was trying few ways and I can not revert encoding problem. Help please. mLipok Edited January 8, 2021 by mLipok quoted example fixed Signature beginning:* Please remember: "AutoIt"..... * Wondering who uses AutoIt and what it can be used for ? * Forum Rules ** ADO.au3 UDF * POP3.au3 UDF * XML.au3 UDF * IE on Windows 11 * How to ask ChatGPT for AutoIt Code * for other useful stuff click the following button: Spoiler Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind. My contribution (my own projects): * Debenu Quick PDF Library - UDF * Debenu PDF Viewer SDK - UDF * Acrobat Reader - ActiveX Viewer * UDF for PDFCreator v1.x.x * XZip - UDF * AppCompatFlags UDF * CrowdinAPI UDF * _WinMergeCompare2Files() * _JavaExceptionAdd() * _IsBeta() * Writing DPI Awareness App - workaround * _AutoIt_RequiredVersion() * Chilkatsoft.au3 UDF * TeamViewer.au3 UDF * JavaManagement UDF * VIES over SOAP * WinSCP UDF * GHAPI UDF - modest begining - comunication with GitHub REST API * ErrorLog.au3 UDF - A logging Library * Include Dependency Tree (Tool for analyzing script relations) * Show_Macro_Values.au3 * My contribution to others projects or UDF based on others projects: * _sql.au3 UDF * POP3.au3 UDF * RTF Printer - UDF * XML.au3 UDF * ADO.au3 UDF * SMTP Mailer UDF * Dual Monitor resolution detection * * 2GUI on Dual Monitor System * _SciLexer.au3 UDF * SciTE - Lexer for console pane * Useful links: * Forum Rules * Forum etiquette * Forum Information and FAQs * How to post code on the forum * AutoIt Online Documentation * AutoIt Online Beta Documentation * SciTE4AutoIt3 getting started * Convert text blocks to AutoIt code * Games made in Autoit * Programming related sites * Polish AutoIt Tutorial * DllCall Code Generator * Wiki: * Expand your knowledge - AutoIt Wiki * Collection of User Defined Functions * How to use HelpFile * Good coding practices in AutoIt * OpenOffice/LibreOffice/XLS Related: WriterDemo.au3 * XLS/MDB from scratch with ADOX IE Related: * How to use IE.au3 UDF with AutoIt v3.3.14.x * Why isn't Autoit able to click a Javascript Dialog? * Clicking javascript button with no ID * IE document >> save as MHT file * IETab Switcher (by LarsJ ) * HTML Entities * _IEquerySelectorAll() (by uncommon) * IE in TaskScheduler * IE Embedded Control Versioning (use IE9+ and HTML5 in a GUI) * PDF Related: * How to get reference to PDF object embeded in IE * IE on Windows 11 * I encourage you to read: * Global Vars * Best Coding Practices * Please explain code used in Help file for several File functions * OOP-like approach in AutoIt * UDF-Spec Questions * EXAMPLE: How To Catch ConsoleWrite() output to a file or to CMD *I also encourage you to check awesome @trancexx code: * Create COM objects from modules without any demand on user to register anything. * Another COM object registering stuff * OnHungApp handler * Avoid "AutoIt Error" message box in unknown errors * HTML editor * winhttp.au3 related : * https://www.autoitscript.com/forum/topic/206771-winhttpau3-download-problem-youre-speaking-plain-http-to-an-ssl-enabled-server-port/ "Homo sum; humani nil a me alienum puto" - Publius Terentius Afer"Program are meant to be read by humans and only incidentally for computers and execute" - Donald Knuth, "The Art of Computer Programming" , be and \\//_. Anticipating Errors : "Any program that accepts data from a user must include code to validate that data before sending it to the data store. You cannot rely on the data store, ...., or even your programming language to notify you of problems. You must check every byte entered by your users, making sure that data is the correct type for its field and that required fields are not empty." Signature last update: 2023-04-24 Link to comment Share on other sites More sharing options...
FrancescoDiMuro Posted January 8, 2021 Share Posted January 8, 2021 @mLipok What do you get when you run this script on your PC? #include <MsgBoxConstants.au3> #include <StringConstants.au3> Global $strANSI = "tytułu:", _ $bstrANSI = Binary($strANSI), _ $strUTF8 = BinaryToString($bstrANSI, $SB_UTF8) MsgBox($MB_ICONINFORMATION, "Encondings:", "$strANSI = " & $strANSI & @CRLF & _ "$bstrANSI = " & $bstrANSI & @CRLF & _ "$strUTF8 = " & $strUTF8) This is what I see: mLipok 1 Click here to see my signature: Spoiler ALWAYS GOOD TO READ: Forum Rules Forum Etiquette Link to comment Share on other sites More sharing options...
mLipok Posted January 10, 2021 Author Share Posted January 10, 2021 (edited) Here is example what was done/what happend: expandcollapse popup#include <MsgBoxConstants.au3> #include <StringConstants.au3> Global $FAKE_DATA_BASE _Example() Func _Example() Local $DATA0 = "tytułu:" _DB_Store_Wrong_Way($DATA0) MsgBox(0, @ScriptLineNumber, $FAKE_DATA_BASE) Local $DATA1 = _DB_GetData_Proper_Way() MsgBox(0, @ScriptLineNumber, $DATA1) _DB_Store_Proper_Way($DATA1) _DB_Imp() Local $DATA2 = _DB_GetData_Proper_Way() MsgBox(0, @ScriptLineNumber, $DATA2) _DB_FIX() Local $DATA3 = _DB_GetData_Proper_Way() # RESULT SHOULD BE "True" MsgBox(0, 'TEST', $DATA0 == $DATA3) EndFunc ;==>_Example Func _DB_Store_Wrong_Way($DATA) $FAKE_DATA_BASE = StringToBinary($DATA) EndFunc ;==>_DB_Store_Wrong_Way Func _DB_GetData_Proper_Way() Return BinaryToString($FAKE_DATA_BASE, $SB_UTF8) EndFunc ;==>_DB_GetData_Proper_Way Func _DB_Store_Proper_Way($DATA) $FAKE_DATA_BASE = StringToBinary($DATA, $SB_UTF8) EndFunc ;==>_DB_Store_Proper_Way Func _DB_Imp() Local $DATA = $FAKE_DATA_BASE $DATA = BinaryToString($DATA) _DB_Store_Proper_Way($DATA) EndFunc ;==>_DB_Imp Func _DB_FIX() # HERE should be done some stuff to fix $FAKE_DATA_BASE content EndFunc ;==>_DB_FIX I need to find out what to do inside: _DB_FIX() to get result True will back to the problem in next 24h. Edited January 10, 2021 by mLipok Signature beginning:* Please remember: "AutoIt"..... * Wondering who uses AutoIt and what it can be used for ? * Forum Rules ** ADO.au3 UDF * POP3.au3 UDF * XML.au3 UDF * IE on Windows 11 * How to ask ChatGPT for AutoIt Code * for other useful stuff click the following button: Spoiler Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind. My contribution (my own projects): * Debenu Quick PDF Library - UDF * Debenu PDF Viewer SDK - UDF * Acrobat Reader - ActiveX Viewer * UDF for PDFCreator v1.x.x * XZip - UDF * AppCompatFlags UDF * CrowdinAPI UDF * _WinMergeCompare2Files() * _JavaExceptionAdd() * _IsBeta() * Writing DPI Awareness App - workaround * _AutoIt_RequiredVersion() * Chilkatsoft.au3 UDF * TeamViewer.au3 UDF * JavaManagement UDF * VIES over SOAP * WinSCP UDF * GHAPI UDF - modest begining - comunication with GitHub REST API * ErrorLog.au3 UDF - A logging Library * Include Dependency Tree (Tool for analyzing script relations) * Show_Macro_Values.au3 * My contribution to others projects or UDF based on others projects: * _sql.au3 UDF * POP3.au3 UDF * RTF Printer - UDF * XML.au3 UDF * ADO.au3 UDF * SMTP Mailer UDF * Dual Monitor resolution detection * * 2GUI on Dual Monitor System * _SciLexer.au3 UDF * SciTE - Lexer for console pane * Useful links: * Forum Rules * Forum etiquette * Forum Information and FAQs * How to post code on the forum * AutoIt Online Documentation * AutoIt Online Beta Documentation * SciTE4AutoIt3 getting started * Convert text blocks to AutoIt code * Games made in Autoit * Programming related sites * Polish AutoIt Tutorial * DllCall Code Generator * Wiki: * Expand your knowledge - AutoIt Wiki * Collection of User Defined Functions * How to use HelpFile * Good coding practices in AutoIt * OpenOffice/LibreOffice/XLS Related: WriterDemo.au3 * XLS/MDB from scratch with ADOX IE Related: * How to use IE.au3 UDF with AutoIt v3.3.14.x * Why isn't Autoit able to click a Javascript Dialog? * Clicking javascript button with no ID * IE document >> save as MHT file * IETab Switcher (by LarsJ ) * HTML Entities * _IEquerySelectorAll() (by uncommon) * IE in TaskScheduler * IE Embedded Control Versioning (use IE9+ and HTML5 in a GUI) * PDF Related: * How to get reference to PDF object embeded in IE * IE on Windows 11 * I encourage you to read: * Global Vars * Best Coding Practices * Please explain code used in Help file for several File functions * OOP-like approach in AutoIt * UDF-Spec Questions * EXAMPLE: How To Catch ConsoleWrite() output to a file or to CMD *I also encourage you to check awesome @trancexx code: * Create COM objects from modules without any demand on user to register anything. * Another COM object registering stuff * OnHungApp handler * Avoid "AutoIt Error" message box in unknown errors * HTML editor * winhttp.au3 related : * https://www.autoitscript.com/forum/topic/206771-winhttpau3-download-problem-youre-speaking-plain-http-to-an-ssl-enabled-server-port/ "Homo sum; humani nil a me alienum puto" - Publius Terentius Afer"Program are meant to be read by humans and only incidentally for computers and execute" - Donald Knuth, "The Art of Computer Programming" , be and \\//_. Anticipating Errors : "Any program that accepts data from a user must include code to validate that data before sending it to the data store. You cannot rely on the data store, ...., or even your programming language to notify you of problems. You must check every byte entered by your users, making sure that data is the correct type for its field and that required fields are not empty." Signature last update: 2023-04-24 Link to comment Share on other sites More sharing options...
pixelsearch Posted January 10, 2021 Share Posted January 10, 2021 Hi mLipok I wonder if it's possible to revert the encoding, please have a look at this : The letter ł (l with stroke) in $DATA0 has a Unicode code of 322 (decimal) as returned by AscW("ł") When it was wrongly encoded using the ANSI flag, then its Ascii code was used, which is 108 (0x6C) as returned by Asc("ł") Which means that "ł" has become an "l" (small letter l) and its Ascii code is displayed in MsgBox at line 13 of your script: $FAKE_DATA_BASE = 0x747974756C753A At this point, it seems impossible to reverse anything because the unicode code 322 doesn't exist anymore in the binary data. There is something I didn't understand when you say that you have this kind of display on your computer : tytuďż˝u: The only obscure display I can reproduce on my computer happens when I change Scite encoding properties (menu File => Encoding) from UTF8 to Code Page Property (ANSI) then "tytułu:" is displayed "tytuÅ‚u:" in the script. Explanation : Å is character Ascii 197 (0xC5) and ‚ is character ascii 130 (0x82) which is "a single low quotation mark" This corresponds to an encoding of "0x74797475C582753A" which is the correct binary data when encoding is done using UTF8 flag (as ł unicode's code 322 will be correctly encoded to 0xC582 in UTF8) Maybe our locales are different and that could explain the tytuďż˝u: display on your computer ? I hope you'll have other suggestions that will help you. mLipok 1 Link to comment Share on other sites More sharing options...
jchd Posted January 10, 2021 Share Posted January 10, 2021 (edited) @mLipok I have the exact same questions as @pixelsearch in his good answer above. When you say it displays as tytuďż˝u: you don't tell us what exactly you look at: a DOS display (which codepage?), a Windows display (which codepage?), a SciTE display (which codepage?)! I understand your initial issue and I have the same here where an old program I use 100 times a day shows "ANSIfied UTF8 text" as the program doesn't cope with UTF8. I've written a little AutoIt program to convert the clipboard content from " UTF16 string containing UTF8 data" back to UCS2. I didn't bother to handle codepoints beyond UCS2 because I never get such content. I insisted to handle the low-level conversion on a "UTF8-sequence" basis over 2 or 3 bytes. I had to first replace sequences " " ( then 0x20) by " " ( then 0xA0), and "à " (à then 0x20) by "à " (à then 0xA0) and also double single quotes to make them transparent to AutoIt string processing, hence the 3 nested StringReplace(). Mabe you can adapt the idea to your use case. HotKeySet("!w", _unUTFme) HotKeySet("^!w", _Exit) While 1 Sleep(250) WEnd Func _Exit() Exit EndFunc Func _unUTFme() ClipPut(Execute("'" & StringRegExpReplace(StringReplace(StringReplace(StringReplace(ClipGet(), " ", " "), "à ", "à "), "'", "''"), "([\xC0-\xDF].|[\xE0-\xEF]..)", "' & _UnUTF8('$1') & '") & "'")) EndFunc Func _UnUTF8($s) Local $a = StringToASCIIArray($s) Local $c If UBound($a) = 2 Then $c = BitOR(BitShift(BitAND($a[0], 0x1F), -6), BitAND($a[1], 0x3F)) Else $c = BitOR(BitShift(BitAND($a[0], 0x3F), -12), BitOR(BitShift(BitAND($a[1], 0x3F), -6), BitAND($a[2], 0x3F))) EndIf Return ChrW($c) EndFunc Edited January 10, 2021 by jchd mLipok 1 This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe hereRegExp tutorial: enough to get startedPCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta. SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt) Link to comment Share on other sites More sharing options...
TheXman Posted January 10, 2021 Share Posted January 10, 2021 (edited) Here's another possible solution. Since I don't know your language and the rules for how characters are translated from ANSI to UTF-8, I just used my own rules as an example. Also since I don't know your language, I'm not sure if trying to do it this way is even possible. Other than adding logic to the _DB_Fix() function and writing output to a GUI console instead of MsgBoxes, the script is exactly the same as your original script. Spoiler expandcollapse popup#AutoIt3Wrapper_AU3Check_Parameters=-w 3 -w 4 -w 5 -w 6 -d #include <Constants.au3> #include <GUIConstantsEx.au3> #include <WindowsConstants.au3> #include <EditConstants.au3> #Region GUI Console Const $CONSOLE_BACK_COLOR = 0xFFFFFF Const $CONSOLE_FORE_COLOR = 0x000000 Const $CONSOLE_WIDTH = 800 Const $CONSOLE_HEIGHT = 600 Global $frmConsole = GUICreate("DB Fix Console (Press ESC to Close)", $CONSOLE_WIDTH, $CONSOLE_HEIGHT, 302, 218, BitOR($GUI_SS_DEFAULT_GUI,$WS_MAXIMIZEBOX,$WS_SIZEBOX,$WS_THICKFRAME,$WS_TABSTOP)) GUISetBkColor($CONSOLE_BACK_COLOR, $frmConsole) Global $ctlEditBox = GUICtrlCreateEdit("", 0, 0, $CONSOLE_WIDTH, $CONSOLE_HEIGHT, BitOR($ES_AUTOVSCROLL,$ES_AUTOHSCROLL,$ES_WANTRETURN,$ES_READONLY,$WS_VSCROLL,$WS_HSCROLL)) GUICtrlSetResizing($ctlEditBox, $GUI_DOCKLEFT+$GUI_DOCKRIGHT+$GUI_DOCKTOP+$GUI_DOCKBOTTOM) GUICtrlSetFont($ctlEditBox, 11, 500, 0, "Consolas") GUICtrlSetColor($ctlEditBox, $CONSOLE_FORE_COLOR) GUICtrlSetBkColor($ctlEditBox, $CONSOLE_BACK_COLOR) GUISetState(@SW_SHOW, $frmConsole) #EndRegion GUI Console Global $FAKE_DATA_BASE _example() While 1 If GUIGetMsg() = $GUI_EVENT_CLOSE Then ExitLoop WEnd Func _example() Local $DATA0 = "Tytuł: to jest treść testowa" write_console_line(@ScriptLineNumber & ": " & $DATA0 & " ($DATA0)") write_console_line(" " & StringToBinary($DATA0, $SB_UTF8) & @CRLF) _DB_Store_Wrong_Way($DATA0) write_console_line(@ScriptLineNumber & ": " & $FAKE_DATA_BASE & " ($FAKE_DATA_BASE)" & @CRLF) Local $DATA1 = _DB_GetData_Proper_Way() write_console_line(@ScriptLineNumber & ": " & $DATA1 & " ($DATA1)") write_console_line(" " & StringToBinary($DATA1, $SB_UTF8) & @CRLF) _DB_Store_Proper_Way($DATA1) _DB_Imp() Local $DATA2 = _DB_GetData_Proper_Way() write_console_line(@ScriptLineNumber & ": " & $DATA2 & " ($DATA2)") write_console_line(" " & StringToBinary($DATA2, $SB_UTF8) & @CRLF) _DB_FIX() Local $DATA3 = _DB_GetData_Proper_Way() write_console_line("===== After DB Fix =====" & @CRLF) write_console_line("$DATA0 = " & $DATA0) write_console_line(" = " & StringToBinary($DATA0, $SB_UTF8) & @CRLF) write_console_line("$DATA3 = " & $DATA3) write_console_line(" = " & StringToBinary($DATA0, $SB_UTF8) & @CRLF) write_console_line("($DATA0 == $DATA3) = " & ($DATA0 == $DATA3)) EndFunc Func _DB_Store_Wrong_Way($DATA) $FAKE_DATA_BASE = StringToBinary($DATA) EndFunc ;==>_DB_Store_Wrong_Way Func _DB_GetData_Proper_Way() Return BinaryToString($FAKE_DATA_BASE, $SB_UTF8) EndFunc ;==>_DB_GetData_Proper_Way Func _DB_Store_Proper_Way($DATA) $FAKE_DATA_BASE = StringToBinary($DATA, $SB_UTF8) EndFunc ;==>_DB_Store_Proper_Way Func _DB_Imp() Local $DATA = $FAKE_DATA_BASE $DATA = BinaryToString($DATA) _DB_Store_Proper_Way($DATA) EndFunc ;==>_DB_Imp Func _DB_FIX() Local $sData = _DB_GetData_Proper_Way() ;Since I don't know the language, this is just an example that translates ;the characters back to their original format based on some sort of ;user-defined language-specific rules. $sData = StringReplace($sData, "ul", "uł", 0, True) $sData = StringReplace($sData, "sc", "ść", 0, True) _DB_Store_Proper_Way($sData) EndFunc ;==>_DB_FIX Func write_console_line($sMsg = "") GUICtrlSetData($ctlEditBox, $sMsg & @CRLF, 1) EndFunc Output: 36: Tytuł: to jest treść testowa ($DATA0) 0x54797475C5823A20746F206A65737420747265C59BC48720746573746F7761 41: 0x547974756C3A20746F206A65737420747265736320746573746F7761 ($FAKE_DATA_BASE) 45: Tytul: to jest tresc testowa ($DATA1) 0x547974756C3A20746F206A65737420747265736320746573746F7761 53: Tytul: to jest tresc testowa ($DATA2) 0x547974756C3A20746F206A65737420747265736320746573746F7761 ===== After DB Fix ===== $DATA0 = Tytuł: to jest treść testowa = 0x54797475C5823A20746F206A65737420747265C59BC48720746573746F7761 $DATA3 = Tytuł: to jest treść testowa = 0x54797475C5823A20746F206A65737420747265C59BC48720746573746F7761 ($DATA0 == $DATA3) = True Edited January 10, 2021 by TheXman mLipok 1 CryptoNG UDF: Cryptography API: Next Gen jq UDF: Powerful and Flexible JSON Processor | jqPlayground: An Interactive JSON Processor Xml2Json UDF: Transform XML to JSON | HttpApi UDF: HTTP Server API | Roku Remote: Example Script About Me How To Ask Good Questions On Technical And Scientific Forums (Detailed) | How to Ask Good Technical Questions (Brief) "Any fool can know. The point is to understand." -Albert Einstein "If you think you're a big fish, it's probably because you only swim in small ponds." ~TheXman Link to comment Share on other sites More sharing options...
Recommended Posts
Create an account or sign in to comment
You need to be a member in order to leave a comment
Create an account
Sign up for a new account in our community. It's easy!
Register a new accountSign in
Already have an account? Sign in here.
Sign In Now