Briksins, posted July 22, 2014

Hello,

I'm writing an automation test script for our application and am having difficulties with the special Irish characters "á", "Á", "é", "É", "í", "Í", "ó", "Ó", "ú", "Ú".

Unfortunately the editor doesn't allow me to define them in the code, so I decided to use ASCII numbers to define the characters I need. I found all the ASCII codes I need here, and these instructions showed me how to use them.

For example, the character "á" has code 225 according to the table, so I do this:

    $irishCharA = Chr(225)
    $path = "some irish sentence with speci" & $irishCharA & "l character"

I expect the value of $path to be

    some irish sentence with speciál character

but that is not what I get. When I debugged the code I found that the character is not "á" but an unrecognised diamond shape with a question mark.

Then I tried:

    $path = "some irish sentence with speci{Asc 225}l character"

but that produces the literal string, braces included.

Finally I decided to use a file. I store all the values as key/value pairs and save the file with UTF8 encoding. Here is the content:

    a=á
    A=Á
    e=é
    E=É
    i=í
    I=Í
    o=ó
    O=Ó
    u=ú
    U=Ú

Then I read it as UTF8 with BOM (128 stands for UTF8 according to this manual):

    Func readConfigFromPath($path)
        $openedFile = FileOpen($path, 128)
        Dim $guiCfg[0][0]
        ; FileOpen returns -1 on failure
        If $openedFile <> -1 Then
            Local $line = ""
            $lineCounter = 1
            Do
                $line = FileReadLine($openedFile, $lineCounter)
                If StringInStr($line, "=") Then
                    $props = StringSplit($line, "=")
                    ReDim $guiCfg[UBound($guiCfg) + 1][2]
                    $guiCfg[UBound($guiCfg) - 1][0] = $props[1]
                    $guiCfg[UBound($guiCfg) - 1][1] = $props[2]
                EndIf
                $lineCounter = $lineCounter + 1
            Until $line = ""
            ; close the handle, not the path
            FileClose($openedFile)
        EndIf
        Return $guiCfg
    EndFunc

However, that reads every special character as a plain character: Irish "á" becomes a simple "a", Irish "ú" becomes a simple "u", and so on. So a key/value pair ends up with key "a" and value "a", when the value should be "á".

How do I get AutoIt to accept non-standard characters?
jchd, posted July 22, 2014 (edited July 22, 2014; marked as solution)

Convert your script encoding and console output to UTF8.

For non-ANSI Unicode characters to display properly in the SciTE console, AFAIK you need to switch to UTF8 in SciTEUser.properties. Add these lines:

    code.page=65001
    output.code.page=65001

Then this works:

    ConsoleWrite("ÀÐØÞßƵɄʬЖЗДلنرشحჱჶẶỘ⅝⇈≽⋛⍓⍫♬✠⬔" & @LF) ; display ANSI-fied characters, mostly ?
    _ConsoleWrite("ÀÐØÞßƵɄʬЖЗДلنرشحჱჶẶỘ⅝⇈≽⋛⍓⍫♬✠⬔" & @LF) ; works fine

    Func _ConsoleWrite($s)
        ConsoleWrite(BinaryToString(StringToBinary($s, 4), 1))
    EndFunc

You don't need any special setting for Windows controls:

    MsgBox(0, "", "ÀÐØÞßƵɄʬЖЗДلنرشحჱჶẶỘ⅝⇈≽⋛⍓⍫♬✠⬔")

Signature: This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks. Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here. RegExp tutorial: enough to get started. PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta. SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try. SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager. An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work. SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well). A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages! SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt).
rodent1, posted July 22, 2014 (edited July 22, 2014)

Make sure you have the right font. Not all fonts contain all characters, and some fonts are meant for displaying other kinds of symbols. If the font is Wingdings, you will get some kind of up arrow for Chr(225). Use, for example, GUICtrlSetFont to select the font you want first.

To help you pick the font: open Notepad, select a font, then hold the Alt key while typing "225" on the numeric keypad (make sure Num Lock is on). Depending on the font you will get either some squiggle or "á".

Hmm, it looks like jchd answered at the same time.
Briksins (author), posted July 23, 2014 (edited July 23, 2014)

Thank you guys, that helped. I didn't expect that I would need to change the encoding of the script itself.

Also, I had Russian installed in the OS as a second language, so character 225 changed from "á" to the Russian "б", and I had to move my dev environment to a new, clean VM.

Thank you once more.
jchd, posted July 23, 2014

AutoIt native strings are Unicode, and with a UTF8 script source you don't have to do anything else (except transform strings to UTF8 for SciTE console output, as I said). You can use any language setting in Windows independently: that only affects which ANSI codepage is used, but you're only concerned with Unicode, so it doesn't affect your scripts.
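Editor's sketch of the codepage point above. It assumes, as reported earlier in this thread, that Chr() resolves codes 128 to 255 through the active ANSI codepage (which is why 225 turned into "б" on a Russian setup), whereas ChrW() takes a Unicode code point. The variable names are illustrative only and come from nobody's actual script.

```autoit
; Minimal sketch, not taken from the thread: build the string from a
; Unicode code point so the result does not depend on the ANSI codepage.
Local $sAnsiChar = Chr(225)     ; codepage dependent: may be "á", "б", ...
Local $sUnicodeChar = ChrW(225) ; U+00E1 LATIN SMALL LETTER A WITH ACUTE

Local $sPath = "some irish sentence with speci" & $sUnicodeChar & "l character"

; Windows controls display AutoIt's native Unicode strings directly.
MsgBox(0, "ChrW sketch", $sPath & @CRLF & "Code point: " & AscW($sUnicodeChar))
```

With this approach the script source can stay plain ASCII, at the cost of less readable string literals than a UTF8-encoded script would give.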
leojarrabi, posted July 25, 2014 (edited July 25, 2014)

I am having a problem creating a log file that contains non-ANSI characters. I used FileOpen with the $FO_UTF8_NOBOM parameter and passed the handle to FileWriteLine, as suggested in the manual (https://www.autoitscript.com/autoit3/docs/intro/unicode.htm).

Here's my code:

    #include <File.au3>

    Local $hFileOpen = FileOpen($CmdLine[1], $FO_UTF8_NOBOM)
    Local $sFileContent = FileRead($hFileOpen)
    $LogReport = $CmdLine[2]
    If _FileCreate($LogReport) = 0 Then
        MsgBox(0, 'Permission Denied', 'Could not create log file')
        Exit
    EndIf
    Local $hLogFileHandle = FileOpen($LogReport, $FO_UTF8_NOBOM)

    ; list of valid characters to be used in Regular Expression Pattern
    $AccentedChars = "âãäåæçèéêëìíîïàñòóôõöøùúûüýÿāĂ㥹ĆćĈĉĊċČčĎďĐđĒēĔĕĖėĘęĚěĜĝĞğĠġĢģĤĥħĨĩĪīĬĭ"
    $PunctuationMarks = "“”‘’'"";&_:–,\.\?\!"
    $Braces = "\(\)\[\]<>"
    $MathOperators = "/\-=$"
    $OtherValidCharacters = "©"

    ; Look for non-ANSI character that is not on the list above
    $asResult = StringRegExp($sFileContent, '([^0-9A-Za-z\s' & $AccentedChars & $PunctuationMarks & $Braces & $MathOperators & $OtherValidCharacters & '])++', 3)
    $ListOfInvalidChars = ""
    For $i = 0 to UBound($asResult)-1
        If StringInStr($ListOfInvalidChars, $asResult[$i]) = 0 Then $ListOfInvalidChars = $ListOfInvalidChars & $asResult[$i]
    Next
    if $ListOfInvalidChars <> "" Then
        ;write the list of characters found into the log file, hopefully as UTF8 so I could actually see what the character was and not just a ? character
        MsgBox(0, "Found", $ErrorMessage)
        FileWriteLine($hLogFileHandle, $ErrorMessage)
    Else
        MsgBox(0, "Result", "No error")
    Endif
    FileClose($hLogFileHandle)

I placed a non-ANSI character in the test file I am using. I know the script detects it, because the MsgBox shows it before it is written to the log file. When the script is finished, the log file contains nothing.

I thought it might be the AutoIt version. I am using 3.3.10.2. Unicode support starts with version 3.2.4.0, so my version should be fine, right?

Any help would be appreciated.
AdmiralAlkex, posted July 25, 2014

> When the script is finished, the log file contains nothing.

You need to set one of the write modes too, or you will only open the file in read mode.

Signature: Some of my scripts: ShiftER, Codec-Control, Resolution switcher for HTC Shift. Some of my UDFs: SDL UDF, SetDefaultDllDirectories, Converting GDI+ Bitmap/Image to SDL Surface.
jchd, posted July 25, 2014

I'd advise against no-BOM files, unless you're sure that the application(s) reading them are aware of the encoding. In the general case, files with a BOM are more robust against encoding misinterpretation, but not all applications will accept BOMs (and Unicode is as old as 1991!).
leojarrabi, posted July 25, 2014

> You need to set either of the write modes too or you will only open the file in read mode.

    $FO_UTF8_NOBOM (256) = Use Unicode UTF8 (without BOM) reading and writing mode.

I assumed this means the file is in write mode. Or am I wrong? How do I open the file in UTF8 mode so that I can also write to it in UTF8?
leojarrabi, posted July 25, 2014 (edited July 25, 2014)

> I'd advise against no-BOM files, unless you're sure that the application(s) reading them are aware of the encoding. In the general case BOM files are more robust to encoding misinterpretation, but not all will accept BOMs (and Unicode is as old as 1991!).

I will also use the FileGetEncoding function to make sure it returns 256 before the script actually processes the file. But for now, I just need to list the characters that are not in my "valid list".
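Editor's sketch of the encoding check mentioned above, assuming the constants from FileConstants.au3 ($FO_OVERWRITE, $FO_UTF8, $FO_UTF8_NOBOM) and that FileGetEncoding reports its result using the same values as the FileOpen encoding flags. The file name is hypothetical and chosen only for the demonstration.

```autoit
#include <FileConstants.au3>

; Hypothetical demo file; not a path used anywhere in the thread.
Local $sFile = @TempDir & "\encoding_demo.txt"

; Write a UTF8 file with BOM: write mode and encoding flag combined.
Local $hFile = FileOpen($sFile, BitOR($FO_OVERWRITE, $FO_UTF8))
If $hFile = -1 Then Exit 1
FileWriteLine($hFile, "a=" & ChrW(225))
FileClose($hFile)

; Ask AutoIt which encoding it detects before processing the file.
Local $iEncoding = FileGetEncoding($sFile)
If $iEncoding = $FO_UTF8 Or $iEncoding = $FO_UTF8_NOBOM Then
    MsgBox(0, "Encoding", "UTF8 detected (flag value " & $iEncoding & ")")
Else
    MsgBox(0, "Encoding", "Unexpected encoding value: " & $iEncoding)
EndIf
```

Accepting both UTF8 variants in the check is a design choice for this sketch; a stricter script could insist on one of them.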
leojarrabi, posted July 27, 2014

I still haven't figured out why I can't write to the file in UTF8 mode. When I open the file in append mode, the program can write to it, but the characters only show up as question marks. Is this a bug in AutoIt?
jchd, posted July 27, 2014

Read the help of FileOpen about its flags.
leojarrabi, posted July 28, 2014

> Read the help of FileOpen about its flags.

I see, the flags can be combined. I should have declared it as

    Local $hLogFileinUTF8 = FileOpen($LogReport, $FO_APPEND + $FO_UTF8_NOBOM)

Thanks!
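Editor's sketch of the flag combination that resolved this, assuming the constants from FileConstants.au3. The log path and the message text are hypothetical; BitOR is used here in place of "+" purely as a stylistic choice, and it gives the same value for these flags.

```autoit
#include <FileConstants.au3>

; Hypothetical log path, used only for illustration.
Local $sLogPath = @TempDir & "\invalid_chars.log"

; An encoding flag on its own opens the file for reading.
; Combining it with a write mode makes the handle writable in that encoding.
Local $hLog = FileOpen($sLogPath, BitOR($FO_APPEND, $FO_UTF8_NOBOM))
If $hLog = -1 Then
    MsgBox(0, "Error", "Could not open " & $sLogPath & " for writing.")
    Exit 1
EndIf

; Two sample non-ANSI characters, given as code points so that this
; source file can stay plain ASCII.
FileWriteLine($hLog, "Invalid characters found: " & ChrW(0x0416) & ChrW(0x2669))
FileClose($hLog)
```

To see the characters rather than question marks, the log then has to be opened in a viewer that reads it as UTF8, which is the concern raised earlier about files without a BOM.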