coderusa (Author), posted November 28, 2023 (edited)

I'm working on an IRC bot script. It works well, but it is over 3000 lines of code, so I won't post the entire script yet; maybe this has a simple answer. I've searched all over and can't find the exact documentation I need.

The issue:

NOTE: I'm almost positive that other users' settings and fonts are NOT the cause; read below.

For some people, when the IRC bot sends a message to the chat, it appears like this: [screenshot]

When it should appear like this: [screenshot]

The people helping me test this bot get mixed results. For some users the message looks normal on their screens; for others it shows squares or question marks (<?>). On my screen it appears normally, but if one of the users who sees the <?> characters copies and pastes the message into the chat, it also appears with the <?> on my screen (pictured below): [screenshot]

Also, if I copy the string "-.,¸¸.-·°'`'°·-.,¸¸.-·°'`'°· \_O< QUACK" directly from the script and paste it into the chat, it appears normally to everyone, which leads me to believe the issue is somewhere in AutoIt.
The spoiler below contains the main function where this message originates in the script:

    ; #FUNCTION# ======================================================================================================================
    ; Name............: SpawnDuck
    ; Description.....: Spawns a duck if $MaxDucks is below threshold
    ; Syntax..........: SpawnDuck($socket, $channel)
    ; Parameters......: $socket - main socket identifier
    ;                   $channel - channel to spawn duck in
    ; Return Values...: Returns 1 - Successful spawn
    ;                   Returns -1 - Maximum number of ducks already spawned
    ; Author..........: Neo_ (aka coderusa)
    ; Modified........:
    ; =================================================================================================================================
    Func SpawnDuck($socket, $channel)
        For $d = 1 To $MaxDucks Step 1
            If Eval("Duck" & $d) = "" Then
                If IsDuck() = 0 Then
                    Global $ScareFactor = 0
                EndIf
                If $IsGolden = False Then
                    $GoldDuck = $GoldenDuck / 100
                    If Random(0,1) < $GoldDuck Then
                        Global $IsGolden = "Duck" & $d
                        Assign("Duck" & $d, TimerInit(), 2)
                        IRC_Send_PrivMsg($socket, $channel, "-.,¸¸.-·°'`'°·-.,¸¸.-·°'`'°· \_O< QUACK * GOLDEN DUCK *")
                        Return 1
                    EndIf
                EndIf
                Assign("Duck" & $d, TimerInit(), 2)
                IRC_Send_PrivMsg($socket, $channel, "-.,¸¸.-·°'`'°·-.,¸¸.-·°'`'°· \_O< QUACK")
                Return 1
            EndIf
        Next
        Return -1
    EndFunc ;===> SpawnDuck

The spoiler below contains the main function that sends the message to the IRC server:
    ; #FUNCTION# ====================================================================================================================
    ; Name............: IRC_Send_PrivMsg
    ; Description.....: Send PRIVMSG
    ; Syntax..........: IRC_Send_PrivMsg($socket, $target, $msg)
    ; Parameters......: $socket - main socket identifier
    ;                   $target - #Channel or Username
    ;                   $msg - PRIVMSG text
    ; Return Values...: Failure: -1
    ;                   Success: 1
    ; Author..........: Neo_ (aka coderusa)
    ; Modified........:
    ; ================================================================================================================================
    Func IRC_Send_PrivMsg($socket, $target, $msg)
        TCPSend($socket, "PRIVMSG " & $target & " :" & $msg & @CRLF)
        If @error Then
            Cons("Server connection lost.")
            Return -1
        EndIf
        Return 1
    EndFunc ;===> IRC_Send_PrivMsg

From what I have found in my research, you can change the UTF-8 settings with wrapper directives, but I cannot find the exact way to do it. The help file says the default is UTF-8 without BOM, and that it can be changed to UTF-8 with BOM and other options.

My main question is: what is the wrapper directive, or how else do I change this from UTF-8 without BOM to UTF-8 with BOM? Or is my issue something completely different? I'm stumped...

Edited November 28, 2023 by coderusa
water, posted November 28, 2023

I had a similar issue when the text in a MsgBox containing German umlauts (öäü etc.) was displayed incorrectly. I solved the problem by setting the encoding in SciTE to UTF-8 with BOM. In SciTE use: File > Encoding > UTF-8 with BOM

My UDFs and Tutorials: UDFs: Active Directory (NEW 2024-07-28 - Version 1.6.3.0) - Download - General Help & Support - Example Scripts - Wiki ExcelChart (2017-07-21 - Version 0.4.0.1) - Download - General Help & Support - Example Scripts OutlookEX (2021-11-16 - Version 1.7.0.0) - Download - General Help & Support - Example Scripts - Wiki OutlookEX_GUI (2021-04-13 - Version 1.4.0.0) - Download Outlook Tools (2019-07-22 - Version 0.6.0.0) - Download - General Help & Support - Wiki PowerPoint (2021-08-31 - Version 1.5.0.0) - Download - General Help & Support - Example Scripts - Wiki Task Scheduler (2022-07-28 - Version 1.6.0.1) - Download - General Help & Support - Wiki Standard UDFs: Excel - Example Scripts - Wiki Word - Wiki Tutorials: ADO - Wiki WebDriver - Wiki
coderusa (Author), posted November 28, 2023 (edited)

23 minutes ago, water said:
> I had a similar issue when the text in a MsgBox containing german Umlauts (öäü etc.) was inccorrectly displayed. I solved the problem with setting the Encoding in SciTE to UTF-8 with BOM. In SciTE use: File > Encoding > UTF-8 with BOM

Thanks. Just tried that, and testers report no change 😕

Edited November 28, 2023 by coderusa
water, posted November 28, 2023

Your screenshot looks like output to the DOS console. I needed to use the following function to write the umlauts to the console:

    ; Taken from: https://www.autoitscript.com/forum/topic/201398-german-umlauts-and-console-programs/
    Func Ansi2Oem($text)
        $text = DllCall('user32.dll', 'Int', 'CharToOem', 'str', $text, 'str', '')
        Return $text[2]
    EndFunc ;==>Ansi2Oem
coderusa (Author), posted November 28, 2023

4 minutes ago, water said:
> Your screenshot looks like an output to the DOS console. I needed to use the following function to write the Umlauts to the Console:
>
>     ; Taken from: https://www.autoitscript.com/forum/topic/201398-german-umlauts-and-console-programs/
>     Func Ansi2Oem($text)
>         $text = DllCall('user32.dll', 'Int', 'CharToOem', 'str', $text, 'str', '')
>         Return $text[2]
>     EndFunc ;==>Ansi2Oem

I will try this. It's an IRC client, not a console: I am using mIRC myself, and the testers are using mIRC, AdiIRC, HexChat and, I think, irssi.
coderusa (Author), posted November 28, 2023

@water the Ansi2Oem makes it worse LOL
coderusa (Author), posted November 29, 2023

@TheDcoder have you ever come across anything like this with any of your IRC scripts?
TheDcoder, posted November 29, 2023

Indeed I have: I couldn't send emoji with my scripts. I suspect that TCPSend is interfering with the encoding. You will have to find a way to manually convert your message string to properly formatted UTF-8 binary and then send that directly with TCPSend.

By the way, don't bother with a BOM. It is fairly useless and even undesired, so it won't help you here, and it can corrupt your text if a program is not written to handle it. AutoIt's own IniRead function will fail if the file has a BOM.

EasyCodeIt - A cross-platform AutoIt implementation - Fund the development! (GitHub will double your donations for a limited time) DcodingTheWeb Forum - Follow for updates and Join for discussion
Nine, posted November 29, 2023

As per the help file for TCPSend, under Remarks:

"If Unicode strings need to be transmitted they must be encoded/decoded with StringToBinary()/BinaryToString()."

“They did not know it was impossible, so they did it” ― Mark Twain Block all input without UAC Save/Retrieve Images to/from Text Monitor Management (VCP commands) Tool to search in text (au3) files Date Range Picker Virtual Desktop Manager Sudoku Game 2020 Overlapped Named Pipe IPC HotString 2.0 - Hot keys with string x64 Bitwise Operations Multi-keyboards HotKeySet Recursive Array Display Fast and simple WCD IPC Multiple Folders Selector Printer Manager GIF Animation (cached) Screen Scraping Multi-Threading Made Easy
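Applied to the sending function from the first post, the help-file remark quoted by Nine might look like the sketch below. This is only an illustration under stated assumptions, not the poster's final code: the name IRC_Send_PrivMsg_UTF8 is hypothetical, Cons() is the poster's own logging helper from the original script, and the script file is assumed to be saved in an encoding AutoIt reads correctly so that the non-ASCII characters in the string literals survive.

```autoit
#include <StringConstants.au3> ; provides $SB_UTF8 (value 4)

; Hypothetical variant of IRC_Send_PrivMsg from the first post. The only
; change: the whole line is converted to UTF-8 bytes before it is sent.
Func IRC_Send_PrivMsg_UTF8($socket, $target, $msg)
    Local $sLine = "PRIVMSG " & $target & " :" & $msg & @CRLF
    ; StringToBinary() with $SB_UTF8 returns the string as UTF-8 binary data;
    ; TCPSend() accepts binary data as well as strings.
    TCPSend($socket, StringToBinary($sLine, $SB_UTF8))
    If @error Then ; @error here is the one set by TCPSend(), the last call made
        Cons("Server connection lost.") ; poster's own helper, not shown in the thread
        Return -1
    EndIf
    Return 1
EndFunc   ;==>IRC_Send_PrivMsg_UTF8
```

SpawnDuck would then call IRC_Send_PrivMsg_UTF8($socket, $channel, ...) with the same arguments as before. Encoding the complete line, rather than only $msg, keeps the conversion in one place; the command and target are plain ASCII, so their bytes are the same either way.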
TheDcoder, posted November 29, 2023

I totally forgot about those functions, very handy indeed! But why doesn't TCPSend handle that automatically? Is it just hardcoded to convert everything to ANSI?
coderusa (Author), posted December 1, 2023 (edited)

On 11/29/2023 at 5:33 AM, Nine said:
> As per help file of TCPSend : Remarks If Unicode strings need to be transmitted they must be encoded/decoded with StringToBinary()/BinaryToString().

As @TheDcoder said, I completely forgot about those functions! I'll implement this and see what comes of it. Thanks!

On 11/29/2023 at 7:32 AM, TheDcoder said:
> I totally forgot about those functions, very handy indeed! But why doesn't TCPSend automatically handle that though? Is it just hardcoded to convert everything to ANSI?

I wonder about this too. Python is similar and needs explicit UTF-8 encoding (at least for IRC data sent through sockets). I think (but could be wrong) that it's because some things need to be ANSI to work properly.

Edited December 1, 2023 by coderusa
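Whatever TCPSend does internally, the symptom is consistent with single-byte ANSI data reaching clients that expect UTF-8. The sketch below, offered as an illustration rather than a diagnosis, compares the two encodings for one character from the duck banner. The variable names are made up for the example, and the ANSI result depends on the system code page.

```autoit
#include <StringConstants.au3> ; provides $SB_ANSI (1) and $SB_UTF8 (4)

; ChrW() builds the character from its code point, so the result does not
; depend on how this script file itself was saved.
Local $sDot = ChrW(0x00B7) ; MIDDLE DOT, one of the characters in the banner

; Binary data concatenated with a string is shown in hex notation.
Local $bAnsi = StringToBinary($sDot, $SB_ANSI) ; one byte on a Windows-1252 system
Local $bUtf8 = StringToBinary($sDot, $SB_UTF8) ; U+00B7 in UTF-8 is the bytes C2 B7
ConsoleWrite("ANSI : " & $bAnsi & " (" & BinaryLen($bAnsi) & " byte(s))" & @CRLF)
ConsoleWrite("UTF-8: " & $bUtf8 & " (" & BinaryLen($bUtf8) & " byte(s))" & @CRLF)

; Decoding with the matching flag restores the original string.
ConsoleWrite("Round trip OK: " & (BinaryToString($bUtf8, $SB_UTF8) == $sDot) & @CRLF)
```

A lone byte such as B7 is not a valid UTF-8 sequence, so a client that decodes strictly as UTF-8 would be expected to show a replacement character, while a client that falls back to a single-byte code page would show the intended symbol. That would fit the mixed results the testers reported.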
rudi, posted December 4, 2023 (edited)

On 11/29/2023 at 12:29 AM, water said:
> Your screenshot looks like an output to the DOS console. I needed to use the following function to write the Umlauts to the Console:
>
>     ; Taken from: https://www.autoitscript.com/forum/topic/201398-german-umlauts-and-console-programs/
>     Func Ansi2Oem($text)
>         $text = DllCall('user32.dll', 'Int', 'CharToOem', 'str', $text, 'str', '')
>         Return $text[2]
>     EndFunc ;==>Ansi2Oem

@water, regarding encoding for CMD batch files: from time to time I have to use TXT files created by an AutoIt script. I think it's quite interesting to use this sample script to compare the results inside a CMD box, especially the different file sizes. The smallest file with correctly displayed characters is in fact the one for $FO_ANSI = 512, using your function:

    $text="Umlauts, e and szlig: äöüÄÖÜéèß"
    $hDisplay=FileOpen("C:\temp\display.cmd",2+8) ; 2 = overwrite, 8 = create directory if needed
    FileWriteLine($hDisplay,"@echo off")
    Dim $aEnc[9]=[0,32,64,128,256,512,1024,2048,16384] ; encoding, see help file for fileopen()
    #cs
    $FO_UNICODE or $FO_UTF16_LE (32) = Use Unicode UTF16 Little Endian reading and writing mode.
    $FO_UTF16_BE (64) = Use Unicode UTF16 Big Endian reading and writing mode.
    $FO_UTF8 (128) = Use Unicode UTF8 (with BOM) reading and writing mode.
    $FO_UTF8_NOBOM (256) = Use Unicode UTF8 (without BOM) reading and writing mode.
    $FO_ANSI (512) = Use ANSI reading and writing mode.
    $FO_UTF16_LE_NOBOM (1024) = Use Unicode UTF16 Little Endian (without BOM) reading and writing mode.
    $FO_UTF16_BE_NOBOM (2048) = Use Unicode UTF16 Big Endian (without BOM) reading and writing mode.
    $FO_FULLFILE_DETECT (16384) = When opening for reading and no BOM is present, use the entire file to determine if it is UTF8 or UTF16. If this is not used then only the initial part of the file (up to 64KB) is checked for performance reasons.
    #ce
    For $i In $aEnc
        ; One test file per encoding flag
        $Out="C:\temp\test-" & $i & ".txt"
        $hOut=FileOpen($Out,2+8+$i)
        FileWriteLine($hOut,"")
        FileWriteLine($hOut,"Encoding Number = " & $i)
        FileWriteLine($hOut,"Text directly : " & $text)
        FileWriteLine($hOut,"Text converted: " & Ansi2Oem($text))
        FileClose($hOut)
        ; Add commands to the batch file that show the file content and size
        FileWriteLine($hDisplay,"type " & $Out)
        FileWriteLine($hDisplay,"dir " & $Out)
        FileWriteLine($hDisplay,"echo -----------------")
    Next
    FileWriteLine($hDisplay,"@echo on")
    FileClose($hDisplay)
    ConsoleWrite("To display results and file sizes open a CMD box and enter this command:" & @CRLF)
    ConsoleWrite("C:\temp\display.cmd" & @CRLF) ; argument was missing in the post; the batch file path written above is assumed

    Func Ansi2Oem($text)
        $text = DllCall('user32.dll', 'Int', 'CharToOem', 'str', $text, 'str', '')
        Return $text[2]
    EndFunc ;==>Ansi2Oem

Edited December 4, 2023 by rudi

Earth is flat, pigs can fly, and Nuclear Power is SAFE!