
WebDriver + UTF8/Emojis help



Hi, I've been looking and looking, and maybe I'm just not that tech-savvy. Could anyone help, please?

I'm trying to send an input string to a textarea. It works great with normal strings, but if the string contains a browser emoji (🚕🛺🚙), the text is not sent or arrives empty, so the input field is left empty.

        _WD_ElementAction($sSession, $inputTextArea, 'clear')
        _WD_ElementAction($sSession, $inputTextArea, 'value', $sText)

Is there anything I can do to the text to preserve the emojis when I send it to WebDriver? I started looking into the WinAPIConv.au3 functions, but I can't connect them to my situation. Maybe I'm not familiar with the terminology...

thanks for your help!! 

Edited by sergey_slash

I'll take a look at this. In the meantime, you can try playing with the BinaryFormat option of _WD_Option, which is basically the value passed as the second parameter (the flag) to StringToBinary.
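To make the role of that flag more concrete, here is a minimal sketch that uses only core AutoIt functions to show how the StringToBinary flag changes the bytes produced for one emoji. The variable name $sTaxi is made up for illustration, and the _WD_Option call in the last comment is an assumption about how the option would be set, not something tested against the UDF here.

```autoit
#include <StringConstants.au3>

; U+1F695 (taxi) built from its UTF-16 surrogate pair, so the example
; does not depend on how the .au3 file itself is encoded.
Local $sTaxi = ChrW(0xD83D) & ChrW(0xDE95)

; The second parameter of StringToBinary selects the target encoding.
; Concatenating the binary result with a string prints it in hex form.
ConsoleWrite("ANSI:     " & StringToBinary($sTaxi, $SB_ANSI) & @CRLF)
ConsoleWrite("UTF-16LE: " & StringToBinary($sTaxi, $SB_UTF16LE) & @CRLF)
ConsoleWrite("UTF-8:    " & StringToBinary($sTaxi, $SB_UTF8) & @CRLF)

; Assumption: with the WebDriver UDF included, the same flag would be set with
; _WD_Option('BinaryFormat', $SB_UTF8)
```

An ANSI code page has no representation for an emoji, so comparing the three outputs is a quick way to check whether an encoding step is what loses the characters.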

Edit: I quickly tested this and it works fine in Firefox. In Chrome, I got the following error:

__WD_Post: URL=HTTP://127.0.0.1:9515/session/d0ff79cbc0f8de62387082902d571e86/element/a8493eeb-3218-4e0c-8610-a9f677a05ded/value; $sData={"id":"a8493eeb-3218-4e0c-8610-a9f677a05ded", "text":"??????"}
__WD_Post: StatusCode=500; ResponseText={"value":{"error":"unknown error","message":"unknown error: ChromeDriver only supports characters in...
__WD_Post ==> Timeout: {"value":{"error":"unknown error","message":"unknown error: ChromeDriver only supports characters in the BMP\n  (Session info: chrome=81.0.4044.129)","stacktrace":"Backtrace:\n\tOrdinal0 [0x00A05F73+2449267]\n\tOrdinal0 [0x00938361+1606497]\n\tOrdinal0 [0x0082F969+522601]\n\tOrdinal0 [0x007DFF51+196433]\n\tOrdinal0 [0x007C63E8+91112]\n\tOrdinal0 [0x007C5BCE+89038]\n\tOrdinal0 [0x007D9FAD+171949]\n\tOrdinal0 [0x007C39A6+80294]\n\tOrdinal0 [0x007DA1F1+172529]\n\tOrdinal0 [0x007E326C+209516]\n\tOrdinal0 [0x007D9E5B+171611]\n\tOrdinal0 [0x007C1DD8+73176]\n\tOrdinal0 [0x007C2E50+77392]\n\tOrdinal0 [0x007C2DE9+77289]\n\tOrdinal0 [0x0094D8D7+1693911]\n\tGetHandleVerifier [0x00AA4036+522726]\n\tGetHandleVerifier [0x00AA3D74+522020]\n\tGetHandleVerifier [0x00AB9187+609079]\n\tGetHandleVerifier [0x00AA48A6+524886]\n\tOrdinal0 [0x00945CBC+1662140]\n\tOrdinal0 [0x0094F23B+1700411]\n\tOrdinal0 [0x0094F3A3+1700771]\n\tOrdinal0 [0x00965215+1790485]\n\tBaseThreadInitThunk [0x76346359+25]\n\tRtlGetAppContainerNamedObjectPath [0x77757C14+228]\n\tRtlGetAppContainerNamedObjectPath [0x77757BE4+180]\n"}}
_WD_ElementAction: {"value":{"error":"unknown error","message":"unknown error: ChromeDriver only supports characters in...
_WD_ElementAction ==> Timeout: {"value":{"error":"unknown error","message":"unknown error: ChromeDriver only supports characters in the BMP\n  (Session info: chrome=81.0.4044.129)","stacktrace":"Backtrace:\n\tOrdinal0 [0x00A05F73+2449267]\n\tOrdinal0 [0x00938361+1606497]\n\tOrdinal0 [0x0082F969+522601]\n\tOrdinal0 [0x007DFF51+196433]\n\tOrdinal0 [0x007C63E8+91112]\n\tOrdinal0 [0x007C5BCE+89038]\n\tOrdinal0 [0x007D9FAD+171949]\n\tOrdinal0 [0x007C39A6+80294]\n\tOrdinal0 [0x007DA1F1+172529]\n\tOrdinal0 [0x007E326C+209516]\n\tOrdinal0 [0x007D9E5B+171611]\n\tOrdinal0 [0x007C1DD8+73176]\n\tOrdinal0 [0x007C2E50+77392]\n\tOrdinal0 [0x007C2DE9+77289]\n\tOrdinal0 [0x0094D8D7+1693911]\n\tGetHandleVerifier [0x00AA4036+522726]\n\tGetHandleVerifier [0x00AA3D74+522020]\n\tGetHandleVerifier [0x00AB9187+609079]\n\tGetHandleVerifier [0x00AA48A6+524886]\n\tOrdinal0 [0x00945CBC+1662140]\n\tOrdinal0 [0x0094F23B+1700411]\n\tOrdinal0 [0x0094F3A3+1700771]\n\tOrdinal0 [0x00965215+1790485]\n\tBaseThreadInitThunk [0x76346359+25]\n\tRtlGetAppContainerNamedObjectPath [0x77757C14+228]\n\tRtlGetAppContainerNamedObjectPath [0x77757BE4+180]\n"}}

 

Edited by Danp2

Thanks for this solution, I am using Chrome. So I just need to get Firefox's WebDriver and change the script to use it, and it should just work? Strange that Chrome doesn't support that...

I guess I could have debugged this myself. I just assumed this was due to something in AutoIt, while the real cause is ChromeDriver, and looking at the logs would have helped. Thanks for looking into it, @Danp2!

I already translated 8 of my sites into 24 languages each. This usually took me 3-4 hours per site; now the process takes 40 minutes plus 5 minutes to check that everything is translated. Sometimes I have to fix 2-3 translations that didn't arrive in time, and all is good. Before, I would fall asleep 2-3 times while doing all this copy-pasting because it's sooooooo boring, so one job would go on for 8-12 hours instead of 3. Now I just watch it happen... on 2 computers simultaneously. 😃 Thanks, friends, for making AutoIt! 😃 And for all the help...

Edited by sergey_slash

1 hour ago, sergey_slash said:

As far as i've read, when AutoIt stores some UTF-8 value, it is converted to UTF-16, or something, right?

You are not wrong. A more accurate description would be that AutoIt reads the text from the script (.au3 file) in the file's encoding (usually UTF-8) and then stores it internally as UTF-16.

1 hour ago, sergey_slash said:

Then when it's sent to webdriver, it might not be UTF-8 anymore... That is probably relevant to your suspicion...

Right, the data would be intact until the WebDriver receives the string. This is where the problem occurs if the proper encoding is not used. The issue could be in the WebDriver and/or the UDF.

---

@Danp2 Upon further investigation of your output, it looks like it is a bug in ChromeDriver: https://stackoverflow.com/a/59139690

Quote

ChromeDriver only supports characters in the BMP is a known issue with Chromium team as ChromeDriver still doesn't support characters with a Unicode after FFFF. Hence it is impossible to send any character beyond FFFF via ChromeDriver. As a result any attempt to send SMP characters (e.g. CJK, Emojis, Symbols, etc) raises the error.
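Since AutoIt stores strings as UTF-16, any character above U+FFFF is held as a surrogate pair (two code units in the range 0xD800-0xDFFF). A minimal sketch of a pre-flight check, assuming you want to know whether a string would trip this ChromeDriver limitation before sending it; the function name _HasNonBmp is hypothetical and not part of any UDF:

```autoit
; Returns True if $sText contains a UTF-16 surrogate code unit,
; i.e. at least one character outside the BMP (above U+FFFF).
; Hypothetical helper for illustration only.
Func _HasNonBmp($sText)
    Local $iCode
    For $i = 1 To StringLen($sText)
        $iCode = AscW(StringMid($sText, $i, 1))
        ; Surrogate code units occupy 0xD800-0xDFFF.
        If $iCode >= 0xD800 And $iCode <= 0xDFFF Then Return True
    Next
    Return False
EndFunc

; U+1F695 (taxi) built from its surrogate pair, so the example
; does not depend on the encoding of the script file.
Local $sTaxi = ChrW(0xD83D) & ChrW(0xDE95)

ConsoleWrite(_HasNonBmp("plain text") & @CRLF)
ConsoleWrite(_HasNonBmp("ride " & $sTaxi) & @CRLF)
```

A script could call such a check before _WD_ElementAction and either skip the text or route it through a workaround when it returns True.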

 

EasyCodeIt - A cross-platform AutoIt implementation - Fund the development! (GitHub will double your donations for a limited time)

DcodingTheWeb Forum - Follow for updates and Join for discussion


1 minute ago, sergey_slash said:

So I just need to get Firefox's WebDriver and change the script to use it, and it should just work?

Yes, you can use Firefox and GeckoDriver (Firefox's webdriver) to send emoji input to the target website.


After trying this and that, I see that for some reason, without any modifications to my script, Firefox + GeckoDriver cannot keep up with it the way Chrome does. I don't know why, but after doing 2-3 rounds of 24 proper translations from Google and DeepL, it starts to hang while WaitElement is waiting for the menu to open and the button to show up. It eventually times out and basically starts skipping actions (translations)...

I am also, for some reason, unable to extract emojis from text in my script. Is it probably because of the internal type conversion? I am not good with regular expressions, so I can't figure out how to detect emojis. I thought I could extract them, send the text to be translated, then add them back in the script, but whatever regular expressions I try for detecting emojis, none of them do anything...

In any case, since I am working with my own PHP sources to translate my own websites, I can do what I want right at the source. So my workaround is basically either getting rid of emojis or replacing them with predefined constants, so that only variable names are sent (inside the text) for translation and then come back untranslated. Stupid Google changes $NAME into $ NAME, but it's OK, I fix those little things...
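For anyone who would rather do the same replacement on the AutoIt side, here is a rough sketch of the idea without regular expressions. It walks the UTF-16 code units, swaps each surrogate pair for a numbered token, and restores the originals afterwards. The function names and the EMOJI<n>X token format are made up for illustration, and a translation service may still alter such tokens, so treat this as a starting point only.

```autoit
; Replaces every surrogate pair (a character above U+FFFF) with a numbered
; token and stores the originals in $aSaved ($aSaved[0] holds the count).
; Hypothetical helper for illustration only.
Func _MaskNonBmp($sText, ByRef $aSaved)
    Local $aNew[1] = [0]
    $aSaved = $aNew
    Local $sOut = "", $iHi, $iLo, $i = 1
    Local $iLen = StringLen($sText)
    While $i <= $iLen
        $iHi = AscW(StringMid($sText, $i, 1))
        $iLo = ($i < $iLen) ? AscW(StringMid($sText, $i + 1, 1)) : 0
        If $iHi >= 0xD800 And $iHi <= 0xDBFF And $iLo >= 0xDC00 And $iLo <= 0xDFFF Then
            ; High surrogate followed by low surrogate: save both code units.
            $aSaved[0] += 1
            ReDim $aSaved[$aSaved[0] + 1]
            $aSaved[$aSaved[0]] = StringMid($sText, $i, 2)
            $sOut &= "EMOJI" & $aSaved[0] & "X"
            $i += 2
        Else
            $sOut &= StringMid($sText, $i, 1)
            $i += 1
        EndIf
    WEnd
    Return $sOut
EndFunc

; Puts the saved characters back in place of their tokens (case-sensitive match).
Func _UnmaskNonBmp($sText, ByRef $aSaved)
    For $i = 1 To $aSaved[0]
        $sText = StringReplace($sText, "EMOJI" & $i & "X", $aSaved[$i], 0, 1)
    Next
    Return $sText
EndFunc

Local $aSaved
Local $sOriginal = "Taxi " & ChrW(0xD83D) & ChrW(0xDE95) & " here"
Local $sMasked = _MaskNonBmp($sOriginal, $aSaved)
ConsoleWrite($sMasked & @CRLF)
ConsoleWrite((_UnmaskNonBmp($sMasked, $aSaved) == $sOriginal) & @CRLF)
```

The masked string contains only BMP characters, so it can be sent through ChromeDriver, and the translated result can be passed to the unmask function before it is written back.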


11 hours ago, sergey_slash said:

I am also, for some reason, unable to extract emojis from text in my script. Is it probably because of the internal type conversion?

It depends on the contents of the HTML element you are extracting. innerText will usually return all of the text as it appears, so if the element does contain emoji, it will return them. You can check the HTML to see whether the site is doing something like replacing emoji with its own images. Usually there is another element with the original text, and you can target that to get the full unaltered text.

If unsure, you can post the HTML of the page from which you want to extract the text and we can take a look. Or just post the URL if the website is public and you don't mind sharing.
