
Convert UTF-8 in a windows valid path - [SOLVED]



I have some UTF-8 encoded paths stored in an SQL database.
I need to convert them to valid Windows paths. Some directory and file names contain Japanese or Arabic characters and/or symbols.

How can I do this? I've tried a lot of solutions without success.

For example, I need to convert this (UTF-8): ââââ (the characters in my DB field are different)

into this: ★★★

In PHP, utf8_encode() and utf8_decode() do the job, but what about AutoIt?

Thanks in advance
 


  • Developers

Moved to the appropriate forum, as the Developer General Discussion forum very clearly states:

Quote

General development and scripting discussions. If it's super geeky and you don't know where to put it - it's probably here.


Do not create AutoIt-related topics here, use the AutoIt General Help and Support or AutoIt Technical Discussion forums.

Moderation Team



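; Convert text encoded in $iCodepage (default 65001 = UTF-8), held one byte per character
; in $sCP, to a native AutoIt (UTF-16) string.
Func _CodepageToString($sCP, $iCodepage = Default)
    If $iCodepage = Default Then $iCodepage = 65001
    If StringLen($sCP) = 0 Then Return "" ; a zero-length struct cannot be created
    Local $tText = DllStructCreate("byte[" & StringLen($sCP) & "]")
    DllStructSetData($tText, 1, $sCP)
    ; First call: no output buffer, so the return value is the required length in wide characters
    Local $aResult = DllCall("kernel32.dll", "int", "MultiByteToWideChar", "uint", $iCodepage, "dword", 0, "struct*", $tText, "int", StringLen($sCP), _
            "ptr", 0, "int", 0)
    Local $tWstr = DllStructCreate("wchar[" & $aResult[0] & "]")
    ; Second call: perform the conversion into the buffer
    $aResult = DllCall("kernel32.dll", "int", "MultiByteToWideChar", "uint", $iCodepage, "dword", 0, "struct*", $tText, "int", StringLen($sCP), _
            "struct*", $tWstr, "int", $aResult[0])
    Return DllStructGetData($tWstr, 1)
EndFunc   ;==>_CodepageToString

; Convert a native AutoIt string to text encoded in $iCodepage (default 65001 = UTF-8),
; returned as a string holding one byte per character.
Func _StringToCodepage($sStr, $iCodepage = Default)
    If $iCodepage = Default Then $iCodepage = 65001
    If StringLen($sStr) = 0 Then Return "" ; a zero-length struct cannot be created
    ; First call: no output buffer, so the return value is the required length in bytes
    Local $aResult = DllCall("kernel32.dll", "int", "WideCharToMultiByte", "uint", $iCodepage, "dword", 0, "wstr", $sStr, "int", StringLen($sStr), _
            "ptr", 0, "int", 0, "ptr", 0, "ptr", 0)
    Local $tCP = DllStructCreate("char[" & $aResult[0] & "]")
    ; Second call: perform the conversion into the buffer
    $aResult = DllCall("kernel32.dll", "int", "WideCharToMultiByte", "uint", $iCodepage, "dword", 0, "wstr", $sStr, "int", StringLen($sStr), _
            "struct*", $tCP, "int", $aResult[0], "ptr", 0, "ptr", 0)
    Return DllStructGetData($tCP, 1)
EndFunc   ;==>_StringToCodepage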

Here you are.
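
For the UTF-8 case specifically, a similar conversion can probably be done with AutoIt's built-in StringToBinary() and BinaryToString() (flag 1 = ANSI, flag 4 = UTF-8) instead of DllCall. This is only a sketch of an alternative, not the method posted above. The helper names _Utf8BytesToString and _StringToUtf8Bytes are made up for illustration, and it assumes the input holds one UTF-8 byte per character and that the system ANSI code page is a single-byte one such as Windows-1252.

```autoit
; Sketch only: both helper names are hypothetical, not part of any UDF.
; Flag 1 = ANSI, flag 4 = UTF-8 (built-in StringToBinary / BinaryToString flags).

; Input: a string holding one UTF-8 byte per character (assumption about what the DB UDF returns).
; Output: a native AutoIt (UTF-16) string.
Func _Utf8BytesToString($sBytes)
    Return BinaryToString(StringToBinary($sBytes, 1), 4)
EndFunc   ;==>_Utf8BytesToString

; Reverse direction: native string -> one character per UTF-8 byte.
Func _StringToUtf8Bytes($sText)
    Return BinaryToString(StringToBinary($sText, 4), 1)
EndFunc   ;==>_StringToUtf8Bytes

; U+2605 BLACK STAR, built with ChrW so the script file's own encoding does not matter.
Local $sStar = ChrW(0x2605)
Local $sRoundTrip = _Utf8BytesToString(_StringToUtf8Bytes($sStar))

; In UTF-8, U+2605 encodes as the three bytes E2 98 85.
ConsoleWrite("UTF-8 bytes: " & String(StringToBinary($sStar, 4)) & @CRLF)
ConsoleWrite("Round trip OK: " & ($sRoundTrip == $sStar) & @CRLF)
```

Both ConsoleWrite lines print ASCII only, so the result can be read even if the console is not set up for Unicode.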

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)


Hi Jchd, thank you for your quick reply.
Unfortunately, even this function doesn't return the correct value.

ConsoleWrite (_CodepageToString(_StringToCodepage("★")))

The result is:  ?

I spent a couple of days reading your posts before asking here, and trying out functions like MultiByteToWideChar() and others.
A workaround would be to pass the already-converted string from PHP to AutoIt, because PHP's utf8_decode() function does the job correctly.
But I would like to do it in AutoIt, so I can read the DB directly from AutoIt and do the same loop processing there.
Any ideas?


The SciTE console has to be switched to UTF-8 in order to display UTF-8 properly. Alternatively, use MsgBox to display Unicode easily.
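
To tell a display problem apart from a real conversion problem, one option is to print the code points of the string instead of the characters themselves. The sketch below does that. _DumpCodePoints is a hypothetical helper name, not something from a UDF. Its output is plain ASCII, so it reads the same whatever the console encoding is.

```autoit
; Sketch only: _DumpCodePoints is a hypothetical helper.
; Returns the UTF-16 code units of $sText as "U+XXXX U+XXXX ...".
Func _DumpCodePoints($sText)
    Local $sOut = ""
    For $i = 1 To StringLen($sText)
        $sOut &= "U+" & Hex(AscW(StringMid($sText, $i, 1)), 4) & " "
    Next
    Return StringStripWS($sOut, 2) ; flag 2 = strip trailing whitespace
EndFunc   ;==>_DumpCodePoints

Local $sStar = ChrW(0x2605) ; BLACK STAR
ConsoleWrite(_DumpCodePoints($sStar) & @CRLF) ; prints U+2605
```

If the dump shows U+2605 but the console shows "?", the string is fine and only the display is at fault. A dump of U+003F means the character was already lost during conversion.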

Which DB engine are you using that doesn't offer UTF-16 output?


I have already switched SciTE to UTF-8.

I am using MySQL 5.6, which supports UTF-16, but everything I've read says it's better to use the utf8mb4 character set to support all languages.
So I changed the collation of the text fields to utf8mb4_general_ci.
Since I activated utf8_general_ci, I no longer get any errors when inserting file names.


It all depends on how you retrieve the UTF-8 data, that is, which UDF and functions you use to transfer data to/from the engine. Remember that unless precautions are taken, AutoIt strings are UCS-2 (a subset of UTF-16LE limited to the Unicode BMP).

MySQL's utf8 is a buggy and terribly bad implementation of Unicode UTF-8 (it stores at most 3 bytes per character); real UTF-8 is what MySQL now calls utf8mb4 (why make it simple?).

AutoIt doesn't store strings as UTF-8 (AutoIt strings are UCS-2), but it's possible to retrieve UTF-8 data and convert it to UCS-2. UTF-16LE code points above the BMP (that is, > 0xFFFF) can be stored but aren't dealt with perfectly, since a UCS-2 code point is a single 16-bit code unit.
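
A small sketch of what "stored but not dealt with perfectly" means in practice: a code point above the BMP takes two 16-bit units (a surrogate pair), and StringLen counts units, not characters. _CodePointToString is a hypothetical helper written for this illustration.

```autoit
; Sketch only: _CodePointToString is a hypothetical helper.
; Builds an AutoIt string for any Unicode code point, using a surrogate pair above 0xFFFF.
Func _CodePointToString($iCP)
    If $iCP < 0x10000 Then Return ChrW($iCP)
    Local $iV = $iCP - 0x10000
    ; High surrogate carries the top 10 bits, low surrogate the bottom 10 bits.
    Return ChrW(0xD800 + BitShift($iV, 10)) & ChrW(0xDC00 + BitAND($iV, 0x3FF))
EndFunc   ;==>_CodePointToString

Local $sStar = _CodePointToString(0x2605)   ; inside the BMP: one 16-bit unit
Local $sEmoji = _CodePointToString(0x1F600) ; outside the BMP: units D83D and DE00

ConsoleWrite(StringLen($sStar) & @CRLF)  ; prints 1
ConsoleWrite(StringLen($sEmoji) & @CRLF) ; prints 2, although it is a single character
```

Functions that cut strings by position, such as StringLeft or StringMid, can therefore split such a character in half.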

I've no experience with MySQL, only SQLite and some other RDBMS, but the encoding used to transfer data to/from the engine can be distinct from the encoding used by the storage engine.

Note that there is a very good and stable ADO UDF available by @mLipok.


  • 2 weeks later...

Finally solved!! 😀
Your function works great.

I just had to set the connection character set to UTF-8 before running the query.
It took me a week to figure that out.

So:

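; Make the connection exchange data as UTF-8 (utf8mb4) before querying
$mysql = _MySQL_Real_Query($MysqlConn, "SET NAMES 'utf8mb4'")
$mysql_bool = _MySQL_Real_Query($MysqlConn, $query)
$res = _MySQL_Store_Result($MysqlConn)
$row = _MySQL_Fetch_Row_StringArray($res)

; Convert the UTF-8 bytes of the third column into a native AutoIt string
$path = _CodepageToString($row[2])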
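
Before using the converted path, a quick sanity check can catch a lossy conversion like the "?" result seen earlier in this thread. This is a rough sketch under assumptions: _PathLooksDecoded is a hypothetical helper, and it treats "?" (not allowed in Windows file names) and U+FFFD (the Unicode replacement character) as signs that something went wrong.

```autoit
; Sketch only: _PathLooksDecoded is a hypothetical helper.
; Returns False if the path contains "?" or U+FFFD, which usually indicates a lossy conversion.
; Caveat: paths using the "\\?\" long-path prefix contain a legitimate "?" and would be flagged.
Func _PathLooksDecoded($sPath)
    Return Not (StringInStr($sPath, ChrW(0xFFFD)) Or StringInStr($sPath, "?"))
EndFunc   ;==>_PathLooksDecoded

Local $sGood = "C:\music\" & ChrW(0x2605) & ".mp3"
Local $sBad = "C:\music\?.mp3"

ConsoleWrite(_PathLooksDecoded($sGood) & @CRLF) ; prints True
ConsoleWrite(_PathLooksDecoded($sBad) & @CRLF)  ; prints False
```

A path that passes this check can then be tested with FileExists() as usual.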

Thanks again Jchd !


  • bi5bo changed the title to Convert UTF-8 in a windows valid path - [SOLVED]
