Here is a script that removes diacritic marks. It appears to work, but I'm not entirely sure what is happening. Writing it involved a fair amount of trial and error, so I would like someone with more experience to look at it. The technique feels clumsy to me. Diacritics should be removed from both extended ANSI and Unicode characters. I couldn't find a way to do this without calling _WinAPI_MultiByteToWideChar twice.
#include <WinAPI.au3>
#region - Example
Local $sTestString = ""
;For $i = 192 To 255
;    $sTestString &= Chr($i) ; extended latin from Windows 1252 code page
;Next
For $i = 256 To 382
    $sTestString &= ChrW($i) ; Extended latin alpha characters
Next
Local $newString = _StripDiacriticMarks($sTestString)
MsgBox(0, "", $sTestString & @LF & @LF & $newString)
#endregion
Func _StripDiacriticMarks($sText)
    If Not IsString($sText) Then Return SetError(1, 0, $sText)
    Local $sCurrChar, $sSplitChar, $sElement, $sNewString = ""
    For $i = 1 To StringLen($sText)
        $sCurrChar = StringMid($sText, $i, 1)
        ; First call: returns a struct. Code page 3 is CP_THREAD_ACP; $MB_COMPOSITE asks for decomposed output
        $sSplitChar = _WinAPI_MultiByteToWideChar($sCurrChar, 3, $MB_COMPOSITE)
        $sElement = DllStructGetData($sSplitChar, 1)
        If StringIsAlpha($sElement) Then
            ; The result is purely alphabetic, so use it as is
            $sCurrChar = $sElement
        ElseIf DllStructGetSize($sSplitChar) > 4 Then
            ; More than one wide character plus the terminating null: call again, this time returning a string
            $sSplitChar = _WinAPI_MultiByteToWideChar($sCurrChar, 3, $MB_COMPOSITE, True)
            ; Keep the first alphabetic character and drop the rest
            For $j = 1 To StringLen($sSplitChar)
                $sElement = StringMid($sSplitChar, $j, 1)
                If StringIsAlpha($sElement) Then
                    $sCurrChar = $sElement
                    ExitLoop
                EndIf
            Next
        EndIf
        $sNewString &= $sCurrChar
    Next
    Return $sNewString
EndFunc
I'm sure I don't need all this code.
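One simplification I have been considering, but have not tested, is to make a single call per character with the fourth parameter set to True and then scan the returned string for the first alphabetic character. This is only a sketch: it assumes that the string returned this way holds the same text that DllStructGetData reads from the struct in the version above, and the name _StripDiacriticMarksSketch is just a placeholder. It would also behave differently if a decomposition ever contained more than one alphabetic character, since only the first one is kept.

```autoit
#include <WinAPI.au3>

; Hypothetical single-call variant of _StripDiacriticMarks (untested sketch)
Func _StripDiacriticMarksSketch($sText)
    If Not IsString($sText) Then Return SetError(1, 0, $sText)
    Local $sCurrChar, $sSplitChar, $sElement, $sNewString = ""
    For $i = 1 To StringLen($sText)
        $sCurrChar = StringMid($sText, $i, 1)
        ; Assumption: with the last parameter True, the converted text comes back as a string
        $sSplitChar = _WinAPI_MultiByteToWideChar($sCurrChar, 3, $MB_COMPOSITE, True)
        ; Keep the first alphabetic character; if none is found, the original character is kept
        For $j = 1 To StringLen($sSplitChar)
            $sElement = StringMid($sSplitChar, $j, 1)
            If StringIsAlpha($sElement) Then
                $sCurrChar = $sElement
                ExitLoop
            EndIf
        Next
        $sNewString &= $sCurrChar
    Next
    Return $sNewString
EndFunc
```

If that assumption holds, calling _StripDiacriticMarksSketch($sTestString) in the example region should give the same output as the original function for the test range.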