_StringProper and Apostrophes

I have been using the function to capitalise lots of strings and have run into an issue with words that contain an apostrophe, whether it marks possession or a contraction. I only noticed recently, after browsing said files, that the letter following the ' is uppercased, e.g. Autoit'S.

I don't know if there is a reason for this, maybe support for different languages, or whether it is just an oversight, but the function can be altered to stop it. I don't know which test would be more encompassing, Asc($sChr) <> 39 or $sChr <> "'", but I went with Asc.

Func _StringProperNew($sString)
    Local $bCapNext = True, $sChr = "", $sReturn = ""
    Local $sPattern = '[a-zA-ZÀ-ÿšœžŸ]' ; Pattern for letter characters

    For $i = 1 To StringLen($sString)
        $sChr = StringMid($sString, $i, 1)
        Select
            Case $bCapNext = True
                If StringRegExp($sChr, $sPattern) Then
                    $sChr = StringUpper($sChr)
                    $bCapNext = False
                EndIf
            Case Not StringRegExp($sChr, $sPattern)
                ; If $sChr <> "'" Then
                ; Any non-letter except the apostrophe starts a new word
                If Asc($sChr) <> 39 Then $bCapNext = True ; U+0027 is 39 in ASCII
            Case Else
                $sChr = StringLower($sChr)
        EndSelect
        $sReturn &= $sChr
    Next

    Return $sReturn
EndFunc   ;==>_StringProperNew

Does anyone know whether this is by design, or have I missed something?

12 minutes ago, benners said:

I have been using the function to capitalise lots of strings and have run into an issue with words that contain an apostrophe, whether it marks possession or a contraction. I only noticed recently, after browsing said files, that the letter following the ' is uppercased, e.g. Autoit'S.

Excel's PROPER function behaves the same way. I have never used this function, but it's good to know about this behavior.

Proper is a very (VERY) complex function if you insist on making it solid. In fact, it is almost impossible to code correctly. This page only scratches the surface of the issues (read to the end)!


Nevertheless one can achieve a mostly satisfying result efficiently in many use cases this way:

Local $a = [ _
    "Greengrocers’ apostrophes", _
    "St James’s Park", _
    "ladies’ hats", _
    "Le Cléac'h", _
    "O'Malley", _
    "80's music", _
    "Ho, mia kor’! Post longa laborado", _
    "John's shoes", _
    "doesn't mean anything ƨƳƭƫƼ'Ɖƃ sorry it's my fault", _
    "Διεθνής εβδομάδα χειμερ’ινών αγώνων", _
    "Тиждень зимо՚вих видів спорту (насправді 11 днів)", _
    " ഇതിൽ അഞ്ചു വളയങ്ങൾ'ആലേഖനം ", _
    "μζΣΨϑ'ʤʞʫʀ" _
]

For $i = 0 To UBound($a) - 1
    $a[$i] = _String_Proper($a[$i])
Next

Func _String_Proper($s)
    Local Static $sPattern = "(*UCP)\b((?<!['՚‘’“”ʼʾ׳״])[[:lower:]])"
    Return Execute('"' & StringRegExpReplace(StringLower($s), $sPattern, '" & StringUpper("$1") & "') & '"')
EndFunc   ;==>_String_Proper

Depending on the script (language) of the input text, you may have to adapt the set of apostrophe-like Unicode signs. Yet never expect a perfect universal result from such a simple approach. Even in "simple" English, O'Malley gives O'malley, so even there things are not that simple.
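To make the point about adapting the apostrophe set concrete, here is a minimal sketch that takes the set as a parameter. The name _String_ProperEx and the $sApostrophes parameter are hypothetical and not part of any UDF. It assumes the caller passes only characters that are safe inside a regex character class, and that the script is saved in an encoding that preserves the non-ASCII literals. It also doubles any " in the input before Execute(), since an unescaped double quote would otherwise break the generated expression.

```autoit
; Hypothetical variant of _String_Proper: the apostrophe-like signs are a parameter.
; Assumption: $sApostrophes contains no ], \, ^ or - (all special inside [...]).
Func _String_ProperEx($s, $sApostrophes = "'’")
    ; Same idea as above: a lowercase letter at a word boundary, unless it follows an apostrophe.
    Local $sPattern = "(*UCP)\b((?<![" & $sApostrophes & "])[[:lower:]])"
    ; Double every " so the text remains a valid string literal for Execute().
    Local $sSafe = StringReplace(StringLower($s), '"', '""')
    Return Execute('"' & StringRegExpReplace($sSafe, $sPattern, '" & StringUpper("$1") & "') & '"')
EndFunc   ;==>_String_ProperEx

; Default set, then a call that only treats the typographic apostrophe specially.
ConsoleWrite(_String_ProperEx("john's shoes") & @CRLF)
ConsoleWrite(_String_ProperEx("ho, mia kor’! post longa laborado", "’") & @CRLF)
```

Keeping the set in one argument lets each caller pick the signs that matter for its script without editing the pattern itself, but it does nothing for names such as O'Malley.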

#include <Array.au3>
#include <String.au3>

Local $a = [ _
    "Greengrocers’ apostrophes", _
    "St James’S Park", _
    "ladies’ hats", _
    "Le Cléac'H", _
    "O'Malley", _
    "80's music", _
    "Ho, mia kor’! Post longa laborado", _
    "John's shoes", _
    "doesn'T mean anything ƨƳƭƫƼ'Ɖƃ sorry it's my fault", _
    "Διεθνής εβδομάδα χειμερ’ινών αγώνων", _
    "Тиждень зимо՚вих видів спорту (насправді 11 днів)", _
    " ഇതിൽ അഞ്ചു വളയങ്ങൾ'ആലേഖനം ", _
    "μζΣΨϑ'ʤʞʫʀ" _
]

Local $hTimer = TimerInit()
For $i = 0 To UBound($a) - 1
    $a[$i] = _StringProper_jpm($a[$i])
Next
Local $iDiff = TimerDiff($hTimer)
ConsoleWrite('@@ Debug(' & @ScriptLineNumber & ') : $iDiff = ' & $iDiff & @CRLF & '>Error code: ' & @error & '    Extended code: ' & @extended & ' (0x' & Hex(@extended) & ')' & @CRLF) ;### Debug Console


Func _StringProper_jchd($s)
    Local Static $sPattern = "(*UCP)\b((?<!['՚‘’“”ʼʾ׳״])[[:lower:]])"
    Return Execute('"' & StringRegExpReplace(StringLower($s), $sPattern, '" & StringUpper("$1") & "') & '"')
EndFunc   ;==>_StringProper_jchd

Func _StringProper_jpm($sString)
    Local $bCapNext = True, $sChr = "", $sReturn = ""
    Local Static $sPattern = '[a-zA-ZÀ-ÿšœžŸ]'
    Local $iStrLen = StringLen($sString)
    For $i = 1 To $iStrLen
        $sChr = StringMid($sString, $i, 1)
        Select
            Case $bCapNext = True
                If StringRegExp($sChr, $sPattern) Then
                    ; Lowercase a letter that ends the string or is followed by a space, e.g. the s in John's
                    If $i <> $iStrLen And StringMid($sString, $i + 1, 1) <> " " Then
                        $sChr = StringUpper($sChr)
                    Else
                        $sChr = StringLower($sChr)
                    EndIf
                    $bCapNext = False
                EndIf
            Case Not StringRegExp($sChr, $sPattern)
                $bCapNext = True
            Case Else
                $sChr = StringLower($sChr)
        EndSelect
        $sReturn &= $sChr
    Next
    Return $sReturn
EndFunc   ;==>_StringProper_jpm

Unless you have a regexp for O'Malley, what do you think about a slight mod of the current implementation of _StringProper()?

Well, depending on the use context it kind of works, but it is still very far from universal.

Local $a = [ _
    "Qu'est-ce qu'il fait chaud !", _
    "Le skipper c'est armel le cléac'h.", _
    "LE CléaC'H", _
    "O'Malley", _
    "autoit.exe", _
    "Διεθνής εβδομάδα χειμερ’Δινών αγώνων", _
    "Тиждень зимо՚Твих видів спорту (насправді 11 днів)", _
    "μζΣΨϑ'Σʤʞʫʀ" _
]

Names like O'Something can be dealt with using an alternation, I guess. Not too much free time right now.
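For what it's worth, a rough sketch of that alternation idea could look like the following. The function name _String_ProperOName is hypothetical, and only ' and ’ are treated as apostrophes here, which is a simplification of the larger set used earlier in the thread. The second branch capitalises a letter that directly follows a standalone "o" plus an apostrophe, so it will also turn o'clock into O'Clock; treat it as a starting point rather than a fix.

```autoit
; Hypothetical sketch: the regex approach plus an alternation for O'Something names.
; Assumption: only ' and ’ count as apostrophes; extend both classes as needed.
Func _String_ProperOName($s)
    ; Branch 1: a lowercase letter at a word boundary that does not follow an apostrophe.
    ; Branch 2: a lowercase letter right after a standalone "o" and an apostrophe.
    Local Static $sPattern = "(*UCP)(?:\b(?<!['’])|(?<=\bo['’]))([[:lower:]])"
    ; Double every " so the text remains a valid string literal for Execute().
    Local $sSafe = StringReplace(StringLower($s), '"', '""')
    Return Execute('"' & StringRegExpReplace($sSafe, $sPattern, '" & StringUpper("$1") & "') & '"')
EndFunc   ;==>_String_ProperOName

; The intent: the M in o'malley is capitalised, the s in john's is left lowercase.
ConsoleWrite(_String_ProperOName("o'malley and john's shoes") & @CRLF)
```

The lookbehind in the second branch has a fixed length of two characters, which PCRE requires, and the \b inside it keeps words that merely end in "o" from triggering it.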

