Reading .DocX Files


Szhlopp

Thanks to various UDFs around the forums, I was able to create one that reads those crazy .DocX files that Microsoft Word now uses >_<

I've gone through and opened as many 'docX' files as I had on my computer. All of them work now with this.

Below is the example AND the functions. I haven't taken the time to format it as a UDF. I would like people to make sure it works before I spend the time making this into one function. So give it a try and tell me how it works :)

ClipPut(_ReadDocXContent(@ScriptDir & '\DocXtest.docx'))


Func _ReadDocXContent($ReadLocation)
$Name = @ScriptDir & "\TempDoc.zip"
$UnZipName = @ScriptDir & '\DocXdoc'
FileCopy($ReadLocation, $Name, 1)
_Zip_Unzip($Name, "word\document.xml", $UnZipName, 16)
If @error Then Return 0
Sleep(200)
$Text = FileRead(@ScriptDir & '\DocXdoc\document.xml')
$RegEx = StringRegExp($Text, "<w:body>(.*?)<w:sect", 3)
$RegEx = StringRegExpReplace($Regex[0], "<w:br/>", @CRLF, 0)
$RegEx = StringRegExpReplace($Regex, "</w:p>", @CRLF & @CRLF, 0)
$RegEx = StringRegExpReplace($Regex, "<w:tab/>", @TAB, 0)
$RegEx = StringRegExpReplace($Regex, "<(.*?)>", "", 0)
$RegEx = StringRegExpReplace($Regex, "&lt;", "<", 0)
$RegEx = StringRegExpReplace($Regex, "&gt;", ">", 0)
$RegEx = StringRegExpReplace($Regex, "&amp;", "&", 0)
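$RegEx = StringRegExpReplace($Regex, Chr(226) & Chr(128) & Chr(156), '"', 0) ; left double quote
$RegEx = StringRegExpReplace($Regex, Chr(226) & Chr(128) & Chr(157), '"', 0) ; right double quote
$RegEx = StringRegExpReplace($Regex, Chr(226) & Chr(128) & Chr(152), "'", 0) ; left single quote
$RegEx = StringRegExpReplace($Regex, Chr(226) & Chr(128) & Chr(153), "'", 0) ; right single quote
$RegEx = StringRegExpReplace($Regex, Chr(226) & Chr(128) & Chr(147), "-", 0) ; en dash
$RegEx = StringRegExpReplace($Regex, Chr(226) & Chr(128) & Chr(166), "...", 0) ; ellipsis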
FileDelete($Name)
DirRemove($UnZipName, 1)
$RegEx = StringTrimRight($RegEx, 4)
Return $RegEx

EndFunc

Func _Zip_Unzip($hZipFile, $hFilename, $hDestPath, $flag = 4)
    Local $DLLChk = _Zip_DllChk()
    If $DLLChk <> 0 Then Return SetError($DLLChk, 0, 0) ;no dll
    If Not FileExists($hZipFile) Then Return SetError(1, 0, 0) ;no zip file
    
    If Not FileExists($hDestPath) Then DirCreate($hDestPath)
    
    $oApp = ObjCreate("Shell.Application")
    $hFolderitem = $oApp.NameSpace($hZipFile).Parsename($hFilename)
    
    $oApp.NameSpace($hDestPath).Copyhere($hFolderitem, $flag)
    
    
EndFunc   ;==>_Zip_Unzip

Func _Zip_DllChk()
    If Not FileExists(@SystemDir & "\zipfldr.dll") Then Return 2
    If Not RegRead("HKEY_CLASSES_ROOT\CLSID\{E88DCCE0-B7B3-11d1-A9F0-00AA0060FA31}", "") Then Return 3
    Return 0
EndFunc   ;==>_Zip_DllChk

If there are any other 'characters' I missed please let me know. I went through the keyboard and tried as many as I could. So hopefully nothing is missing there.

Hope this helps!

Szhlopp

Link to comment
Share on other sites

Nice!

But, what you should write is: "Hey guys, I got it! I finally found out what that .docx file actually is. It's a zipped .xml!!!"

The code that you provided is very much... something. :)

From what I know about the Windows Shell object and its ParseName method, that _Zip_Unzip() function has no chance of working the way you call it.

I don't actually know why you are abusing StringRegExpReplace() when StringReplace() would do the job. And what is the meaning of the last six replacements?

Your code needs major modifications in order to work properly.

I suggest replacing the part where the replacements start with this:

$Text = StringReplace($Text, @CRLF, "")
$Text = StringRegExpReplace($Text, "<w:body>(.*?)</w:body>", '$1', 0)
$Text = StringReplace($Text, "</w:p>", @CRLF)
$Text = StringReplace($Text, "<w:cr/>", @CRLF)
$Text = StringReplace($Text, "<w:tab/>", @TAB)

$Text = StringRegExpReplace($Text, "<(.*?)>", "")
$Text = StringReplace($Text, "&lt;", "<")
$Text = StringReplace($Text, "&gt;", ">")
$Text = StringReplace($Text, "&amp;", "&")

and then returning $Text

Unzipping function needs to be modified too...
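
The replacement sequence above can be tried on a small literal string without unzipping anything. This is only a minimal sketch: the sample XML and the helper name _DocXmlToText are made up for illustration, the body is pulled out with StringRegExp instead of StringRegExpReplace, and the <w:br/> replacement is carried over from the first post.

```autoit
; Hypothetical sample standing in for the contents of word\document.xml
Local $sXml = '<w:document><w:body>' & _
        '<w:p><w:r><w:t>First</w:t><w:tab/><w:t>A &amp; B</w:t></w:r></w:p>' & _
        '<w:p><w:r><w:t>1 &lt; 2</w:t><w:br/><w:t>done</w:t></w:r></w:p>' & _
        '<w:sectPr/></w:body></w:document>'

ConsoleWrite(_DocXmlToText($sXml) & @CRLF)

; Illustrative helper (the name is an assumption, not part of any UDF)
Func _DocXmlToText($sText)
    ; Drop line breaks so that .*? can span the whole body
    $sText = StringReplace($sText, @CRLF, "")
    ; Keep only what sits between <w:body> and </w:body>
    Local $aBody = StringRegExp($sText, "<w:body>(.*?)</w:body>", 1)
    If @error Then Return SetError(1, 0, "") ; no body element found
    $sText = $aBody[0]
    ; Translate paragraph ends, breaks and tabs before the tags are stripped
    $sText = StringReplace($sText, "</w:p>", @CRLF)
    $sText = StringReplace($sText, "<w:cr/>", @CRLF)
    $sText = StringReplace($sText, "<w:br/>", @CRLF)
    $sText = StringReplace($sText, "<w:tab/>", @TAB)
    ; Strip every remaining tag
    $sText = StringRegExpReplace($sText, "<[^>]*>", "")
    ; Decode entities; &amp; goes last so that "&amp;lt;" is not decoded twice
    $sText = StringReplace($sText, "&lt;", "<")
    $sText = StringReplace($sText, "&gt;", ">")
    $sText = StringReplace($sText, "&amp;", "&")
    Return $sText
EndFunc
```

The order matters in two places: the markup that carries meaning (paragraphs, breaks, tabs) has to be translated before the generic tag strip, and the entities have to be decoded after it so that a literal "<" in the text is not mistaken for a tag.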

Link to comment
Share on other sites

wow... It's the first time somebody uses my zip functions >_<

anyway good work... May come in useful someday :)

cheers

Link to comment
Share on other sites

Nice!

But, what you should write is: "Hey guys, I got it! I finally found out what that .docx file actually is. It's a zipped .xml!!!"

I could have. You said it for me though. >_<

The code that you provided is very much... something. :idiot:

From what I know about the Windows Shell object and its ParseName method, that _Zip_Unzip() function has no chance of working the way you call it.

Care to explain why? I know this needs some modification to turn it into a single function. But as I stated above, I would like to know it works before I mess with it. This is a rough draft.

I don't actually know why you are abusing StringRegExpReplace() when StringReplace() would do the job. And what is the meaning of the last six replacements?

Take them out and you'll quickly find out why they're in there. Word uses extended ASCII to write quotes '"' and other marks. Periods weren't always coming out, and even something as simple as "What's up?" wouldn't come out right.

Your code needs major modifications in order to work properly.

I suggest replacing the part where the replacements start with this:

$Text = StringReplace($Text, @CRLF, "")
$Text = StringRegExpReplace($Text, "<w:body>(.*?)</w:body>", '$1', 0)
$Text = StringReplace($Text, "</w:p>", @CRLF)
$Text = StringReplace($Text, "<w:cr/>", @CRLF)
$Text = StringReplace($Text, "<w:tab/>", @TAB)

$Text = StringRegExpReplace($Text, "<(.*?)>", "")
$Text = StringReplace($Text, "&lt;", "<")
$Text = StringReplace($Text, "&gt;", ">")
$Text = StringReplace($Text, "&amp;", "&")

You totally messed up Word's formatting at this point. A page break becomes a single line break, a single line break becomes "", and all the other characters (the ones the last replacements handle) come out messed up.

and then returning $Text

Unzipping function needs to be modified too...

Is this because you have a UDF for it?

Please instead of telling me its all wrong and messed up. Tell me how to fix it. This post doesn't help me at all. You don't show me how it's messed up you just tell me it is.

Again this is a rough draft. So find out how to make it better and reply with that. If you can't then at LEAST describe how it's so completely messed up.

I have nothing against you. But what I said is very true. This post is useless as far as improving it. :)

wow... It's the first time somebody uses my zip functions

anyway good work... May come in useful someday

cheers

Thanks. And thank you for the easy way of using the zip functionality.
Link to comment
Share on other sites

What Word formatting functionality are you talking about???

If you wanted to keep that, then why in the world would you write that script at all?

A page break is nothing, a single line is nothing, Word uses Chr(34) (hex 22) to write a quotation mark, â is not ' etc...

Text is stored between <w:body> and </w:body>.

Please explain the difference between StringReplace($Text, "<w:tab/>", @TAB) and StringRegExpReplace($Regex, "<w:tab/>", @TAB, 0)

The unzipping function is OK, but you are not feeding it right. You cannot use 'word\document.xml' in the ParseName method. That would mean the unzipping is not working, yet you said it works for you. My question is: how?

It could be that we are using different versions of MS Word (mine is ... wait to check ... Microsoft Office Word 2007), or our systems are using different Windows Shell objects, or something else.

But you say that I'm not contributing. So pointing out flaws (mistakes or whatever) is not contributing? I think it's your ego, but that's just me.

There is a thread on this page too, about sending SMS from a computer. Anyone who actually "read" that script would see that it's poorly composed and is actually made up of a few different scripts stitched together by copy/paste. Still, people are "talking" about it and even making "improvements" to it, without seeing or trying to resolve the original flaws. People are like that: they see only what they want to see and not the real full picture. But I'm generalizing now, and that is a logical mistake, and not good at all >_<

I will try to contribute here. Try this script (it's based on yours and torels's, I'm merely... something):

Dim $file = FileOpenDialog("Choose .docx file", @DesktopDir, "Word docx file (*.docx)", 1)
If @error Then Exit

Dim $TxT = _ReadDocXContent($file)

;FileWrite(@ScriptDir & "\Extracted text.txt", $TxT)
MsgBox(0, "Extracted text", $TxT)

Func _ReadDocXContent($ReadLocation)

    Local $Name, $UnZipName, $TempZipName

    Local $i, $f_name = "~TempDoc"
    Do
        $i += 1
        $Name = @TempDir & "\" & $f_name & $i & ".zip"
    Until Not FileExists($Name)

    FileCopy($ReadLocation, $Name, 9)

    Local $j
    Do
        $j += 1
        $UnZipName = @TempDir & "\~DocXdoc" & $j
    Until Not FileExists($UnZipName)

    DirCreate($UnZipName)

    Local $k
    Do
        $k += 1
        $TempZipName = @TempDir & "\Temporary Directory " & $k & " for " & $f_name & $i & ".zip"
    Until Not FileExists($TempZipName)

    Local $oApp = ObjCreate("Shell.Application")

    $oApp.NameSpace($UnZipName).CopyHere($oApp.NameSpace($Name & '\word' ).ParseName("document.xml"), 4)

    Local $Text = FileRead($UnZipName & "\document.xml")

    DirRemove($UnZipName, 1)
    FileDelete($Name)
    DirRemove($TempZipName, 1)

    $Text = StringReplace($Text, @CRLF, "")
    $Text = StringRegExpReplace($Text, "<w:body>(.*?)</w:body>", '$1', 0)
    $Text = StringReplace($Text, "</w:p>", @CRLF)
    $Text = StringReplace($Text, "<w:cr/>", @CRLF)
    $Text = StringReplace($Text, "<w:tab/>", @TAB)

    $Text = StringRegExpReplace($Text, "<(.*?)>", "")
    $Text = StringReplace($Text, "&lt;", "<")
    $Text = StringReplace($Text, "&gt;", ">")
    $Text = StringReplace($Text, "&amp;", "&")

    Return $Text

EndFunc

There is no error checking inside the function.

(hope that LimeSeed won't be too mad :) )
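
For anyone who wants the missing checks, the extraction step could be wrapped roughly as follows. This is a sketch under assumptions, not a tested UDF: the helper name _ExtractDocumentXml and its error codes are invented here, it opens the "word" folder before calling ParseName (as the script above does, since passing 'word\document.xml' directly reportedly does not work on every system), and it polls for the output file because the first post needed a Sleep() after the copy.

```autoit
; Illustrative helper (name and error codes are made up for this sketch)
; Returns the path of the extracted document.xml, or "" with @error set
Func _ExtractDocumentXml($sZipPath, $sDestDir)
    If Not FileExists($sZipPath) Then Return SetError(1, 0, "") ; .zip copy is missing
    If Not FileExists($sDestDir) And Not DirCreate($sDestDir) Then Return SetError(2, 0, "")

    Local $oApp = ObjCreate("Shell.Application")
    If Not IsObj($oApp) Then Return SetError(3, 0, "")

    ; Open the "word" folder inside the archive, then ask it for the file
    Local $oWordDir = $oApp.NameSpace($sZipPath & "\word")
    If Not IsObj($oWordDir) Then Return SetError(4, 0, "") ; not a zip, or no word folder
    Local $oItem = $oWordDir.ParseName("document.xml")
    If Not IsObj($oItem) Then Return SetError(5, 0, "")

    Local $oDest = $oApp.NameSpace($sDestDir)
    If Not IsObj($oDest) Then Return SetError(6, 0, "")
    $oDest.CopyHere($oItem, 4) ; 4 = do not show a progress dialog

    ; Wait up to about five seconds for the file to appear on disk
    Local $sOut = $sDestDir & "\document.xml"
    For $i = 1 To 50
        If FileExists($sOut) Then Return $sOut
        Sleep(100)
    Next
    Return SetError(7, 0, "")
EndFunc
```

Inside _ReadDocXContent() the CopyHere line would then become a call such as _ExtractDocumentXml($Name, $UnZipName), followed by an If @error check before the FileRead().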

Edited by trancexx

Link to comment
Share on other sites

I seriously don't know what your problem is with me. All I did was post a draft on a docX reader and I get insulted and slammed by somebody who didn't know how to do it.

If ANYONE would like to finish this up and make a nice UDF, go right ahead. Take all the credit you want for it. But before I let this post fall off the face of the planet I'd like to show everyone what happens when I remove the last RegEx replacements...

My version(With Correcting)

Trancexx(He removed the correcting)

MSO7 DocX I used:

DocXtest.zip

What does this do for everyone else? Is something messed up with my MSO7? I'm very curious what the deal is with our computers being so different.

Thank you all

Edited by Szhlopp
Link to comment
Share on other sites

Insulted how? I don't get it.

I did not slam you. I don't even see why you are so pissed off or whatever.

You did a good job. Exploring this area of ...whatever... is sometimes very difficult and demands something that you obviously have. That is a virtue and I appreciate it.

The other thing is what you did in your first post here. Come on... admit it. You lied. There is no (no!!!) way that script worked for you.

There are people who read this forum, and you deliberately lied to them (me). Why???

Another thing I must admit to you: you were (are) right about the additional replacements.

But that StringRegExpReplace() method is just not right. I know that you made that RegEx/RegExRep Tester! from your sig, but still.

Your "nobody likes me" post is supposed to be what?

Don't get mad about these little things, and learn to take criticism (I wish I could :)).

To be on topic...

Many more replacements have to be done to get the text out properly. I found an article about this and did some heavy testing. This is the result:

Dim $file = FileOpenDialog("Choose .docx file", @DesktopDir, "Word docx file (*.docx)", 1)
If @error Then Exit

Dim $TxT = _ReadDocXContent($file)

MsgBox(0, "Extracted text", $TxT)



Func _ReadDocXContent($ReadLocation)

    Local $extension = StringSplit($ReadLocation, ".", 1)
    $extension = $extension[$extension[0]]
    
    Local $hwnd = FileOpen($ReadLocation, 16)
    Local $header = FileRead($hwnd, 2)
    FileClose($hwnd)
    
    If $header <> '0x504B' Or $extension <> 'docx' Then Return SetError(1) ; not .docx file
    
    Local $Name, $UnZipName, $TempZipName

    Local $i, $f_name = "~TempDoc"
    Do
        $i += 1
        $Name = @TempDir & "\" & $f_name & $i & ".zip"
    Until Not FileExists($Name)

    FileCopy($ReadLocation, $Name, 9)

    Local $j
    Do
        $j += 1
        $UnZipName = @TempDir & "\~DocXdoc" & $j
    Until Not FileExists($UnZipName)

    DirCreate($UnZipName)

    Local $k
    Do
        $k += 1
        $TempZipName = @TempDir & "\Temporary Directory " & $k & " for " & $f_name & $i & ".zip"
    Until Not FileExists($TempZipName)

    Local $oApp = ObjCreate("Shell.Application")
    
    If Not IsObj($oApp) Then Return SetError(2) ; highly unlikely but could happen
    
    $oApp.NameSpace($UnZipName).CopyHere($oApp.NameSpace($Name & '\word' ).ParseName("document.xml"), 4)

    Local $Text = FileRead($UnZipName & "\document.xml")

    DirRemove($UnZipName, 1)
    FileDelete($Name)
    DirRemove($TempZipName, 1)

    $Text = StringReplace($Text, @CRLF, "")
    $Text = StringRegExpReplace($Text, "<w:body>(.*?)</w:body>", '$1', 0)
    $Text = StringReplace($Text, "</w:p>", @CRLF)
    $Text = StringReplace($Text, "<w:cr/>", @CRLF)
    $Text = StringReplace($Text, "<w:br/>", @CRLF)
    $Text = StringReplace($Text, "<w:tab/>", @TAB)

    $Text = StringRegExpReplace($Text, "<(.*?)>", "")
    
    $Text = StringReplace($Text, "&lt;", "<")
    $Text = StringReplace($Text, "&gt;", ">")
    $Text = StringReplace($Text, "&amp;", "&")

    $Text = StringReplace($Text, Chr(226) & Chr(130) & Chr(172), Chr(128))
    $Text = StringReplace($Text, Chr(194) & Chr(129), Chr(129))
    $Text = StringReplace($Text, Chr(226) & Chr(128) & Chr(154), Chr(130))
    $Text = StringReplace($Text, Chr(198) & Chr(146), Chr(131))
    $Text = StringReplace($Text, Chr(226) & Chr(128) & Chr(158), Chr(132))
    $Text = StringReplace($Text, Chr(226) & Chr(128) & Chr(166), Chr(133))
    $Text = StringReplace($Text, Chr(226) & Chr(128) & Chr(160), Chr(134))
    $Text = StringReplace($Text, Chr(226) & Chr(128) & Chr(161), Chr(135))
    $Text = StringReplace($Text, Chr(203) & Chr(134), Chr(136))
    $Text = StringReplace($Text, Chr(226) & Chr(128) & Chr(176), Chr(137))
    $Text = StringReplace($Text, Chr(197) & Chr(160), Chr(138))
    $Text = StringReplace($Text, Chr(226) & Chr(128) & Chr(185), Chr(139))
    $Text = StringReplace($Text, Chr(197) & Chr(146), Chr(140))
    $Text = StringReplace($Text, Chr(194) & Chr(141), Chr(141))
    $Text = StringReplace($Text, Chr(197) & Chr(189), Chr(142))
    $Text = StringReplace($Text, Chr(194) & Chr(143), Chr(143))
    $Text = StringReplace($Text, Chr(194) & Chr(144), Chr(144))
    $Text = StringReplace($Text, Chr(226) & Chr(128) & Chr(152), Chr(145))
    $Text = StringReplace($Text, Chr(226) & Chr(128) & Chr(153), Chr(146))
    $Text = StringReplace($Text, Chr(226) & Chr(128) & Chr(156), Chr(147))
    $Text = StringReplace($Text, Chr(226) & Chr(128) & Chr(157), Chr(148))
    $Text = StringReplace($Text, Chr(226) & Chr(128) & Chr(162), Chr(149))
    $Text = StringReplace($Text, Chr(226) & Chr(128) & Chr(147), Chr(150))
    $Text = StringReplace($Text, Chr(226) & Chr(128) & Chr(148), Chr(151))
    $Text = StringReplace($Text, Chr(203) & Chr(156), Chr(152))
    $Text = StringReplace($Text, Chr(226) & Chr(132) & Chr(162), Chr(153))
    $Text = StringReplace($Text, Chr(197) & Chr(161), Chr(154))
    $Text = StringReplace($Text, Chr(226) & Chr(128) & Chr(186), Chr(155))
    $Text = StringReplace($Text, Chr(197) & Chr(147), Chr(156))
    $Text = StringReplace($Text, Chr(194) & Chr(157), Chr(157))
    $Text = StringReplace($Text, Chr(197) & Chr(190), Chr(158))
    $Text = StringReplace($Text, Chr(197) & Chr(184), Chr(159))

    For $x = 160 To 191
        $Text = StringReplace($Text, Chr(194) & Chr($x), Chr($x))
    Next

    For $x = 192 To 255
        $Text = StringReplace($Text, Chr(195) & Chr($x - 64), Chr($x))
    Next

    Return $Text

EndFunc

Szhlopp try it.

And no one is gonna steal this from you. It's your idea and you should be proud of that.

btw, it would be nice of you to give some credit to steve8tch and w0uter, not for this but for that other thing. I'm not seeing it.
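
A note on the long replacement table: every pair in it maps a UTF-8 byte sequence, as it looks when read as ANSI, back to a single ANSI character. Assuming document.xml is UTF-8 (which is what those byte sequences suggest), a shorter route is to read the raw bytes and let BinaryToString() decode them. This is a different technique from the table, and the helper name _FileReadUtf8 is made up for this sketch.

```autoit
; Illustrative helper (the name is an assumption): read a file and decode it as UTF-8
Func _FileReadUtf8($sPath)
    Local $hFile = FileOpen($sPath, 16) ; 16 = binary read mode
    If $hFile = -1 Then Return SetError(1, 0, "") ; file could not be opened
    Local $dData = FileRead($hFile)
    FileClose($hFile)
    ; Flag 4 tells BinaryToString to treat the bytes as UTF-8
    Return BinaryToString($dData, 4)
EndFunc
```

In the function above this would replace the FileRead($UnZipName & "\document.xml") call, and the Chr() replacements would no longer be needed. The result is a Unicode string, so characters outside the ANSI code page survive as well, which the table cannot do.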

Link to comment
Share on other sites

I never lied about this. This REALLY does work for me. Now it doesn't work for you but that isn't my fault. I never had ANY intention of lying to anyone...

I am not mad, pissed, or even angry with you or anyone else on this forum. And I never posted a 'Nobody likes me' thread. I was just tired of being treated like an idiot...

I think it's your ego, but that's just me.

what you should write is: "Hey guys, I got it! I finally found out what that .docx file actually is. It's a zipped .xml!!!"

why in the world would you write that script at all?

Anyone who actually "read" that script would see that it's poorly composed and is actually made up of a few different scripts stitched together by copy/paste

As of right now I'm not going to be finishing this. If there really is any interest maybe somebody can pick it up and finish it >_<

And Trancexx I don't want to be your enemy. So please just drop the insults and we can move on.

As far as the RegEx tester goes. I probably could put a comment inside the au3 that says "Original GUI layout by ......". If you notice, I never even put my name in the title or code. I'll update that script soon...

Link to comment
Share on other sites

  • 4 years later...

Have you tried the Word UDF by water >>

Please re-direct any questions to general help support next time >> http://www.autoitscript.com/forum/forum/2-general-help-and-support/

Edited by guinness

Link to comment
Share on other sites

Welcome to AutoIt and the forum!

That's quite an old thread you posted in (4 1/2 years old). Best is to create a new thread in the Help and Support forum.

Link to comment
Share on other sites
