frank10 Posted November 9, 2021

I'm trying to find/replace some characters in a .docx like this:

    #include <Word.au3>
    Local $oWord = _Word_Create()
    Local $oDoc = _Word_DocOpen($oWord, "test.docx")
    Local $oRangeFound, $oRangeText, $oSearchRange = _Word_DocRangeSet($oDoc, -1)
    ; find at least 2 spaces only after numbers: "wordA 123  wordB wordC" --> "wordA 123 wordB wordC"
    $oRangeFound = _Word_DocFindReplace($oDoc, "([0-9]*) {2,}", "\1", 2, 0, 0,0,1)

It does not work... How to do it?

water Posted November 9, 2021

_Word_DocFindReplace does not support regular expressions. I'm not sure it is possible with Word at all.

frank10 Posted November 9, 2021 (edited)

But I tried this and it works:

    $oRangeFound = _Word_DocFindReplace($oDoc, "NUTRITION([0-9]*)", "NUTRITION#: \1", 2, 0, 0,0,1)
    ; NUTRITION23 kcal --> NUTRITION#: 23 kcal

They also give some examples here: https://translationjournal.net/journal/15msw.htm

Skysnake Posted November 9, 2021

The only way I can think of, and I have never done this, is to open the entire document as an object and step through it in portions, running the regex on one portion at a time...?

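To make the "run the regex on a portion at a time" idea concrete, here is a minimal sketch of the string side only, using AutoIt's own PCRE engine. The helper name `_CollapseSpacesAfterDigit` is hypothetical (it is not part of the Word UDF), and the sketch assumes the goal is to collapse two or more spaces after a digit into a single space. It deliberately leaves out reading text from Word and writing it back, since assigning new text to a Word range may discard formatting inside that range.

```autoit
; Hypothetical helper, not part of the Word UDF.
; Collapses a run of 2+ spaces that directly follows a digit into one space.
Func _CollapseSpacesAfterDigit($sText)
    ; (\d) captures the digit, " {2,}" matches two or more spaces (PCRE syntax).
    Return StringRegExpReplace($sText, "(\d) {2,}", "$1 ")
EndFunc   ;==>_CollapseSpacesAfterDigit

; Sample text: three spaces after "123", two spaces after "wordB".
Local $sSample = "wordA 123   wordB  wordC"
ConsoleWrite(_CollapseSpacesAfterDigit($sSample) & @CRLF)
```

Only the run after the digit is collapsed; the two spaces after "wordB" are left alone because "B" is not a digit.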
JockoDundee Posted November 9, 2021

8 hours ago, frank10 said: How to do it?

Word doesn’t seem to support the “*” greedy quantifier. What happens if you just omit it:

    $oRangeFound = _Word_DocFindReplace($oDoc, "([0-9]) {2,}", "\1", 2, 0, 0,0,1)

frank10 Posted November 10, 2021 (edited)

10 hours ago, JockoDundee said: Word doesn’t seem to support “*” greedy character. What happens if you just omit it: $oRangeFound = _Word_DocFindReplace($oDoc, "([0-9]) {2,}", "\1", 2, 0, 0,0,1)

Doesn't work. It's not the * that causes the problem; it seems to be the "{2,}". In fact this works:

    $oRangeFound = _Word_DocFindReplace($oDoc, "([0-9]*)[ ]*", "\1°", 2, 0, 0,0,1)

BUT as * means 0 or more, it also changes numbers followed by one space... (strangely, it should also change a number followed by no space, but it does not):

    123word --> no change
    123 word --> 123°word
    123  word --> 123° word

What I want is to change only runs of 2 or more spaces. Of course I can do workarounds like:

    $oRangeFound = _Word_DocFindReplace($oDoc, "([0-9]*) ", "\1 ", 2, 0, 0,0,1)
    $oRangeFound = _Word_DocFindReplace($oDoc, "([0-9]*) ", "\1 ", 2, 0, 0,0,1)

But it would be a lot better to find a regexp-like pattern that does this, also for other future uses...

mikell Posted November 10, 2021

https://wordmvp.com/FAQs/General/UsingWildcards.htm

frank10 Posted November 10, 2021

4 minutes ago, mikell said: https://wordmvp.com/FAQs/General/UsingWildcards.htm

Yes, they say that you can use {2,}, but in fact it doesn't work. This:

    $oRangeFound = _Word_DocFindReplace($oDoc, "([0-9]*)[ ]{2,}", "\1°", 2, 0, 0,0,1)
    ConsoleWrite("__err:" & @error & "__extended:" & @extended & @CRLF)

gives:

    __err:3__extended:-2147352567

water Posted November 10, 2021

Extended -2147352567 (decimal) is 0x80020009 (hex), which is DISP_E_EXCEPTION ("Exception occurred"), a generic COM error. You need a full COM error handler to get more detailed information about the error. Unfortunately this is a bit complex because of the way AutoIt handles COM errors in the Word UDF.

    #include <Word.au3>
    #include <MsgBoxConstants.au3>
    #include <StringConstants.au3> ; needed for $STR_STRIPTRAILING

    ; Register a global COM error handler so the full error details get displayed
    Global $oError = ObjEvent("AutoIt.Error", "__Word_COMErrFuncEX")

    ; Add your code to find/replace text here by calling the modified _Word_DocFindReplaceEX function!

    Exit

    ; #FUNCTION# ====================================================================================================================
    ; Author ........: water (based on the Word UDF written by Bob Anthony)
    ; Modified ......:
    ; ===============================================================================================================================
    Func _Word_DocFindReplaceEX($oDoc, $sFindText = Default, $sReplaceWith = Default, $iReplace = Default, $vSearchRange = Default, $bMatchCase = Default, $bMatchWholeWord = Default, $bMatchWildcards = Default, $bMatchSoundsLike = Default, $bMatchAllWordForms = Default, $bForward = Default, $iWrap = Default, $bFormat = Default)
        If $sFindText = Default Then $sFindText = ""
        If $sReplaceWith = Default Then $sReplaceWith = ""
        If $iReplace = Default Then $iReplace = $WdReplaceAll
        If $vSearchRange = Default Then $vSearchRange = 0
        If $bMatchCase = Default Then $bMatchCase = False
        If $bMatchWholeWord = Default Then $bMatchWholeWord = False
        If $bMatchWildcards = Default Then $bMatchWildcards = False
        If $bMatchSoundsLike = Default Then $bMatchSoundsLike = False
        If $bMatchAllWordForms = Default Then $bMatchAllWordForms = False
        If $bForward = Default Then $bForward = True
        If $iWrap = Default Then $iWrap = $WdFindContinue
        If $bFormat = Default Then $bFormat = False
        If Not IsObj($oDoc) Then Return SetError(1, 0, 0)
        ; -1 = current selection, 0 = whole document, otherwise a range object
        Switch $vSearchRange
            Case -1
                $vSearchRange = $oDoc.Application.Selection.Range
            Case 0
                $vSearchRange = $oDoc.Range()
            Case Else
                If Not IsObj($vSearchRange) Then Return SetError(2, 0, 0)
        EndSwitch
        Local $oFind = $vSearchRange.Find
        $oFind.ClearFormatting()
        $oFind.Replacement.ClearFormatting()
        Local $bReturn = $oFind.Execute($sFindText, $bMatchCase, $bMatchWholeWord, $bMatchWildcards, $bMatchSoundsLike, _
                $bMatchAllWordForms, $bForward, $iWrap, $bFormat, $sReplaceWith, $iReplace)
        If @error Or Not $bReturn Then Return SetError(3, @error, 0)
        Return 1
    EndFunc   ;==>_Word_DocFindReplaceEX

    ; COM error handler: shows everything the error object knows about the failure
    Func __Word_COMErrFuncEX()
        Local $bHexNumber = Hex($oError.number, 8)
        Local $sError = "COM Error Encountered in " & @ScriptName & @CRLF & _
                "@AutoItVersion = " & @AutoItVersion & @CRLF & _
                "@AutoItX64 = " & @AutoItX64 & @CRLF & _
                "@Compiled = " & @Compiled & @CRLF & _
                "@OSArch = " & @OSArch & @CRLF & _
                "@OSVersion = " & @OSVersion & @CRLF & _
                "Scriptline = " & $oError.scriptline & @CRLF & _
                "NumberHex = 0x" & $bHexNumber & @CRLF & _
                "Number = " & $oError.number & @CRLF & _
                "WinDescription = " & StringStripWS($oError.WinDescription, $STR_STRIPTRAILING) & @CRLF & _
                "Description = " & StringStripWS($oError.description, $STR_STRIPTRAILING) & @CRLF & _
                "Source = " & $oError.Source & @CRLF & _
                "HelpFile = " & $oError.HelpFile & @CRLF & _
                "HelpContext = " & $oError.HelpContext & @CRLF & _
                "LastDllError = " & $oError.LastDllError
        MsgBox($MB_ICONERROR, "Debug Info", $sError)
    EndFunc   ;==>__Word_COMErrFuncEX

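The decimal-to-hex conversion mentioned in this post can be checked directly in AutoIt. This is only a small illustration of the conversion; the variable name is made up for the example.

```autoit
; The @extended value reported earlier in the thread.
Local $iExtended = -2147352567

; Hex() with a length of 8 yields the 32-bit HRESULT form of the number.
ConsoleWrite("0x" & Hex($iExtended, 8) & @CRLF) ; prints 0x80020009
```

Looking up the hex form is usually easier than searching for the negative decimal value.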
frank10 Posted November 10, 2021

OK, that error says:

    Description = The Find What text contains a Pattern Match expression which is not valid.

But how should it be written?

    ([0-9]*)[ ]{2,}

According to the links above, it seems correct...

water Posted November 10, 2021

I get the impression that MS Word does not fully support regular expressions (https://vlasovstudio.com/regent/documentation/Microsoft-Word-Wildcards-as-Regular-Expressions.html). Unfortunately I'm not familiar with the wildcards supported by MS Word, but I suggest trying the pattern in MS Word itself before using _Word_DocFindReplace.

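One thing that may be worth testing here, and which the thread itself does not confirm: Word is commonly reported to use the system list separator inside `{n,m}` counts, so on systems where that separator is a semicolon the pattern would have to be written as `{2;}`. The sketch below simply tries both forms and prints what the UDF reports. The document path and the replacement string `"\1 "` (digit plus one space) are assumptions for illustration.

```autoit
#include <Word.au3>

; Assumption: test.docx sits next to the script; adjust the path as needed.
Local $oWord = _Word_Create()
If @error Then Exit 1
Local $oDoc = _Word_DocOpen($oWord, @ScriptDir & "\test.docx")
If @error Then Exit 2

; Try the comma form first, then the semicolon form (assumed locale variant).
Local $aSeparators[2] = [",", ";"]
Local $sFind, $iErr, $iExt
For $sSep In $aSeparators
    $sFind = "([0-9]) {2" & $sSep & "}"
    ; Last argument shown (True) switches on Word's wildcard matching.
    _Word_DocFindReplace($oDoc, $sFind, "\1 ", $WdReplaceAll, 0, False, False, True)
    $iErr = @error
    $iExt = @extended
    ConsoleWrite($sFind & " -> @error=" & $iErr & ", @extended=" & $iExt & @CRLF)
    ; @error = 3 with a non-zero @extended is a COM error such as an invalid pattern.
    ; Anything else means Word accepted the pattern, so stop trying.
    If Not ($iErr = 3 And $iExt <> 0) Then ExitLoop
Next
```

Note that `@error = 3` with `@extended = 0` only means that nothing was found, not that the pattern was rejected.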
mikell Posted November 10, 2021 (Solution)

4 hours ago, frank10 said: Yes, they say that you can use {2,}, but in fact it doesn't work...

5 hours ago, frank10 said: BUT as * means 0 or more, it changes also numbers followed by one space... (strangely it should change also if number is followed by no space, instead it does not change this...)

So did you try this?

    "([0-9]*)[ ][ ]*"

frank10 Posted November 10, 2021 (edited)

3 hours ago, mikell said: So did you try this ? "([0-9]*)[ ][ ]*"

Thank you mikell: good catch! The only thing is that with yours I get:

    123a --> no change
    123 aa --> no change
    123  aaa --> ok
    123   aaa --> ok
    wordA 123 wordB wordC --> NOT ok, it also changes wordB wordC...

Instead, with this:

    "([0-9])[ ][ ]*"

it's perfect!

    123a --> no change
    123 aa --> no change
    123  aaa --> ok
    123   aaa --> ok
    wordA 123 wordB wordC --> no change

Good workaround.
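
For reference, the accepted pattern can be put into a complete script along these lines. This is a sketch under assumptions: the document path is made up, and the replacement `"\1 "` (the digit followed by one space) is a guess at the intended result, since the thread uses `"\1"` and `"\1°"` in different posts. As far as Word's wildcard syntax goes, `*` appears to mean "any string of characters" rather than "zero or more of the preceding item", which would explain why `([0-9]*)` reached further than expected in the earlier attempts.

```autoit
#include <Word.au3>

; Assumption: test.docx sits next to the script; adjust the path as needed.
Local $oWord = _Word_Create()
If @error Then Exit 1
Local $oDoc = _Word_DocOpen($oWord, @ScriptDir & "\test.docx")
If @error Then Exit 2

; Pattern from the thread: one digit, then at least two spaces.
; Parameters: replace all, whole document, no case/whole-word matching, wildcards on.
_Word_DocFindReplace($oDoc, "([0-9])[ ][ ]*", "\1 ", $WdReplaceAll, 0, False, False, True)
Local $iErr = @error, $iExt = @extended

; @error = 3 with @extended = 0 means nothing matched;
; a non-zero @extended carries the COM error number.
ConsoleWrite("@error=" & $iErr & ", @extended=" & $iExt & @CRLF)
```

The script does not save or close the document, so the result can be inspected in Word before deciding whether to keep it.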