rmock, posted January 24:

I am a bit of a novice with AutoIt, but I have exhaustively searched the libraries and threads here for something similar to what I am trying to do. I would like to split an RTF file at a specific line that contains the text "Official". So, if there are 2 pages and 1 mention of "Official", I would like 2 RTFs. I am already successfully using GUI streaming to open an RTF, determine the encoding, and output the plain-text lines to an array. However, I would like to split the RTF into multiple files before gleaning the plain text. Could someone provide a very high-level example so that I have somewhere to start? Thanks a ton!
ioa747, posted January 24:

https://www.autoitscript.com/forum/topic/209609-how-to-hide-chars-in-_guictrlrichedit/#comment-1512718

I know that I know nothing
rmock, posted January 25:

Thank you for your time, ioa. I took a look at that thread and the FriendlyHide function, which gave me some ideas. I would be grateful for continued assistance. I am outputting the start of the document through the 2nd occurrence of the delimiter like this:

    _GUICtrlRichEdit_StreamFromFile($hRichEdit, @ScriptDir & "\test.RTF")
    $delimeter = "Official:"
    $delimLen = StringLen($delimeter)
    $occurrence = 1
    ; find 2nd occurrence of delimeter
    $finding = StringInStr(_GUICtrlRichEdit_GetText($hRichEdit), $delimeter, 0, $occurrence + 1) ; look for multiple found in this rtf
    If $finding > 0 Then
        _GUICtrlRichEdit_SetSel($hRichEdit, 0, $finding)
    EndIf
    _GUICtrlRichEdit_StreamToFile($hRichEdit, @ScriptDir & "\new.RTF")

However, this outputs the start of the document up to about 200 characters before the 2nd occurrence, instead of up to the 2nd occurrence itself. Ideas?
ioa747, posted January 26:

I tried it and it works. The only changes I made are:

    _GUICtrlRichEdit_StreamFromFile($hRichEdit, @ScriptDir & "\test.RTF", $FO_UTF8_NOBOM)
    _GUICtrlRichEdit_SetSel($hRichEdit, 0, $finding - 1)

Edit: maybe it comes from the .rtf itself. Please upload an example .rtf.
rmock, posted January 26:

I tried $FO_UTF8_NOBOM and got the same result. I wonder if the shortfall in characters may be due to these first couple of lines?

    $hGui = GUICreate("Extract text from RTF", -1, -1)
    $hRichEdit = _GUICtrlRichEdit_Create($hGui, "", -1, -1)
    _GUICtrlRichEdit_StreamFromFile($hRichEdit, @ScriptDir & "\test.RTF", $FO_UTF8_NOBOM)

I can share an example RTF, but it may take me a bit. Until then, I thought something might stand out as the issue after looking at these.
rmock, posted January 26:

I've attached a sample RTF here: sample.rtf

I've modified my code a great deal and am happy with the results, except for the issue already mentioned: the output is cut off about 200 characters short on each iteration. It also cuts off the right margin. I'm not super worried about that, but if this thread results in a solution to that problem as well, that would be fantastic.
ioa747, posted January 26 (marked as solution; edited January 27: "fixed it"):

With sample.rtf I get the same behavior as you: the output stops about 200 characters before the 2nd occurrence. I only succeeded by working on the file as plain text:

    ; https://www.autoitscript.com/forum/topic/211403-split-an-rtf-at-delimiter
    #AutoIt3Wrapper_Au3Check_Parameters=-d -w 1 -w 2 -w 3 -w 4 -w 5 -w 6 -w 7
    #include <MsgBoxConstants.au3>
    #include <FileConstants.au3>

    SplitIt(@ScriptDir & "\sample.rtf")

    ;----------------------------------------------------------------------------------------
    Func SplitIt($sFilePath)
        ; Open the file for reading and store the handle to a variable.
        Local $hFileOpen = FileOpen($sFilePath, $FO_READ)
        If $hFileOpen = -1 Then
            MsgBox($MB_SYSTEMMODAL, "", "An error occurred when reading the file.")
            Return False
        EndIf

        ; Read the contents of the file using the handle returned by FileOpen.
        Local $sFileTxt = FileRead($hFileOpen)

        ; Close the handle returned by FileOpen.
        FileClose($hFileOpen)

        Local $sDelim, $aSplit, $sNewRtf

        ; find the formatted 'Official:' (the raw RTF run, copied from the file)
        $sDelim = "{\rtlch\fcs1 \ab\af0\afs20 \ltrch\fcs0 \b\f0\fs20\kerning0\insrsid1062128 \hich\af0\dbch\af31505\loch\f0 Official\hich\af0\dbch\af31505\loch\f0 : }"

        ; flag 1 = treat the whole delimiter string as one separator
        $aSplit = StringSplit($sFileTxt, $sDelim, 1)

        ; more than 2 pieces means the delimiter occurs at least twice
        If $aSplit[0] > 2 Then
            $sNewRtf = $aSplit[1] & $sDelim & $aSplit[2]
        EndIf

        ; find the last page break (position just before its backslash)
        Local $iPageBr = StringInStr($sNewRtf, "\page", 0, -1) - 1

        ; remove the page break and close the open groups
        $sNewRtf = StringLeft($sNewRtf, $iPageBr) & "}}"

        ; save the result as a new .rtf next to the source file
        Local $sNewFilePath = StringTrimRight($sFilePath, 4) & "_NEW.rtf"
        If FileExists($sNewFilePath) Then FileDelete($sNewFilePath)
        FileWrite($sNewFilePath, $sNewRtf)
    EndFunc   ;==>SplitIt

https://www.arcdev.hu/manuals/standard/rtf/rtfspeci.pdf
rmock, posted January 26:

When I try to open sample_NEW.rtf I get: "Word was unable to read this document. It may be corrupt." I'll continue to play around with your script to see if I can get a different result.
ioa747, posted January 27:

> rmock said: Word was unable to read this document. It may be corrupt.

Do one more test. I fixed it, I think.
rmock, posted January 29:

Beautiful! Is there documentation somewhere to explain the syntax used in the $sDelim variable? I would like to change the delimiter to include a space and a 2nd word.
ioa747, posted January 29:

> rmock said: Is there documentation somewhere to explain the syntax

https://www.arcdev.hu/manuals/standard/rtf/rtfspeci.pdf

Open the .rtf with Notepad as plain text. Notepad++ helps here, with the user-defined language file at https://raw.githubusercontent.com/nakohdo/NPP.RTF/master/userDefineLang_RTF.xml, so you can see and respect the curly brace pairs.
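The advice to respect the curly brace pairs can also be checked in code. The following is a minimal sketch, not part of the original posts: `_RtfOpenDepth` is a hypothetical helper name, and it assumes the fragment contains no binary (`\bin`) data whose bytes could look like braces. It counts the unescaped `{` and `}` in a fragment and returns how many groups are still open at its end, which is roughly the number of closing braces a cut-off fragment would need.

```autoit
; Hypothetical helper (assumption, not from the thread): returns the number of
; RTF groups still open at the end of $sRtf.
Func _RtfOpenDepth($sRtf)
    ; Remove the escaped sequences \\, \{ and \} so they are not counted as group braces.
    Local $sBare = StringRegExpReplace($sRtf, "\\[\\{}]", "")

    ; StringReplace stores the number of replacements it made in @extended.
    StringReplace($sBare, "{", "", 0, 1)
    Local $iOpen = @extended
    StringReplace($sBare, "}", "", 0, 1)
    Local $iClose = @extended

    Return $iOpen - $iClose
EndFunc   ;==>_RtfOpenDepth

; Usage: three groups are opened and two are closed, so one is still open.
Local $sPiece = "{\rtf1\ansi{\fonttbl{\f0 Arial;}}\pard Official: text"
ConsoleWrite(_RtfOpenDepth($sPiece) & @CRLF) ; prints 1
```

A result of 0 for a complete file suggests the braces are balanced; a positive result for a fragment indicates how many `}` it may still need before Word or WordPad will accept it.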
rmock, posted January 29:

Thanks, ioa747! I was able to create a new delimiter using this method. When I attempt to create a new RTF for each occurrence of the delimiter, I again get the "Word was unable to read this document. It may be corrupt" error. I imagine it has to do with how I am using the page breaks within the loop.

    ; open rtf in Notepad++ and find formatting for delimiter line
    $sDelim = "{\rtlch\fcs1 \ab\af0\afs20 \ltrch\fcs0 \b\f0\fs20\kerning0\insrsid1062128 \hich\af0\dbch\af31505\loch\f0 Official Transcript: }"
    $aSplit = StringSplit($sFileTxt, $sDelim, 1)
    $loop = 1
    For $i = 0 To UBound($aSplit) - 1
        If $i <> 0 Then
            If UBound($aSplit) > 2 Then
                $sNewRtf = $aSplit[$i]
            EndIf
            ; find last page break
            Global $iPageBr = StringInStr($sNewRtf, "\page", 0, -1) - 1
            ; remove page break and close
            $sNewRtf = StringLeft($sNewRtf, $iPageBr) & "}}"
            ; save each piece as its own rtf
            Local $sNewFilePath = StringTrimRight($sFilePath, 4) & "_NEW_" & $loop & ".rtf"
            If FileExists($sNewFilePath) Then FileDelete($sNewFilePath)
            FileWrite($sNewFilePath, $sNewRtf)
        EndIf
        $loop += 1
    Next
ioa747, posted January 29:

Each piece needs a header.

Edit: so you need to split it into:

    <header> + <official1>
    <header> + <official2>

Edit: open it with WordPad (not Word), because WordPad produces simpler formatting. Select the section you want to separate and paste it as a new document in WordPad. Then open that document as text to see the formatting, the point at which WordPad divides it, and what it uses for a header.
rmock, posted January 31:

Thank you for your diligent assistance, ioa747! I am over the hump with this script. Here is what is currently working for me:

    ; split RTF
    ;------------------------------------------------------------
    #include <MsgBoxConstants.au3>
    #include <FileConstants.au3>

    SplitIt(@ScriptDir & "\test.rtf")

    Func SplitIt($sFilePath)
        ; Open the file for reading and store the handle to a variable
        Local $hFileOpen = FileOpen($sFilePath, $FO_READ)
        If $hFileOpen = -1 Then
            MsgBox($MB_SYSTEMMODAL, "", "An error occurred when reading the file.")
            Return False
        EndIf

        ; Read the contents of the file using the handle returned by FileOpen
        Local $sFileTxt = FileRead($hFileOpen)

        ; Close the handle returned by FileOpen.
        FileClose($hFileOpen)

        Local $sDelim, $aSplit, $sNewRtf, $iPageBr, $header

        ; open rtf in Notepad++ and find formatting for delimiter line
        $sDelim = "\pard\qc\b Official: "
        $aSplit = StringSplit($sFileTxt, $sDelim, 1)

        ; everything before the first delimiter is reused as the header of every piece
        $header = $aSplit[1]

        If UBound($aSplit) > 2 Then
            For $i = 2 To UBound($aSplit) - 1
                $sNewRtf = $header & $sDelim & $aSplit[$i]
                ; find page breaks
                $iPageBr = StringInStr($sNewRtf, "\page", 0, -1)
                If $iPageBr = 0 Then
                    ; no page break so this is the last iteration
                    ; output normally
                    $sNewRtf = $sNewRtf
                Else
                    ; page breaks found so this is not the last iteration
                    ; remove page break and close
                    $sNewRtf = StringLeft($sNewRtf, $iPageBr) & "}}"
                EndIf
                Local $sNewFilePath = StringTrimRight($sFilePath, 4) & "_" & $i & ".rtf"
                If FileExists($sNewFilePath) Then FileDelete($sNewFilePath)
                FileWrite($sNewFilePath, $sNewRtf)
            Next
        EndIf
    EndFunc
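A note on the "remove page break" step used throughout this thread: `StringInStr` returns the 1-based position of the backslash in `\page`, and `StringLeft($s, $n)` keeps the character at position `$n`. Cutting at the returned position therefore keeps that backslash, while cutting at the position minus 1 drops it. The sketch below isolates this step; `_CutBeforeLastPageBreak` is a hypothetical helper name, and it assumes that any control word starting with `\page` (for example `\pagebb`) is an acceptable cut point.

```autoit
; Hypothetical helper (assumption, not from the thread): returns $sRtf without
; its last "\page" control word and everything after it.
Func _CutBeforeLastPageBreak($sRtf)
    ; casesense = 1 because RTF control words are case sensitive;
    ; occurrence = -1 searches from the right, so this finds the last "\page".
    Local $iPos = StringInStr($sRtf, "\page", 1, -1)

    ; No page break: return the fragment unchanged.
    If $iPos = 0 Then Return $sRtf

    ; $iPos points at the backslash, so keep only the characters before it.
    Return StringLeft($sRtf, $iPos - 1)
EndFunc   ;==>_CutBeforeLastPageBreak

; Usage: the result ends just before "\page", including the space after "\par".
ConsoleWrite("[" & _CutBeforeLastPageBreak("first\par \page second") & "]" & @CRLF) ; prints [first\par ]
```

How many closing braces to append after the cut depends on how many groups the header leaves open, so that part is left out of the sketch.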