How can I get the headings from a MS Word document?


Solved by water

How can I get a list of all the headings in a Microsoft Word document by using AutoIt?

I tried:

#include <Word.au3>
#include <MsgBoxConstants.au3>

Global Const $wdRefTypeHeading = 1 ; Heading

$Headings = $oDoc.GetCrossReferenceItems($wdRefTypeHeading)
$Count = UBound($Headings)

MsgBox($MB_SYSTEMMODAL, "Debug", $Count)

But it did not work well.

For example, it returns only one heading from my rich document, which has many headings!

I also tried this:

#include <Word.au3>
#include <MsgBoxConstants.au3>

$Count = $oDoc.Paragraphs.Count

For $i = 0 To $Count - 1
      $oRange = _Word_DocRangeSet($oDoc, -1, $wdParagraph, $i, $wdParagraph, 1)

      If StringInStr($oRange.text, "Header Text") Then
         MsgBox($MB_SYSTEMMODAL, "Debug", $oRange.Style)
      EndIf
Next

And this:

#include <Word.au3>
#include <MsgBoxConstants.au3>

$Count = $oDoc.Paragraphs.Count

For $i = 0 To $Count - 1
      $oRange = _Word_DocRangeSet($oDoc, -1, $wdSentence, $i, $wdSentence, 1)

      If StringInStr($oRange.text, "Header Text") Then
         MsgBox($MB_SYSTEMMODAL, "Debug", $oRange.Style)
      EndIf
Next 

But the Range.Style property didn't work in AutoIt.

Could someone help me get a list of all the headings in a Word document?

#include <Word.au3> 
#include <MsgBoxConstants.au3>
Global Const $wdRefTypeHeading = 1 ; Heading 
$Headings = $oDoc.GetCrossReferenceItems($wdRefTypeHeading)
$Count = UBound($Headings)
MsgBox($MB_SYSTEMMODAL, "Debug", $Count)

Where do you set $oDoc in this code?

My UDFs and Tutorials:

Spoiler

UDFs:
Active Directory (NEW 2024-07-28 - Version 1.6.3.0) - Download - General Help & Support - Example Scripts - Wiki
ExcelChart (2017-07-21 - Version 0.4.0.1) - Download - General Help & Support - Example Scripts
OutlookEX (2021-11-16 - Version 1.7.0.0) - Download - General Help & Support - Example Scripts - Wiki
OutlookEX_GUI (2021-04-13 - Version 1.4.0.0) - Download
Outlook Tools (2019-07-22 - Version 0.6.0.0) - Download - General Help & Support - Wiki
PowerPoint (2021-08-31 - Version 1.5.0.0) - Download - General Help & Support - Example Scripts - Wiki
Task Scheduler (2022-07-28 - Version 1.6.0.1) - Download - General Help & Support - Wiki

Standard UDFs:
Excel - Example Scripts - Wiki
Word - Wiki

Tutorials:
ADO - Wiki
WebDriver - Wiki

 


I get it by the 1-based index of the Word document:

$oWord = _Word_Create()
$oDoc = _Word_DocGet($oWord, 1)

Actually, it works properly on a very simple Word document.

Maybe the Document.GetCrossReferenceItems method treats the headings in my large, rich document as having the styles of their parent styles, such as numbered items, because the headings in my document also have other styles applied.


I will test as soon as I'm in my office again.


The document you posted defines new heading styles (they are named _Headingx - note the leading "_"). That's why the GetCrossReferenceItems method doesn't list these "headings".


This script searches for non-standard styles. Unfortunately you can't use wildcards to search for styles. You need to specify each style individually.

#include <Word.au3>
$oWord = _Word_Create()
Global $sDocument = @ScriptDir & "\Beginning_sample.docx"
$oDoc = _Word_DocOpen($oWord, $sDocument)
; Find first "_Heading 1"
$oRangeFound = _Word_DocFind($oDoc, Default, Default, Default, Default, Default, Default, Default, Default, Default, "_Heading 1")
ConsoleWrite($oRangeFound.Text & @LF)
; Find all "_Heading 1" till end of document
While 1
    $oRangeFound = _Word_DocFindEX($oDoc, Default, Default, $oRangeFound, Default, Default, Default, Default, Default, Default, "_Heading 1")
    If @error <> 0 Then ExitLoop
    ConsoleWrite($oRangeFound.Text & @LF)
WEnd
_Word_DocClose($oDoc)
_Word_Quit($oWord)

Func _Word_DocFindEX($oDoc, $sFindText = Default, $vSearchRange = Default, $oFindRange = Default, $bForward = Default, $bMatchCase = Default, $bMatchWholeWord = Default, $bMatchWildcards = Default, $bMatchSoundsLike = Default, $bMatchAllWordForms = Default, $vFormat = Default)
    Local $bFormat = False
    If $sFindText = Default Then $sFindText = ""
    If $vSearchRange = Default Then $vSearchRange = 0
    If $bForward = Default Then $bForward = True
    If $bMatchCase = Default Then $bMatchCase = False
    If $bMatchWholeWord = Default Then $bMatchWholeWord = False
    If $bMatchWildcards = Default Then $bMatchWildcards = False
    If $bMatchSoundsLike = Default Then $bMatchSoundsLike = False
    If $bMatchAllWordForms = Default Then $bMatchAllWordForms = False
    If Not IsObj($oDoc) Then Return SetError(1, 0, 0)
    Switch $vSearchRange
        Case -1
            $vSearchRange = $oDoc.Application.Selection.Range
        Case 0
            $vSearchRange = $oDoc.Range()
        Case Else
            If Not IsObj($vSearchRange) Then Return SetError(2, 0, 0)
    EndSwitch
    If $oFindRange = Default Then
        $oFindRange = $vSearchRange.Duplicate()
    Else
        If Not IsObj($oFindRange) Then Return SetError(3, 0, 0)
        If $bForward = True Then
            $oFindRange.Start = $oFindRange.End ; Search forward
            $oFindRange.End = $vSearchRange.End
        Else
            $oFindRange.End = $oFindRange.Start ; Search backward
            $oFindRange.Start = $vSearchRange.Start
        EndIf
    EndIf
    $oFindRange.Find.ClearFormatting()
    If $vFormat <> Default Then
        $bFormat = True
        $oFindRange.Find.Style = $vFormat
    EndIf
    $oFindRange.Find.Execute($sFindText, $bMatchCase, $bMatchWholeWord, $bMatchWildcards, $bMatchSoundsLike, _
            $bMatchAllWordForms, $bForward, $WdFindStop, $bFormat)
    If @error Or Not $oFindRange.Find.Found Then Return SetError(4, 0, 0)
    Return $oFindRange
EndFunc   ;==>_Word_DocFindEX
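
Since each style has to be named individually, one way to cover several custom styles is to repeat the search once per style name. The following is only a sketch under assumptions, not the helper above: it calls Word's Find object directly, "_Heading 2" is a hypothetical second style name (only "_Heading 1" is confirmed for the sample document), and the two constants are declared locally under illustrative names with the documented Word values.

```autoit
#include <Word.au3>

; Values from the Word object model, declared locally under hypothetical names
Local Const $iFindStop = 0    ; wdFindStop: stop at the end of the search range
Local Const $iCollapseEnd = 0 ; wdCollapseEnd: collapse a range to its end

; Assumption: only "_Heading 1" is confirmed; "_Heading 2" is a placeholder
Local $aStyles[2] = ["_Heading 1", "_Heading 2"]

Local $oWord = _Word_Create()
If @error Then Exit 1
Local $oDoc = _Word_DocOpen($oWord, @ScriptDir & "\Beginning_sample.docx")
If @error Then
    _Word_Quit($oWord)
    Exit 2
EndIf

For $sStyle In $aStyles
    Local $oRange = $oDoc.Range() ; start each search from the whole document
    $oRange.Find.ClearFormatting()
    $oRange.Find.Style = $sStyle
    If @error Then ContinueLoop ; assumed: setting a style missing from the document fails
    Local $iLastStart = -1
    ; Execute(FindText, MatchCase, MatchWholeWord, MatchWildcards, MatchSoundsLike,
    ;         MatchAllWordForms, Forward, Wrap, Format)
    While $oRange.Find.Execute("", False, False, False, False, False, True, $iFindStop, True)
        If $oRange.Start <= $iLastStart Then ExitLoop ; guard: the match did not advance
        $iLastStart = $oRange.Start
        ConsoleWrite($sStyle & ": " & StringStripWS($oRange.Text, 3) & @LF)
        $oRange.Collapse($iCollapseEnd) ; continue searching after this match
    WEnd
Next

_Word_DocClose($oDoc)
_Word_Quit($oWord)
```

Adjacent paragraphs that share a style may come back as a single match, so the printed text can span more than one heading.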


How did you find out the names of the heading styles?

And when I test this code with my posted document:

$oRangeFound = _Word_DocFind($oDoc, Default, Default, Default, Default, Default, Default, Default, Default, Default, "_Heading 1")
ConsoleWrite($oRangeFound.Text & @LF)

It produces an error:

==> Variable must be of type "Object".:
ConsoleWrite($oRangeFound.Text & @LF)
ConsoleWrite($oRangeFound^ ERROR

  • Solution

I opened the document and checked which style the heading uses (it is displayed in the ribbon).

My bad. In the first search call you need to replace "_Word_DocFind" with "_Word_DocFindEx".

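As an alternative to reading the style name off the ribbon, the names could probably be listed from a script. This is a minimal sketch, assuming the sample file name used earlier in the thread; it prints each distinct paragraph style once, together with the outline level of the first paragraph that uses it. The Style property returns an object rather than a string, so the sketch reads its NameLocal property.

```autoit
#include <Word.au3>

Local $oWord = _Word_Create()
If @error Then Exit 1
Local $oDoc = _Word_DocOpen($oWord, @ScriptDir & "\Beginning_sample.docx")
If @error Then
    _Word_Quit($oWord)
    Exit 2
EndIf

; Remember which style names were already printed
Local $oSeen = ObjCreate("Scripting.Dictionary")
Local $sStyle

For $oPara In $oDoc.Paragraphs
    $sStyle = $oPara.Style.NameLocal ; .Style is a Style object, NameLocal is its name
    If Not $oSeen.Exists($sStyle) Then
        $oSeen.Add($sStyle, $oPara.OutlineLevel)
        ; Outline level 10 is body text; lower numbers are heading levels
        ConsoleWrite($sStyle & " (outline level " & $oPara.OutlineLevel & ")" & @LF)
    EndIf
Next

_Word_DocClose($oDoc)
_Word_Quit($oWord)
```

Walking every paragraph can be slow on a large document, and a custom style only shows an outline level below 10 if its author assigned one.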