
Reading XML content and generating new tags in another XML file



Hi Experts,

Hope you're having a good day today.😊

I'm having trouble reading the unusual XML element below. I searched the forum and found a lot of topics, but I can't figure out how to handle my case.

This is the sample XML that I need to extract from "Test.xml". I need to retag it and then paste it into my other XML file, "Other.xml".

<BY><PN><SN>&NA;</SN></PN><BT>
<P>An event is serious (based on the ICH definition) when the patient outcome is&colon;</P><LS T="B"><LM>
<P>death</P></LM><LM>
<P>life-threatening</P></LM><LM>
<P>hospitalisation</P></LM><LM>
<P>disability</P></LM><LM>
<P>congenital anomaly</P></LM><LM>
<P>other medically important event</P></LM></LS></BT></BY>

 

This is the output I'm expecting in "Other.xml"; it should be pasted before the <body> element.

<Para Type="Attribute"><Emphasis Type="Bold">Information</Emphasis>
<UnorderedList Mark="None">
<Heading>An event is serious (based on the ICH definition) when the patient outcome is:</Heading>
<ItemContent><Para>* death</Para></ItemContent>
<ItemContent><Para>* life-threatening</Para></ItemContent>
<ItemContent><Para>* hospitalisation</Para></ItemContent>
<ItemContent><Para>* disability</Para></ItemContent>
<ItemContent><Para>* congenital anomaly</Para></ItemContent>
<ItemContent><Para>* other medically important event</Para></ItemContent>
</UnorderedList>
</Para>

 

I found the code below on the forum, but it doesn't get what I need when I run it against my Test.xml.

_Example()

Func _Example()
    Local $oErrorHandler = ObjEvent("AutoIt.Error")
    Local $oXML = ObjCreate("Microsoft.XMLDOM")
    $oXML.setProperty("SelectionLanguage", "XPath")

    $oXML.load(@ScriptDir & "\Test.xml")
    Local $oSoftware_enum = Null
    Local $oSoftware_coll = $oXML.selectNodes("//BY/BT") ;here is the trigger element that I need to get
    Local $oChilds_coll = Null
    For $oSoftware_enum In $oSoftware_coll
        $oChilds_coll = $oSoftware_enum.childNodes
        For $oChild_enum In $oChilds_coll
            ConsoleWrite("This spits out a glob of squished together info: " & $oChild_enum.text & @CRLF)
        Next
    Next
EndFunc   ;==>_Example

 

Just let me know if you need anything, Experts. Thank you in advance.☺️

 

KS15

Programming is "To make it so simple that there are obviously no deficiencies" or "To make it so complicated that there are no obvious deficiencies" by C.A.R. Hoare.


Assuming input and output format are static.

I would almost just have a template to paste into: take your source, strip everything in angle brackets <>, then paste each line, in order, into the template where the data positions are ready.

This is just an example of the concept. I did not work out all the kinks; I would probably write to a file rather than keep the template in the code itself, and use string replace with placeholder values.

#Include <File.au3>
#Include <Array.au3>

$aFile = FileReadtoArray("source.txt")
_ArrayDisplay($aFile)

For $i = 0 to UBound($aFile) -1
    $aFile[$i] = StringRegExpReplace($aFile[$i], "(.*?)<.*?>", "$1")
Next

_ArrayDisplay($aFile)

; After the tags are stripped, $aFile[0] holds "&NA;", $aFile[1] the heading
; and $aFile[2] to $aFile[7] the six list items of the sample.
MsgBox(0, "", "" _
& '<Para Type="Attribute"><Emphasis Type="Bold">Information</Emphasis>' _
& '<UnorderedList Mark="None">' _
& '<Heading>' & $aFile[1] & '</Heading>' _
& '<ItemContent><Para>' & $aFile[2] & '</Para></ItemContent>' _
& '<ItemContent><Para>' & $aFile[3] & '</Para></ItemContent>' _
& '<ItemContent><Para>' & $aFile[4] & '</Para></ItemContent>' _
& '<ItemContent><Para>' & $aFile[5] & '</Para></ItemContent>' _
& '<ItemContent><Para>' & $aFile[6] & '</Para></ItemContent>' _
& '<ItemContent><Para>' & $aFile[7] & '</Para></ItemContent>' _
& '</UnorderedList>' _
& '</Para>')

 

Source.txt


@ViciousXUSMC,

Thanks, it does work if Test.xml contains only these elements. My apologies for the lack of information: these elements are just a snippet of the whole XML found in our Test.xml. See the sample of the whole XML further below.✌️😄

<BY><PN><SN>&NA;</SN></PN><BT>
<P>An event is serious (based on the ICH definition) when the patient outcome is&colon;</P><LS T="B"><LM>
<P>death</P></LM><LM>
<P>life-threatening</P></LM><LM>
<P>hospitalisation</P></LM><LM>
<P>disability</P></LM><LM>
<P>congenital anomaly</P></LM><LM>
<P>other medically important event</P></LM></LS></BT></BY>

 

This is what the whole XML looks like. I added "; ==>" on the lines marking what I need to get from Test.xml.☺️ Note that the number of <P> elements within the parent tag <BT> varies: there might be only 1 or 2 <P>, or more than in the sample below. I only need the content from the opening <BT> up to the closing </BT>.🤤 Also, Test.xml sometimes doesn't have these elements at all, so they are optional: if the elements exist, do the conversion; if not, exit.

<!DOCTYPE dg SYSTEM "ovidbase.dtd">
<DG><D V="2000.3F" AN="0128415-201917710-00019">
<BB>
<TG><TI>Some title here..</TI><STI>Some title here..</STI></TG>
<BY><PN><SN>&NA;</SN></PN><BT>; ==> This is the start tag that I need to get
<P>An event is serious (based on the ICH definition) when the patient outcome is&colon;</P><LS T="B"><LM>
<P>death</P></LM><LM>
<P>life-threatening</P></LM><LM>
<P>hospitalisation</P></LM><LM>
<P>disability</P></LM><LM>
<P>congenital anomaly</P></LM><LM>
<P>other medically important event</P></LM></LS></BT></BY> ; ==> this is the end tag
<SO><PB>Some Text here...</PB><ISN>XXX-XXXX</ISN><DA><DY>XX</DY><MO>XXXX</MO><YR>XXXX</YR></DA><V>XXXX</V><IS><IP>XX</IP></IS><PG>XX</PG></SO>
<DT>Some heading here...</DT>
</BB>
<BD>
<LV1>
<P>Some sentences here....</P>
<P>Some sentences here....</P>
<P>Some sentences here....</P>
</LV1></BD>
<ED><EDS>
<HD>Some Heading here...</HD>
<RF ID="R1-19">1. <URF>Some reference list here...</URF></RF>
</EDS></ED>
<KWS><HD>Heading</HD><KW>Text here</KW><KW>Text here</KW><KW>Text here</KW><KW>Text here</KW><KW>Text here</KW></KWS>
</D></DG>

 

Once I get the elements I need from Test.xml, I need to convert them to the new element tags below and paste them into my new XML. For now, though, I just need to get the elements from Test.xml and convert them as below.

<Para Type="Attribute"><Emphasis Type="Bold">Information</Emphasis>
<UnorderedList Mark="None">
<Heading>An event is serious (based on the ICH definition) when the patient outcome is:</Heading>
<ItemContent><Para>* death</Para></ItemContent>
<ItemContent><Para>* life-threatening</Para></ItemContent>
<ItemContent><Para>* hospitalisation</Para></ItemContent>
<ItemContent><Para>* disability</Para></ItemContent>
<ItemContent><Para>* congenital anomaly</Para></ItemContent>
<ItemContent><Para>* other medically important event</Para></ItemContent>
</UnorderedList>
</Para>

 


1 hour ago, KickStarter15 said:

Please can someone help me on this.😢

Since you are basically just extracting values from the original XML file and placing those values in different tags, you could probably just do a transform using XSLT.  The XML UDF has a _XML_Transform() function that will do the transformation.  If you don't know what XSLT is, then you can look it up and see if you want to spend the time to learn it.  Otherwise, you can either parse out the information manually using the XML UDF or using something like regular expressions and/or StringBetween functions.

If you are stuck on a particular step or part of YOUR task, then maybe you can ask a more specific question so that you can get past that hurdle and move on with YOUR project.  I would be happy to help you but you need to be a little more specific in terms of what part of your task is presenting a problem for you.  If your issue is that you don't understand XML, then you need to spend more time learning it.  If you have an AutoIt-related issue, I'm sure that I, or numerous others on the forum, can help.  If you are looking for someone to just give you a fully working solution to YOUR little mini project, then I'll leave that to someone else.

 

By the way, the XML that you posted isn't valid XML.  It needs to be cleaned up before you can use the xml UDF on it because I'm sure that you will get parsing errors when trying to load it.
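
To see what the parser actually objects to, the load result can be checked directly. This is only a minimal sketch using the same "Microsoft.XMLDOM" object as the first post; it assumes Test.xml sits next to the script. Entities such as &NA; and &colon; are not predefined in XML, so unless the referenced DTD is available and declares them, the load is likely to fail and report the reason.

```autoit
; Sketch: report why MSXML refuses to load the file (file name assumed from the thread).
Local $oXML = ObjCreate("Microsoft.XMLDOM")
If Not IsObj($oXML) Then Exit 1
$oXML.async = False ; load synchronously so the result is available immediately

If $oXML.load(@ScriptDir & "\Test.xml") Then
    ConsoleWrite("Loaded without parse errors." & @CRLF)
Else
    ; parseError describes the first problem the parser ran into
    Local $oErr = $oXML.parseError
    ConsoleWrite("Parse error " & $oErr.errorCode & " on line " & $oErr.line & ": " & $oErr.reason & @CRLF)
EndIf
```

If the load fails, selectNodes() has nothing to work on, which would explain why the DOM-based code in the first post returns no usable data.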


@TheXman, thanks for your response. I've already worked out my own way to read my XML source and get the XML content. With the code below, my unusual XML elements can be read; however, a lot of converting still needs to be done.

#include <String.au3>

$sNewData = @ScriptDir & "\Testing.xml"
$sXMLFile = @ScriptDir & "\Test.xml"

$sXML = FileRead($sXMLFile)
$avData = _StringBetween($sXML, "<BT>", "</BT>")
If @error Then Exit ; the <BT> block is optional, so stop when it is missing
MsgBox(0,"", $avData[0])

$hXMLFile = FileOpen($sNewData, 2) ; 2 = Overwrite
FileWrite($hXMLFile, $avData[0])
FileClose($hXMLFile)

 

Then, with FileReadToArray() and _ArrayToString(), I can extract my XML node from Test.xml. After that, a small StringRegExpReplace() removes all the tags (everything from an opening "<" to a closing ">") from the extracted XML node and gives me this result:

image.png.5bb40c668e97c454b0680b1bf0641241.png

 

Maybe from this point I can start my insertion of new elements like the one I posted in my first post and tag these entries.

27 minutes ago, TheXman said:

By the way, the XML that you posted isn't valid XML.  It needs to be cleaned up before you can use the xml UDF on it because I'm sure that you will get parsing errors when trying to load it.

Yup, it really is invalid XML, because it comes from an unusual XML source, but I only need to read some data from it and convert that content into valid XML (like the one in my first post).

 

I am still looking for a better way in solving this project.

 


As TheXman said, if your XML is not valid, I personally see no option but a brute-force approach: parse it with regex or similar, and then format the results as Vicious suggested.

$txt = "<!DOCTYPE dg SYSTEM ""ovidbase.dtd"">" & @crlf & _ 
    "<DG><D V=""2000.3F"" AN=""0128415-201917710-00019"">" & @crlf & _ 
    "<BB>" & @crlf & _ 
    "<TG><TI>Some title here..</TI><STI>Some title here..</STI></TG>" & @crlf & _ 
    "<BY><PN><SN>&NA;</SN></PN><BT>; ==> This is the start tag that I need to get" & @crlf & _ 
    "<P>An event is serious (based on the ICH definition) when the patient outcome is&colon;</P><LS T=""B""><LM>" & @crlf & _ 
    "<P>death</P></LM><LM>" & @crlf & _ 
    "<P>life-threatening</P></LM><LM>" & @crlf & _ 
    "<P>hospitalisation</P></LM><LM>" & @crlf & _ 
    "<P>disability</P></LM><LM>" & @crlf & _ 
    "<P>congenital anomaly</P></LM><LM>" & @crlf & _ 
    "<P>other medically important event</P></LM></LS></BT></BY> ; ==> this is the end tag" & @crlf & _ 
    "<SO><PB>Some Text here...</PB><ISN>XXX-XXXX</ISN><DA><DY>XX</DY><MO>XXXX</MO><YR>XXXX</YR></DA><V>XXXX</V><IS><IP>XX</IP></IS><PG>XX</PG></SO>" & @crlf & _ 
    "<DT>Some heading here...</DT>" & @crlf & _ 
    "</BB>" & @crlf & _ 
    "<BD>" & @crlf & _ 
    "<LV1>" & @crlf & _ 
    "<P>Some sentences here....</P>" & @crlf & _ 
    "<P>Some sentences here....</P>" & @crlf & _ 
    "<P>Some sentences here....</P>" & @crlf & _ 
    "</LV1></BD>" & @crlf & _ 
    "<ED><EDS>" & @crlf & _ 
    "<HD>Some Heading here...</HD>" & @crlf & _ 
    "<RF ID=""R1-19"">1. <URF>Some reference list here...</URF></RF>" & @crlf & _ 
    "</EDS></ED>" & @crlf & _ 
    "<KWS><HD>Heading</HD><KW>Text here</KW><KW>Text here</KW><KW>Text here</KW><KW>Text here</KW><KW>Text here</KW></KWS>" & @crlf & _ 
    "</D></DG>"
 ;Msgbox(0,"", $txt)

#Include <Array.au3>

$heading = StringRegExpReplace($txt, '(?s).*<BT>.*?<P>([^<]+).*', "$1")
Msgbox(0,"heading", $heading)

$items_list = StringRegExpReplace($txt, '(?s).*<LS(.*?)</LS>.*', "$1")
$items = StringRegExp($items_list, '<P>([^<]+)', 3)
_ArrayDisplay($items)
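
The two steps (extract, then format into the template) can be combined so that any number of <P> items is handled and a missing <BT> block is treated as optional. The following is only a sketch: _BuildInfoBlock() is a hypothetical helper name, not part of any UDF, and it assumes the first <P> inside <BT> is always the heading and that &colon; should become ":" as in the expected output.

```autoit
#include <StringConstants.au3>

; Hypothetical helper: returns the new <Para> block, or "" when no <BT> block exists.
Func _BuildInfoBlock($sXML)
    ; Everything between <BT> and </BT>; no match means the optional block is absent.
    Local $aBT = StringRegExp($sXML, '(?s)<BT>(.*?)</BT>', $STR_REGEXPARRAYMATCH)
    If @error Then Return ""

    ; Every <P> inside <BT>: the first is taken as the heading, the rest as list items.
    Local $aP = StringRegExp($aBT[0], '<P>([^<]+)', $STR_REGEXPARRAYGLOBALMATCH)
    If @error Then Return ""

    Local $sOut = '<Para Type="Attribute"><Emphasis Type="Bold">Information</Emphasis>' & @CRLF
    $sOut &= '<UnorderedList Mark="None">' & @CRLF
    $sOut &= '<Heading>' & StringReplace($aP[0], '&colon;', ':') & '</Heading>' & @CRLF
    For $i = 1 To UBound($aP) - 1
        $sOut &= '<ItemContent><Para>* ' & $aP[$i] & '</Para></ItemContent>' & @CRLF
    Next
    $sOut &= '</UnorderedList>' & @CRLF & '</Para>'
    Return $sOut
EndFunc   ;==>_BuildInfoBlock

; Usage: an empty result means there is nothing to convert.
Local $sBlock = _BuildInfoBlock(FileRead(@ScriptDir & "\Test.xml"))
If $sBlock = "" Then Exit
ConsoleWrite($sBlock & @CRLF)
```

Because the loop runs over however many <P> matches were found, the same code covers a list with one item or with many.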

 


Thanks, @mikell... It's working as expected. I've now extracted the information and converted it into the tag below. My question is: how can I insert this new tag into the open (active) XML file?

<Para Type="Attribute"><Emphasis Type="Bold">Information</Emphasis>
<UnorderedList Mark="None">
<Heading>An event is serious (based on the ICH definition) when the patient outcome is:</Heading>
<ItemContent><Para>* death</Para></ItemContent>
<ItemContent><Para>* life-threatening</Para></ItemContent>
<ItemContent><Para>* hospitalisation</Para></ItemContent>
<ItemContent><Para>* disability</Para></ItemContent>
<ItemContent><Para>* congenital anomaly</Para></ItemContent>
<ItemContent><Para>* other medically important event</Para></ItemContent>
</UnorderedList>
</Para>

My XML file is open in Notepad++, so I need to get the path of the active XML file and then insert the tag above to produce the expected output below. Do you have any idea how to do that, please?😄 The trigger for where to insert my newly created XML tag is the "<Body>" element, circled in green below.

image.png.3fbdf59b3631dbbb19ec42fd3152d732.png

 

Edited by KickStarter15

I can't think of a way right now to insert these new tags into an open XML file. In a batch script, something like -x "$(FULL_CURRENT_PATH)" gets the path of the current XML file, but how can I do this in AutoIt, Experts? Can someone please advise?

Thanks, Experts.

@Nine, thanks, but can you explain what the expected output of your suggestion would be?

25 minutes ago, Nine said:

Local $aItem = StringRegExp(_StringBetween ($txt, "<BT>","</BT>",$STR_ENDNOTSTART)[0],'<P>([^<]+)', $STR_REGEXPARRAYGLOBALMATCH)
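
Unpacked into separate steps, the one-liner does this: _StringBetween() returns the text between <BT> and </BT>, and StringRegExp() then collects the text that follows every <P> inside it. The sketch below uses a shortened, made-up sample string (not the full Test.xml) to show the result.

```autoit
#include <Array.au3>
#include <String.au3>
#include <StringConstants.au3>

; Shortened sample for illustration only.
Local $txt = '<BT>' & @CRLF & _
        '<P>Heading&colon;</P><LS T="B"><LM>' & @CRLF & _
        '<P>death</P></LM><LM>' & @CRLF & _
        '<P>disability</P></LM></LS></BT>'

; Step 1: the text between <BT> and </BT>; @error is set when the block is missing.
Local $aBetween = _StringBetween($txt, "<BT>", "</BT>", $STR_ENDNOTSTART)
If @error Then Exit

; Step 2: one array element per <P>, holding the text up to the next "<".
Local $aItem = StringRegExp($aBetween[0], '<P>([^<]+)', $STR_REGEXPARRAYGLOBALMATCH)
_ArrayDisplay($aItem)
```

For this sample, $aItem holds three elements: "Heading&colon;", "death" and "disability". In other words, element 0 is the heading and the remaining elements are the list items.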

 

@Nine, yup, got it. But is there a way to insert the newly generated tag below into an open XML file?

<Para Type="Attribute"><Emphasis Type="Bold">Information</Emphasis>
<UnorderedList Mark="None">
<Heading>An event is serious (based on the ICH definition) when the patient outcome is:</Heading>
<ItemContent><Para>* death</Para></ItemContent>
<ItemContent><Para>* life-threatening</Para></ItemContent>
<ItemContent><Para>* hospitalisation</Para></ItemContent>
<ItemContent><Para>* disability</Para></ItemContent>
<ItemContent><Para>* congenital anomaly</Para></ItemContent>
<ItemContent><Para>* other medically important event</Para></ItemContent>
</UnorderedList>
</Para>

I need to insert the above XML tag into an open XML file, as in the expected output below.

image.png.5a5c10de7762cc04bf13623cedd46b5f.png


1 hour ago, KickStarter15 said:

I need to insert the above XML tag into an open XML file like

I don't understand your issue, since you are creating the output file yourself...


@Nine, maybe there is just a little confusion.

In my posts above, I already got the information I need from the source XML, which is shown in the screenshot below.

image.png.67a83d1088eab03e3eec3ceaa65cbeb3.png

 

Now, that information from my source XML needs to be tagged like the sample below, and I've worked out how to tag it into this:

<Para Type="Attribute"><Emphasis Type="Bold">Information</Emphasis>
<UnorderedList Mark="None">
<Heading>An event is serious (based on the ICH definition) when the patient outcome is:</Heading>
<ItemContent><Para>* death</Para></ItemContent>
<ItemContent><Para>* life-threatening</Para></ItemContent>
<ItemContent><Para>* hospitalisation</Para></ItemContent>
<ItemContent><Para>* disability</Para></ItemContent>
<ItemContent><Para>* congenital anomaly</Para></ItemContent>
<ItemContent><Para>* other medically important event</Para></ItemContent>
</UnorderedList>
</Para>

 

My question is: can the newly created tag above be inserted into the open XML file? My XML file is open in Notepad++, and I need to insert the newly created tag into that open file. Would that be possible?

 

Let me know if it's still not clear enough, @Nine, so that I can explain further. ☺️ Thanks.


Hello Experts,

I have the code below, which inserts into the XML file when the path is declared. But in my case, Output.xml is open in Notepad++, and I need to insert my new XML tags into that active/open XML file.

#include <Array.au3>
#include <File.au3>
#include <MsgBoxConstants.au3>

$sFilePath = @ScriptDir & '\Testing.xml'
Local $hFileOpen = FileOpen($sFilePath, $FO_READ)
If $hFileOpen = -1 Then
    MsgBox($MB_SYSTEMMODAL, "", "An error occurred when reading the file.")
    Exit 1 ; Return is only valid inside a function
EndIf
Local $sFileRead = FileRead($hFileOpen)
FileClose($hFileOpen)
_insertLineToFile(@ScriptDir & '\Output.xml', '<Body>', $sFileRead) ; Here's my concern $filePath, how can I get the path without declaring it to this line.

Func _insertLineToFile($filePath, $after, $insertText)
    Local $lines
    If Not _FileReadToArray($filePath, $lines) Then Return -1
    Local $iIndex = _ArraySearch($lines, $after, 1) ; exact match on a whole line, starting at element 1
    If $iIndex = -1 Then Return -3 ; marker line not found, leave the file untouched
    _ArrayInsert($lines, $iIndex + 1, $insertText)
    If _FileWriteFromArray($filePath, $lines, 1) Then Return 1
    Return -2
EndFunc   ;==>_insertLineToFile

To explain further, my "Testing.xml" contains the tagging below.

<Para Type="Attribute"><Emphasis Type="Bold">Information</Emphasis>
<UnorderedList Mark="None">
<Heading>An event is serious (based on the ICH definition) when the patient outcome is:</Heading>
<ItemContent><Para>* death</Para></ItemContent>
<ItemContent><Para>* life-threatening</Para></ItemContent>
<ItemContent><Para>* hospitalisation</Para></ItemContent>
<ItemContent><Para>* disability</Para></ItemContent>
<ItemContent><Para>* congenital anomaly</Para></ItemContent>
<ItemContent><Para>* other medically important event</Para></ItemContent>
</UnorderedList>
</Para>

This tagging will be inserted into my Output.xml, which goes:

From this:

image.png.3c63fa07eb4b0d0755501e5410f7b6e0.png

 

To this:

image.png.53ed2f1e83d1a13e88c9cfc0dfc7d262.png

 

Now my problem is that I can't get the path of my open XML file in order to insert the tagging into my Output.xml. Can anyone here help me find out how to get the path of an open XML file?🙁 Please, Experts...

 

I have attached my sample "Testing.xml" and my sample "Output.xml" for your reference.

 

Testing.xml Output.xml

Edited by KickStarter15


I think I got it, Experts.

After some analysis, I managed to get the path of the open XML file by using the "current full path to clipboard" command in Notepad++, triggered through a macro shortcut.

I then use ClipGet() to read the path stored in the clipboard. The code below is working as expected.😄

Send("!a") ; using shortcut keys
Local $sData = ClipGet() ; getting the stored clipboard path of active XML file 
Local $sFileOpenXML = $sData ; path
MsgBox(64, "XML Path", $sFileOpenXML) ; output string to get.
Exit

What do you think, Experts: is this fine, or is there another way to get the path of the open XML file?🤤
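
One possible alternative that avoids sending keystrokes and overwriting the clipboard is to read the Notepad++ title bar. This is only a sketch under stated assumptions: the main window class is "Notepad++", and the title has the default form "<full path> - Notepad++", with a leading "*" when the document has unsaved changes. If Notepad++ is configured to show only the file name in the title, the FileExists() check below will fail and the ClipGet() approach remains the fallback.

```autoit
; Assumption: default Notepad++ title format "<full path> - Notepad++".
Local $sTitle = WinGetTitle("[CLASS:Notepad++]")
If $sTitle = "" Then
    MsgBox(16, "XML Path", "No Notepad++ window found.")
    Exit 1
EndIf

; Drop the optional "*" prefix and everything from " - Notepad++" onwards.
Local $sPath = StringRegExpReplace($sTitle, '^\*?(.*) - Notepad\+\+.*$', '$1')

If FileExists($sPath) Then
    MsgBox(64, "XML Path", $sPath)
Else
    MsgBox(48, "XML Path", "Could not derive a file path from: " & $sTitle)
EndIf
```

Whichever way the path is obtained, writing to a file that is open in the editor changes it on disk only; any unsaved edits in Notepad++ should be saved first, and the editor will typically offer to reload the file afterwards.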

Edited by KickStarter15
