rudi Posted April 29, 2019 (edited)

Hello,

doing some initial searching I found that there is an XML.AU3 UDF available.

The XML data come from a web services interface of a PLC (retrieving these data from http://10.27.20.101:8080/user/errors is no problem).

A "no-error-situation" result looks like this:

    <eta version="1.0">
      <errors uri="/user/errors">
        <fub uri="/25/10341" name="Kessel2"/>
        <fub uri="/25/10241" name="Sys2"/>
        <fub uri="/26/10301" name="Kessel"/>
        <fub uri="/24/10341" name="Kessel1"/>
        <fub uri="/24/10241" name="Sys1"/>
        <fub uri="/33/10361" name="Asche2"/>
        <fub uri="/32/10361" name="Asche1"/>
        <fub uri="/123/10251" name="Puffer"/>
        <fub uri="/123/10241" name="AF"/>
      </errors>
    </eta>

An example error XML data set I received from the vendor looks like this:

    <eta xmlns="http://www.eta.co.at/rest/v1" version="1.0">
      <errors uri="/user/errors">
        <fub uri="/121/10441" name="Raum 1.1">
          <error msg="Sicherung VE-C 0 (angeschlossen an GM-C 1) - F1 Zuleitung" priority="Fehler" time="2019-04-18 12:36:02">
            Sicherung am Ventilcontroller 0 defekt oder keine 230VAC-Spannungsversorgung
          </error>
        </fub>
        <fub uri="/120/10601" name="PufferFlex"/>
        <fub uri="/120/10111" name="WW"/>
        <fub uri="/120/10101" name="HK"/>
        <fub uri="/120/10481" name="ER-HK"/>
      </errors>
    </eta>

The error node URL specifically for /121/10441 returns this partial XML content:

    <eta xmlns="http://www.eta.co.at/rest/v1" version="1.0">
      <errors uri="/user/errors/121/10441">
        <fub uri="/121/10441" name="Raum 1.1">
          <error msg="Sicherung VE-C 0 (angeschlossen an GM-C 1) - F1 Zuleitung" priority="Fehler" time="2019-04-18 12:36:02">
            Sicherung am Ventilcontroller 0 defekt oder keine 230VAC-Spannungsversorgung
          </error>
        </fub>
      </errors>
    </eta>

The data I need to extract from the error example would be:

    $Location="Raum 1.1"
    $Device="Sicherung VE-C 0 (angeschlossen an GM-C 1) - F1 Zuleitung"
    $Status="Fehler"
    $Time="2019-04-18 12:36:02"
    $Message="Sicherung am Ventilcontroller 0 defekt oder keine 230VAC-Spannungsversorgung"

What's the best approach to grab only the error information from the XML content returned as shown above?

TIA, Rudi.

Edited April 29, 2019 by rudi

Earth is flat, pigs can fly, and Nuclear Power is SAFE!
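Since the question mentions an XML UDF, here is a minimal sketch of a parser-based alternative to pattern matching. It is not from the thread: it assumes the Windows MSXML COM class `Msxml2.DOMDocument.6.0` is available, and it uses `local-name()` in the XPath so that the default `xmlns` namespace in the vendor sample does not get in the way. The variable names are illustrative and the sample string is shortened from the post above.

```autoit
; Sketch only: parse the PLC response with MSXML instead of a regex.
; Assumption: the Msxml2.DOMDocument.6.0 COM class is installed on this machine.
Local $sXml = '<eta xmlns="http://www.eta.co.at/rest/v1" version="1.0">' & _
        '<errors uri="/user/errors">' & _
        '<fub uri="/121/10441" name="Raum 1.1">' & _
        '<error msg="Sicherung VE-C 0 (angeschlossen an GM-C 1) - F1 Zuleitung" priority="Fehler" time="2019-04-18 12:36:02">' & _
        'Sicherung am Ventilcontroller 0 defekt oder keine 230VAC-Spannungsversorgung' & _
        '</error></fub>' & _
        '<fub uri="/120/10111" name="WW"/>' & _
        '</errors></eta>'

Local $oXml = ObjCreate("Msxml2.DOMDocument.6.0")
If Not IsObj($oXml) Then Exit ConsoleWrite("MSXML is not available" & @CRLF)
$oXml.async = False

If Not $oXml.loadXML($sXml) Then Exit ConsoleWrite("XML could not be parsed" & @CRLF)

; local-name() matches <error> regardless of the default namespace on <eta>.
Local $oErrors = $oXml.selectNodes("//*[local-name()='error']")

; <fub> elements without an <error> child never show up in this list.
For $oError In $oErrors
    Local $oFub = $oError.parentNode
    ConsoleWrite("Location: " & $oFub.getAttribute("name") & @CRLF)
    ConsoleWrite("Device:   " & $oError.getAttribute("msg") & @CRLF)
    ConsoleWrite("Status:   " & $oError.getAttribute("priority") & @CRLF)
    ConsoleWrite("Time:     " & $oError.getAttribute("time") & @CRLF)
    ; Flag 3 strips leading and trailing whitespace from the element text.
    ConsoleWrite("Message:  " & StringStripWS($oError.text, 3) & @CRLF)
Next
```

The trade-off is an extra COM dependency in exchange for not having to rely on attribute order or line layout in the response.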
FrancescoDiMuro Posted April 29, 2019

@rudi You could use an SRE (string regular expression) to extract the fields from your XML file, and then arrange them like this:

    #include <Array.au3>
    #include <StringConstants.au3>

    Global $strFileContent = '<fub uri="/121/10441" name="Raum 1.1">' & @CRLF & _
            '<error msg="Sicherung VE-C 0 (angeschlossen an GM-C 1) - F1 Zuleitung" priority="Fehler" time="2019-04-18 12:36:02">' & _
            'Sicherung am Ventilcontroller 0 defekt oder keine 230VAC-Spannungsversorgung' & @CRLF & _
            '</error>' & @CRLF & _
            '</fub>', _
            $arrResult, _
            $arrMessages[5] = ["Location", "Device", "Status", "Time", "Message"], _
            $arrArray[0][2]

    ; One match yields five captures: name, msg, priority, time and the element text.
    $arrResult = StringRegExp($strFileContent, '(?s)<fub uri="[^"]+" name="([^"]+)">.*?<error msg="([^"]+)" priority="([^"]+)" time="([^"]+)">([^<]+)<\/error>', $STR_REGEXPARRAYGLOBALMATCH)

    If IsArray($arrResult) Then
        For $i = 0 To UBound($arrResult) - 1 Step 1
            ; The StringReplace() is used to replace the @CRLF, which is the delimiter for the Row in _ArrayAdd, with a @CR.
            ; Mod() keeps the label index in range if the input contains more than one error.
            _ArrayAdd($arrArray, $arrMessages[Mod($i, UBound($arrMessages))] & "|" & StringReplace($arrResult[$i], @CRLF, @CR))
            If @error Then Exit ConsoleWrite("_ArrayAdd() ERR: " & @error & @CRLF)
        Next

        _ArrayDisplay($arrArray)
    EndIf
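The five captures can also be read straight into the variables named in the question. This is a minimal sketch, assuming the pattern from the post above (where every match yields exactly five captures). The stride-of-five loop, the whitespace trimming and the `$sXml`, `$sPattern` and `$aHit` names are additions for illustration.

```autoit
#include <StringConstants.au3>

Local $sXml = '<fub uri="/121/10441" name="Raum 1.1">' & @CRLF & _
        '<error msg="Sicherung VE-C 0 (angeschlossen an GM-C 1) - F1 Zuleitung" priority="Fehler" time="2019-04-18 12:36:02">' & @CRLF & _
        'Sicherung am Ventilcontroller 0 defekt oder keine 230VAC-Spannungsversorgung' & @CRLF & _
        '</error>' & @CRLF & _
        '</fub>'

Local $sPattern = '(?s)<fub uri="[^"]+" name="([^"]+)">.*?<error msg="([^"]+)" priority="([^"]+)" time="([^"]+)">([^<]+)</error>'

; The global-match flag returns all captures of all matches in one flat array.
Local $aHit = StringRegExp($sXml, $sPattern, $STR_REGEXPARRAYGLOBALMATCH)
If @error Then Exit ConsoleWrite("No error entries found" & @CRLF)

Local $Location, $Device, $Status, $Time, $Message

; Walk the flat array in blocks of five captures, one block per <error> element.
For $i = 0 To UBound($aHit) - 5 Step 5
    $Location = $aHit[$i]
    $Device = $aHit[$i + 1]
    $Status = $aHit[$i + 2]
    $Time = $aHit[$i + 3]
    ; The element text carries the surrounding line breaks, so trim both ends.
    $Message = StringStripWS($aHit[$i + 4], $STR_STRIPLEADING + $STR_STRIPTRAILING)

    ConsoleWrite($Time & " | " & $Location & " | " & $Status & " | " & $Device & " | " & $Message & @CRLF)
Next
```

A "no-error-situation" response contains no `<error>` element, so `StringRegExp` sets `@error` and the script exits before the loop.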
rudi Posted April 29, 2019 Author

@FrancescoDiMuro Wow! Standing ovation! I tried to work out a regex myself before, but failed desperately!

    $RegEx='(?s)<fub uri="[^"]+" name="([^"]+)">.*?<error msg="([^"]+)" priority="([^"]+)" time="([^"]+)">([^<]+)<\/error>'

Tx! Rudi.
mikell Posted April 29, 2019

For the fun...

    #Include <Array.au3>

    $txt = '<eta xmlns="http://www.eta.co.at/rest/v1" version="1.0">' & @crlf & _
        ' <errors uri="/user/errors/121/10441">' & @crlf & _
        ' <fub uri="/121/10441" name="Raum 1.1">' & @crlf & _
        ' <error msg="Sicherung VE-C 0 (angeschlossen an GM-C 1) - F1 Zuleitung" priority="Fehler" time="2019-04-18 12:36:02">' & @crlf & _
        ' Sicherung am Ventilcontroller 0 defekt oder keine 230VAC-Spannungsversorgung' & @crlf & _
        ' </error>' & @crlf & _
        ' </fub>' & @crlf & _
        ' </errors>' & @crlf & _
        '</eta>'
    ; Msgbox(0,"", $txt)

    $res = StringRegExp($txt, '(?|(?:name|msg|priority|time)="([^"]*)|(?m)^\h*([^<]+?)$)', 3)
    _ArrayDisplay($res)
FrancescoDiMuro Posted April 29, 2019

@mikell You lovely cat!
mikell Posted April 29, 2019

[Embedded regex101 explanation of the pattern; not preserved.]
rudi Posted April 29, 2019 Author

Impressive. Even with this explanation I don't get all of it:
mikell Posted April 29, 2019

The regex101 explanations are very clear. Which part don't you get?
rudi Posted April 30, 2019 Author (edited)

The alternatives are familiar to me, as well as [non-]capturing groups. But what exactly does a "Branch Reset Group" do?

And the 2nd alternative, hm...

(?m) = ^ and $ match beginning and end of line (not of the full string) --> so that doesn't apply to the 1st alternative? :hm:
\h* = "drop leading whitespace, if any" (not in a capturing group) ???
[^<]+? = match anything but "<" (lazy). The explanation says it matches a literal "<"; where is that one specified? (I don't see a literal "<".)

And I don't get how the result array is populated.

<edit> I'm pretty close to understanding your regex with the help of this explanation: https://www.regular-expressions.info/branchreset.html
How the array is populated still isn't clear to me at all...

Edited April 30, 2019 by rudi
mikell Posted April 30, 2019 (edited)

For ?| the helpfile says: "Non-capturing group with reset. Resets capturing group numbers in each top-level alternative it contains". Practically, this means that you won't get an unwanted blank element in the resulting array when the first part of the alternation fails to match. You can check this by replacing ?| with ?:

For the 2nd alternative: (?m) allows ^ and $ to "match beginning and end of line (not full string)". This feature is used in the 2nd alternative only, which is why I put it there, but it could just as well be placed at the beginning of the expression.

In plain language, (?m)^\h*([^<]+?)$ means: between beginning and end of line (the anchors are important to force a check of the whole line), match 0 or more horizontal whitespace characters (not captured) and 1 or more "non-<" characters (captured). So if there is a < character in the line, the whole match fails.

Edited April 30, 2019 by mikell
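The `?|` versus `?:` check suggested above can be run side by side. This is a minimal sketch, assuming a shortened input and a pattern reduced to the `name` attribute. The `$sTxt`, `$aReset` and `$aPlain` names are hypothetical, and the bracketed console output is only a way to make empty elements visible.

```autoit
#include <Array.au3>

Local $sTxt = '<fub uri="/121/10441" name="Raum 1.1">' & @CRLF & _
        ' plain text line' & @CRLF & _
        '</fub>'

; Branch reset group: both alternatives write to capture group 1.
Local $aReset = StringRegExp($sTxt, '(?|name="([^"]*)|(?m)^\h*([^<]+?)$)', 3)

; Ordinary non-capturing group: the first alternative is group 1, the second is group 2.
Local $aPlain = StringRegExp($sTxt, '(?:name="([^"]*)|(?m)^\h*([^<]+?)$)', 3)

; Print every element in brackets so that empty ones show up as [].
ConsoleWrite("reset: [" & _ArrayToString($aReset, "] [") & "]" & @CRLF)
ConsoleWrite("plain: [" & _ArrayToString($aPlain, "] [") & "]" & @CRLF)
```

With `?:`, a match of the second alternative leaves group 1 unset, which should be where the blank element described above comes from. With `?|`, each match contributes a single element.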
rudi Posted May 3, 2019 Author

thanks!