
xml scraping


Solved by mikell

Tried what you posted, but got an error:

C:\tfh.au3 (5) : ==> The requested action with this object has failed.:
$oMembers= $oXML.selectNodes('//availability/members[@date=' & @YEAR & "-" & @MON & "-" & StringRegExpReplace(@MDAY,"(0)(\d+)","\2") & ']')
$oMembers= $oXML.selectNodes('//availability/members[@date=' & @YEAR & "-" & @MON & "-" & StringRegExpReplace(@MDAY,"(0)(\d+)","\2") & ']')^ ERROR


? (still regex way)

#Include <Array.au3>

$sXML = '<availability>' & @crlf & _
    '<members date="2014-06-6" count="2" day="2" night="1" OOA="0" na="0" />' & @crlf & _
    '<members date="2014-06-7" count="6" day="5" night="1" OOA="0" na="0" />' & @crlf & _
    '<members date="2014-06-8" count="8" day="4" night="1" OOA="0" na="0" />' & @crlf & _
    '<members date="2014-06-9" count="9" day="9" night="1" OOA="0" na="0" />' & @crlf & _
    '</availability>'

$sXML = StringRegExpReplace($sXML, '(?is).*<availability(.*?)</availability.*', '$1')
Msgbox(0,"", $sXML)
$days = StringRegExp($sXML, '(?m).*date="([^"]+).*day="([^"]+).*\R?', 3)  
_ArrayDisplay($days)

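$date = @year &"-"& @mon &"-"& Number(@mday)  ; today, day without a leading zero to match the sample data
; $days is a flat list (date, day, date, day, ...), so step over it in pairs
For $i = 0 to UBound($days)-2 Step 2
   If $days[$i] = $date Then Msgbox(0,"", $days[$i+1])
Next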

If you still want to use the IE objects, you could:

#include <IE.au3>
$sXML = '<availability>' & @crlf & _
    '<members date="2014-06-6" count="2" day="2" night="1" OOA="0" na="0" />' & @crlf & _
    '<members date="2014-06-7" count="6" day="5" night="1" OOA="0" na="0" />' & @crlf & _
    '<members date="2014-06-8" count="8" day="4" night="1" OOA="0" na="0" />' & @crlf & _
    '<members date="2014-06-9" count="9" day="9" night="1" OOA="0" na="0" />' & @crlf & _
    '</availability>'
$oIE = _IECreate("about:blank", 0, 0)
_IEBodyWriteHTML($oIE, $sXML)
Local $Availability = _IETagNameGetCollection($oIE, 'availability', 0)
if IsObj($Availability) Then
    Local $members = _IETagNameGetCollection($Availability, "members")
    if IsObj($members) Then
        ConsoleWrite("----------------" & @CRLF)
        for $member in $members
            ConsoleWrite(">> date: " & $member.getAttribute("date") & @CRLF)
            ConsoleWrite(">> count: " & $member.getAttribute("count") & @CRLF)
            ConsoleWrite(">> day: " & $member.getAttribute("day") & @CRLF)
            ConsoleWrite(">> night: " & $member.getAttribute("night") & @CRLF)
            ConsoleWrite(">> OOA: " & $member.getAttribute("OOA") & @CRLF)
            ConsoleWrite(">> na: " & $member.getAttribute("na") & @CRLF)
            ConsoleWrite("----------------" & @CRLF)
        Next
    EndIf
EndIf
_IEQuit($oIE)

or to get only today's "day":

#include <IE.au3>
$sXML = '<availability>' & @crlf & _
    '<members date="2014-06-6" count="2" day="2" night="1" OOA="0" na="0" />' & @crlf & _
    '<members date="2014-06-7" count="6" day="5" night="1" OOA="0" na="0" />' & @crlf & _
    '<members date="2014-06-8" count="8" day="4" night="1" OOA="0" na="0" />' & @crlf & _
    '<members date="2014-06-9" count="9" day="9" night="1" OOA="0" na="0" />' & @crlf & _
    '</availability>'
$oIE = _IECreate("about:blank", 0, 0)
_IEBodyWriteHTML($oIE, $sXML)
Local $Availability = _IETagNameGetCollection($oIE, 'availability', 0)
if IsObj($Availability) Then
    Local $members = _IETagNameGetCollection($Availability, "members")
    if IsObj($members) Then
        ConsoleWrite("-------------------" & @CRLF)
        for $member in $members
            $splDate = StringSplit($member.getAttribute("date"), "-")
            if ($splDate[0] >= 3) Then
                if (Number($splDate[1]) = @YEAR AND Number($splDate[2]) = @MON AND Number($splDate[3]) = @MDAY) Then
                    ConsoleWrite('>> today''s "day" is: ' & $member.getAttribute("day") & @CRLF)
                    ExitLoop
                EndIf
            EndIf
        Next
    EndIf
EndIf
_IEQuit($oIE)

However, I would advise you to use xmldom like jdelaney suggested.

Edited by dragan
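
For reference, the XML DOM route recommended above could look roughly like the sketch below. It assumes the `Microsoft.XMLDOM` COM object is available on the machine. `_DayForDate` is a hypothetical helper name made up for this example, not part of any UDF. Note that the date literal is quoted inside the predicate, which the failing `selectNodes` call earlier in the thread did not do. That may or may not have been the cause of that error.

```autoit
; Sketch only: look up the "day" attribute for one date via the MSXML DOM.
; _DayForDate is a hypothetical helper, not a library function.
Func _DayForDate($sXML, $sDate)
    Local $oXML = ObjCreate("Microsoft.XMLDOM")
    If Not IsObj($oXML) Then Return SetError(1, 0, "")
    $oXML.async = False
    ; loadXML parses a string and returns False if it is not well-formed
    If Not $oXML.loadXML($sXML) Then Return SetError(2, 0, "")
    ; The date value is wrapped in double quotes inside the predicate
    Local $oNode = $oXML.selectSingleNode('//availability/members[@date="' & $sDate & '"]')
    If Not IsObj($oNode) Then Return SetError(3, 0, "")
    Return $oNode.getAttribute("day")
EndFunc

; Sample data copied from the posts above
Local $sSample = '<availability>' & _
    '<members date="2014-06-6" count="2" day="2" night="1" OOA="0" na="0" />' & _
    '<members date="2014-06-7" count="6" day="5" night="1" OOA="0" na="0" />' & _
    '</availability>'
ConsoleWrite(_DayForDate($sSample, "2014-06-7") & @CRLF)
```

The date string has to match the feed exactly. The sample data uses an unpadded day (`2014-06-6`), so today's date would need to be built the same way, for example with `Number(@MDAY)`.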

All the examples so far are great, but they all build the XML inside the script. I need to scrape it off the net, not read a local file.

I contacted the site admin, and to make it easier he has added a daytag attribute, so instead of looking for a date I can maybe make the script look for "Today".

Thanks, everyone, so far.

<availability>

<members date="2014-06-07" daytag="Today" count="10" day="7" night="5" OOA="0" na="2"/>

<members date="2014-06-08" daytag="Tomorrow" count="9" day="7" night="6" OOA="0" na="1"/>

<members date="2014-06-09" daytag="Mon 9 Jun" count="7" day="6" night="7" OOA="0" na="0"/>

Edited by shaggy89
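
A minimal sketch of the "look for Today" idea, staying with the regex approach used earlier in the thread. `_DayForTag` and `_FetchXml` are hypothetical helper names, and the URL in the usage comment is a placeholder. The pattern assumes that `daytag` comes before `day` inside each `<members>` element, as it does in the sample above. If the site ever reorders the attributes, this would stop matching.

```autoit
; Hypothetical helper: return the "day" value of the <members> row with the given daytag.
; Assumes daytag appears before day within the element.
Func _DayForTag($sXML, $sTag)
    ; \Q...\E makes the tag text match literally
    Local $sPattern = '(?is)<members\b[^>]*\bdaytag="\Q' & $sTag & '\E"[^>]*\bday="([^"]*)"'
    Local $aHit = StringRegExp($sXML, $sPattern, 1)
    If @error Then Return SetError(1, 0, "") ; no matching row
    Return $aHit[0]
EndFunc

; Hypothetical helper: download the feed and decode it as UTF-8 text.
Func _FetchXml($sUrl)
    Local $dData = InetRead($sUrl, 1) ; 1 = force a reload from the remote site
    If @error Then Return SetError(1, 0, "")
    Return BinaryToString($dData, 4) ; 4 = treat the bytes as UTF-8
EndFunc

; Usage (placeholder URL, not the real site):
; Local $sXML = _FetchXml("http://example.com/availability.xml")
; If Not @error Then ConsoleWrite(_DayForTag($sXML, "Today") & @CRLF)
```

Matching on `daytag` avoids having to build today's date in the same format as the feed, at the cost of depending on the admin keeping that attribute.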

Thanks to all who helped.

After having a cup of coffee I looked at @mikell's post:

$day = StringRegExpReplace($sXML, '(?is).*<availability.*?day="([^"]+).*</availability.*', '$1')  ; gets the first one (2)
Msgbox(0,"day", $day)

and used it. As the comment states, it "gets the first one", and in the end the top one was all I needed.

So now I have made it look like this:

local $Num = IniRead("settings.ini", "Number", "day", "")
local $Site = IniRead("settings.ini", "Site", "web", "")
$sXML = BinaryToString(InetRead($Site))
$day = StringRegExpReplace($sXML, '(?is).*<availability.*?day="([^"]+).*</availability.*', '$1')  ; gets the first one (2)

if Number($day) <= Number($Num) Then  ; compare as numbers, not as strings

Again, THANKS everyone.

This is a perfect example of why I use this forum.

shane
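
Two things could trip up the final script above, so here is a slightly more defensive sketch of the same idea. First, `StringRegExpReplace` returns the input unchanged when nothing matches, so a failed download or a changed page would leave the whole page text in `$day`. Second, as far as I know AutoIt compares two strings lexicographically with `<=`, so the values are converted with `Number()` here. `_FirstDay` and `_IsAtOrBelow` are hypothetical helper names invented for this example.

```autoit
; Hypothetical helper: first day="..." value inside <availability>, as a number.
; Sets @error if the pattern is not found.
Func _FirstDay($sXML)
    Local $aHit = StringRegExp($sXML, '(?is)<availability.*?\bday="([^"]+)"', 1)
    If @error Then Return SetError(1, 0, -1)
    Return Number($aHit[0])
EndFunc

; Hypothetical helper: True when the first day count is at or below the threshold.
Func _IsAtOrBelow($sXML, $sThreshold)
    Local $iDay = _FirstDay($sXML)
    If @error Then Return SetError(1, 0, False)
    Return $iDay <= Number($sThreshold)
EndFunc

; Sample row copied from the post above; in the real script $sXML would come
; from BinaryToString(InetRead($Site)) and the threshold from IniRead.
Local $sSample = '<availability><members date="2014-06-07" daytag="Today" ' & _
    'count="10" day="7" night="5" OOA="0" na="2"/></availability>'
ConsoleWrite(_IsAtOrBelow($sSample, "10") & @CRLF)
```

Using `StringRegExp` with flag 1 makes the "no match" case visible through `@error` instead of silently passing the whole page along.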
