How can I get the entire code of a website page?


Solved by Danp2

Greetings to all forum members! 

Can you please tell me how I can get the entire code of a site page?

Only part of it is extracted. Ultimately, I need to get the value from alert_temperature-DS-EntryPoint1-1, shown in the screenshot.

But so far I can't get the contents of the entire page in any way.

$oHTTP = ObjCreate("winhttp.winhttprequest.5.1")
$oHTTP.Open("GET","https://www.msn.com/en-us/weather/maps/temperature/in-Fastov,Kyiv")
$oHTTP.Send()
$HTTPSource = $oHTTP.Responsetext
ConsoleWrite($HTTPSource)

I would be very grateful for any help, options and tips.

MSN Weather - Google Chrome_20240115235223.jpg


The problem with using the WinHttp method is that it will only retrieve the base source of the webpage. Unless the site serves purely static source code, it won't give you the complete source code that you want, because most sites today build their content dynamically. There are tools that should be able to download the complete source into a folder for you to manipulate afterwards, or you could likely accomplish this using the Webdriver UDF.
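
A quick way to see whether a page falls into the dynamic category is to search the raw WinHttp response for the text you expect to find. The following is only a sketch: it reuses the URL and the marker string from the question, the variable names are made up for illustration, and the result depends on what the server returns at the time.

```autoit
; Sketch: test whether the raw (pre-JavaScript) response already contains the target string.
Local $sUrl = "https://www.msn.com/en-us/weather/maps/temperature/in-Fastov,Kyiv"
Local $sMarker = "alert_temperature-DS-EntryPoint1-1"

Local $oHTTP = ObjCreate("winhttp.winhttprequest.5.1")
If Not IsObj($oHTTP) Then
    ConsoleWrite("Could not create the WinHttp object" & @CRLF)
    Exit 1
EndIf

$oHTTP.Open("GET", $sUrl, False) ; False = synchronous request
$oHTTP.Send()

If StringInStr($oHTTP.ResponseText, $sMarker) Then
    ConsoleWrite("Marker found in the static source" & @CRLF)
Else
    ConsoleWrite("Marker missing: the content is probably added later by JavaScript" & @CRLF)
EndIf
```

If the marker is missing, no amount of tweaking the WinHttp request will help, and a real browser engine (for example via Webdriver) is needed to render the page.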


1 hour ago, ioa747 said:

C:\Program Files (x86)\AutoIt3\Examples\COM\getHTMLsource.au3

Thank you. Tried this option - it also doesn't extract all the code.

44 minutes ago, Danp2 said:

There are tools that should be able to download the complete source into a folder for you to manipulate afterwards

If you can, please tell me such tools.

46 minutes ago, Danp2 said:

or you could likely accomplish this using the Webdriver UDF.

Yes, I've already looked at this UDF. It is quite difficult to use. But the worst thing is that it is tied to a specific browser, and you also need to install an additional driver for it.

I would really like to find an option that is at least a little simpler.


Google is your friend for this. You may also want to review this thread to see if it works for your situation.

Webdriver is more complicated, but it should be able to give you the results you need. I disagree about it being browser specific. Your code should work with any supported browser as long as your initial setup is correct. You do need an intermediate driver for classic webdriver. The newer BiDi protocol works without this requirement.


10 minutes ago, Danp2 said:

 You may also want to review this thread to see if it works for your situation.

I've probably already re-read every topic related to my question, including that entire thread, and more than once.

So you are suggesting I try curl, as described in this post? But there it is used for authorization, not for retrieving page data that is generated by JavaScript. At least, that is how I understand it.

30 minutes ago, Danp2 said:

Webdriver is more complicated, but it should be able to give you the results you need.

I would like to find at least one example of using the Webdriver UDF that reads JavaScript-generated data from a site.


  • Solution

It isn't complicated using WD_Demo. Here's a modified version of the UserTesting function that demonstrates the minimal code required --

Func UserTesting()
    ; if necessary, you can modify the following function content by replacing, adding any additional function required for testing within this function
    Local $vResult
    $vResult = _WD_Navigate($sSession, 'https://www.msn.com/en-us/weather/maps/temperature/in-Fastov,Kyiv')
    If @error Then Return SetError(@error, @extended, $vResult)

    $vResult = _WD_LoadWait($sSession, 10, Default, Default, $_WD_READYSTATE_Interactive)
    If @error Then Return SetError(@error, @extended, $vResult)

    ; Pause to allow page to fully load
    ; You could use _WD_WaitElement here instead
    Sleep(5000)

    ; Declare the result locally and pass it back on failure
    Local $sHTML = _WD_GetSource($sSession)
    If @error Then Return SetError(@error, @extended, $sHTML)

    FileWrite(@ScriptDir & "\source.html", $sHTML)

EndFunc   ;==>UserTesting
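
The comment about _WD_WaitElement in the function above could be followed up roughly like this. It is a minimal sketch, not part of WD_Demo: GetTemperatureText is a hypothetical helper name, the XPath assumes that the string quoted in the question is the id attribute of an element, and the 15 second timeout is an arbitrary choice.

```autoit
; Sketch only: assumes wd_core.au3 is included and $sSession is a valid session, as in WD_Demo.
; Assumption: "alert_temperature-DS-EntryPoint1-1" is the id of the element holding the value.
Func GetTemperatureText($sSession)
    Local Const $sXPath = "//*[@id='alert_temperature-DS-EntryPoint1-1']"

    ; Poll until the element exists (no initial delay, give up after 15000 ms)
    Local $sElement = _WD_WaitElement($sSession, $_WD_LOCATOR_ByXPath, $sXPath, 0, 15000)
    If @error Then Return SetError(@error, @extended, "")

    ; Read the rendered text of the element
    Local $sText = _WD_ElementAction($sSession, $sElement, 'text')
    If @error Then Return SetError(@error, @extended, "")

    Return $sText
EndFunc   ;==>GetTemperatureText
```

Waiting for the specific element replaces the fixed Sleep(5000), so the script continues as soon as the data is present and fails with an error if it never shows up.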

 


Danp2, thank you for the example! I will start with it and try to solve my problem using Webdriver.

But if anyone else has a simpler solution to my problem, please share your ideas and solutions. Thank you in advance!


Danp2, I want to thank you for your advice on using the UDF Webdriver and for your code example! Based on it, I created a script that retrieves the data I need over a certain period! It turned out that this UDF is not so complicated and scary. Special thanks to you for developing it! 

But at the moment there is one significant problem with the script. After the computer is put into hibernation and then resumed, the script stops working and no longer receives data from the site. How can I get the Webdriver working again after hibernation?


These are the errors that constantly appear in the editor console window after exiting hibernation mode:

__WD_Post ==> Send / Recv error [6] : HTTP status = 0 ResponseText=WinHTTP request timed out before Webdriver

 

 

 


Yes, the webdriver continues to work all the time. But there is a difference in the contents of its window: 

Here it is before going into hibernation:

image.png.c35e52d5f53012f465b9d7f0f9b93b7f.png

But after recovery from hibernation:

image.png.f7123770d180b8ca77098e621167613c.png


Danp2, hello! Are there any solutions or ideas on how to solve this problem? Maybe something needs to be checked again after hibernation and restarted through the WebDriver functions? Please tell me.
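
The restart idea raised here could be sketched as follows. Nothing in this thread confirms that it fixes the hibernation problem; it only shows how the existing session could be torn down and recreated with functions from the Webdriver UDF. RestartSession is a hypothetical helper, and $sCapabilities is assumed to hold the same capabilities string that was used to create the original session.

```autoit
; Hypothetical helper, untested against the hibernation case described above.
; Assumes wd_core.au3 is included and the driver options (_WD_Option) were
; already configured once, as in WD_Demo.
Func RestartSession(ByRef $sSession, $sCapabilities)
    ; Best-effort cleanup; errors are ignored because the old session may already be dead
    _WD_DeleteSession($sSession)
    _WD_Shutdown()

    ; Relaunch the driver executable with the previously configured options
    _WD_Startup()
    If @error Then Return SetError(@error, @extended, False)

    ; Open a fresh browser session and hand its id back to the caller
    $sSession = _WD_CreateSession($sCapabilities)
    If @error Then Return SetError(@error, @extended, False)

    Return True
EndFunc   ;==>RestartSession
```

A script could call this when a request such as _WD_GetSource fails with a send/receive error, then navigate to the page again and retry.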


Perhaps my experience report will be helpful:

Antivirus software: Bitdefender. During a software update or an automatic scan, the script loses its connection to the webdriver, although nothing shows up in the Bitdefender log.
Unfortunately I see the same with other things: the Outlook object is no longer found, an established IP connection can no longer connect, or the script is simply stopped.

Hibernation - I have not tried this at all as some of my scripts have to run 24/7.

Logged into Windows but sitting on the lock screen: no problems.

When trying to restart the same WebDriver with the same data, the connection to the script is lost or errors keep occurring.

And now comes the funny part: if I just run the .au3 file without compiling it, none of the above points apply in my tests, except for the last one.

Maybe you are also using antivirus software that does something in the background after the PC wakes up, which only harms your productivity.

You can ignore the device messages in the webdriver console window if you are not using USB devices.
Whether they are displayed depends on how you launched the webdriver.

These messages can also be switched off somehow.

 

 

Edited by MojoeB

Thanks for your participation, MojoeB! I checked all the points you mentioned. I compiled the script and ran it. I added the entire folder where the program is located and launched to the antivirus scan exclusions. Nothing helps: after hibernation the program still stops working. It does not receive any data from the web driver, although the driver remains in the list of running processes.


Could you please check whether AutoIt keeps working when the computer goes to sleep? Unfortunately I can't test that at work.
If it does, could you try this WMI code and report back?

#include <MsgBoxConstants.au3>

; Connect to the WMI service
Local $oWMIService = ObjGet("winmgmts:\\.\root\cimv2")

; Check that the object was created
If IsObj($oWMIService) Then
    ; Win32_PowerManagementEvent is an event class, so subscribe to it with a
    ; notification query; ExecQuery cannot return past events
    Local $oEventSource = $oWMIService.ExecNotificationQuery("SELECT * FROM Win32_PowerManagementEvent")

    ; Check that the subscription succeeded
    If IsObj($oEventSource) Then
        Local $oEvent

        While True
            ; Block until the next power management event arrives
            $oEvent = $oEventSource.NextEvent()
            If Not IsObj($oEvent) Then
                MsgBox($MB_SYSTEMMODAL, "Error", "Failed to receive a power management event.")
                ExitLoop
            EndIf

            ; EventType 7 means the computer resumed from suspend
            If $oEvent.EventType = 7 Then
                MsgBox($MB_SYSTEMMODAL, "Status", "The computer has just woken up from sleep or hibernation.")
                ExitLoop
            EndIf
        WEnd
    Else
        MsgBox($MB_SYSTEMMODAL, "Error", "Failed to run the WMI query.")
    EndIf
Else
    MsgBox($MB_SYSTEMMODAL, "Error", "Failed to create the WMI object.")
EndIf

 

Edited by MojoeB
