This is more of an HTML question


queensoft

I'm trying to download some webpages. Example: https://www.imdb.com/title/tt6238614/episodes/?year=2017

Using regular InetRead:

$sim01 = BinaryToString(InetRead('https://www.imdb.com/title/tt6238614/episodes/?year=2017', 1), 4)    ; 1 = force reload, 4 = decode as UTF-8

Or custom function:

Func _HTTP_Download($URL)
   Local $oHTTP = ObjCreate("winhttp.winhttprequest.5.1")   ; WinHTTP COM object
   $oHTTP.Open("GET", $URL)
   $oHTTP.SetRequestHeader("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/117.0")
   $oHTTP.SetRequestHeader("Accept-Language", "en")
   $oHTTP.SetRequestHeader("Content-type", "text/html; charset=utf-8")
   $oHTTP.Send()
   Return $oHTTP.ResponseText   ; response body as text
EndFunc

The problem is that it doesn't load the complete page - only the first 50 episodes.

Just like in the browser, it doesn't show all episodes at once: you have to manually click the "50 more" or "All" button below the list.

[Attached image: Image1.jpg]

If I use Inspect, I get this:

[Attached image: Image2.jpg]

Is there a way to load the entire page at once, using pure AutoIt?

I know I can do it using Chrome + WebDriver, but it's much slower and less user-friendly.

Thank you.


9 hours ago, queensoft said:

Is there a way to load the entire page at once, using pure AutoIt?

I know I can do it using Chrome + WebDriver, but it's much slower and less user-friendly.

Thank you.

No. If the data is loaded dynamically, it's not actually in the DOM yet. You need the IE UDF, WebDriver, or something similar to make the page request that data from the server and put it in the DOM.

Edited by Andreik

When the words fail... music speaks.


Yes, I know the data is not present right from the beginning.

I was wondering if there's a way to load it, for instance by crafting a custom link or action based on the HTML in the page source.

There are ways to do this; I'm pretty sure this one is not using Chrome + WebDriver: https://rapidapi.com/apidojo/api/imdb8/

Endpoint: title/get-seasons

Result:

[Attached image: Image1.jpg]

[
  {
    "episodes": [
      {
        "episode": 1,
        "id": "/title/tt6242172/",
        "season": 1,
        "title": "Episode #1.1",
        "titleType": "tvEpisode",
        "year": 2016
      },
      {
        "episode": 2,
        "id": "/title/tt6242174/",
        "season": 1,
        "title": "Episode #1.2",
        "titleType": "tvEpisode",
        "year": 2016
      },
      {
        "episode": 3,
        "id": "/title/tt13694412/",
        "season": 1,
        "title": "Episode #1.3",
        "titleType": "tvEpisode",
        "year": 2016
      },
      {
        "episode": 4,
        "id": "/title/tt13704148/",
        "season": 1,
        "title": "Episode #1.4",
        "titleType": "tvEpisode",
        "year": 2016
      },
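
However the response is fetched, a result shaped like the one above can be picked apart in pure AutoIt, with no browser involved. The following is a minimal sketch, assuming each episode object lists "id" before "title" as in the excerpt. The sample string and variable names are illustrative only, and a proper JSON UDF would likely be more robust than a regular expression.

```autoit
; Sample data shaped like the excerpt above, trimmed to two episodes (illustrative).
Local $sJson = '[{"episodes":[' & _
        '{"episode":1,"id":"/title/tt6242172/","season":1,"title":"Episode #1.1","titleType":"tvEpisode","year":2016},' & _
        '{"episode":2,"id":"/title/tt6242174/","season":1,"title":"Episode #1.2","titleType":"tvEpisode","year":2016}' & _
        ']}]'

; (?s) lets "." span line breaks, so pretty-printed JSON works too.
; Group 1 = title id, group 2 = episode title. "title" with its closing quote
; does not match the "titleType" key.
Local $sPattern = '(?s)"id"\s*:\s*"/title/(tt\d+)/".*?"title"\s*:\s*"([^"]*)"'

; Flag 3 returns all matches; the capture groups of every match are flattened
; into one array: [id1, title1, id2, title2, ...].
Local $aHits = StringRegExp($sJson, $sPattern, 3)
If @error Then
    ConsoleWrite("No episodes found" & @CRLF)
    Exit 1
EndIf

; Walk the flattened array two entries at a time.
For $i = 0 To UBound($aHits) - 1 Step 2
    ConsoleWrite($aHits[$i] & " | " & $aHits[$i + 1] & @CRLF)
Next
```

The regular expression depends on the key order, so it would need adjusting if the service returned the fields in a different sequence.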

 


4 minutes ago, queensoft said:

I was wondering if there's a way to load it, for instance by crafting a custom link or action based on the HTML in the page source.

I suppose there is an API that lets you get all the information, but dealing with every particular API is beyond the scope of the AutoIt forum. I think there is an IMDb UDF; just search for it.
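
For what it's worth, calling a JSON API of that kind from pure AutoIt would probably look like the sketch below. Everything specific to the service is an assumption here: the host name, the X-RapidAPI-Key / X-RapidAPI-Host header names, and the tconst parameter should all be checked against the API's own page, and the key is a placeholder. _HTTP_GetWithHeaders is a hypothetical helper, not part of any UDF.

```autoit
; Hypothetical helper: GET a URL with extra request headers.
; Returns the response body, or "" with @error set on failure.
Func _HTTP_GetWithHeaders($sUrl, $aHeaders)
    Local $oHTTP = ObjCreate("WinHttp.WinHttpRequest.5.1")
    If Not IsObj($oHTTP) Then Return SetError(1, 0, "")
    $oHTTP.Open("GET", $sUrl, False) ; False = synchronous request
    For $i = 0 To UBound($aHeaders) - 1
        $oHTTP.SetRequestHeader($aHeaders[$i][0], $aHeaders[$i][1])
    Next
    $oHTTP.Send()
    ; Anything other than HTTP 200 is reported via @error, status code in @extended.
    If $oHTTP.Status <> 200 Then Return SetError(2, $oHTTP.Status, "")
    Return $oHTTP.ResponseText
EndFunc

; ASSUMED values: host, header names and query parameter are not confirmed
; in this thread; replace them with what the API documentation specifies.
Local $aHeaders[2][2] = [["X-RapidAPI-Key", "YOUR_KEY_HERE"], _
        ["X-RapidAPI-Host", "imdb8.p.rapidapi.com"]]
Local $sUrl = "https://imdb8.p.rapidapi.com/title/get-seasons?tconst=tt6238614"

Local $sJson = _HTTP_GetWithHeaders($sUrl, $aHeaders)
If @error Then
    ConsoleWrite("Request failed: @error=" & @error & ", HTTP status=" & @extended & @CRLF)
Else
    ConsoleWrite($sJson & @CRLF)
EndIf
```

This reuses the same WinHttpRequest object as the _HTTP_Download function in the first post and only adds the extra headers and a status check.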
