Posted

Hello,

I'm creating a program to help me analyze stocks. So a big part of my tool is web scraping.

As of yesterday, the _INetGetSource command seemed to stop working for the page that tells me a stock's info which is on yahoo finance (I'll ref the link below). Here is the simplified version of the code.

#include <Inet.au3>

ConsoleWrite(_INetGetSource('https://finance.yahoo.com/quote/AAPL'))

It's strange because before yesterday, it was pulling the code from those pages correctly. The _INetGetSource will work for most other yahoo pages, even the finance home page (finance.yahoo.com) but not the page that shows me a specific stock's info.

Does anyone know why it stopped giving me the source code for those pages?
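One way to narrow this down is to look at the raw bytes the server returns rather than the decoded string. The sketch below is only a diagnostic under the assumption that the response might be compressed or otherwise not plain text. The thread does not confirm that this is the cause. It uses the built-in InetRead and checks for the gzip magic bytes (1F 8B).

```autoit
#include <InetConstants.au3>
#include <StringConstants.au3>

; Diagnostic sketch: inspect the raw response instead of the decoded string.
Local $sUrl = 'https://finance.yahoo.com/quote/AAPL'
Local $dBody = InetRead($sUrl, $INET_FORCERELOAD) ; bypass the local cache
If @error Then
    ConsoleWrite('InetRead failed, @error = ' & @error & @CRLF)
    Exit 1
EndIf

ConsoleWrite('Bytes received: ' & BinaryLen($dBody) & @CRLF)
ConsoleWrite('First 16 bytes: ' & String(BinaryMid($dBody, 1, 16)) & @CRLF)

; A gzip stream starts with the bytes 1F 8B. Whether this page is actually
; served compressed is an assumption to be checked, not a known fact.
If String(BinaryMid($dBody, 1, 2)) = '0x1F8B' Then
    ConsoleWrite('Body looks gzip-compressed, so it will not read as text.' & @CRLF)
Else
    ; Show only the beginning of the body, decoded as UTF-8.
    ConsoleWrite(StringLeft(BinaryToString($dBody, $SB_UTF8), 500) & @CRLF)
EndIf
```

If the first bytes are readable HTML, the problem is more likely in what the page contains than in how it is transferred.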

Posted (edited)

Hi @BlazerV60,

What does your output look like? What exactly are you looking for?
This site uses iframes in several sections, which might be the problem.

6 hours ago, BlazerV60 said:

So a big part of my tool is web scraping.

Which tool(s) do you use for the web scraping?

Are you only interested in getting the whole page content (source) ? Then you parse the needed data?
I guess it could be better to get only the expected data directly (e.g. via WebDriver (the au3WebDriver project) or UIA).

6 hours ago, BlazerV60 said:

Does anyone know why it stopped giving me the source code for those pages?

How? Only the maintainer(s)/developer(s) of the page know if there were changes.

I am interested in helping you, so please provide more context 🤠.

Best regards
Sven

Edited by SOLVE-SMART

Stay innovative!

🌐 Au3Forums

🎲 AutoIt (en) Cheat Sheet

📊 AutoIt limits/defaults

💎 Code Katas: [...] (coming soon)

🎭 Collection of GitHub users with AutoIt projects

🐞 False-Positives

🔮 Me on GitHub

💬 Opinion about new forum sub category

📑 UDF wiki list

✂ VSCode-AutoItSnippets

📑 WebDriver FAQs

👨‍🏫 WebDriver Tutorial (coming soon)

Posted

The output comes out as weird characters like a symbol as shown in the attached image.

I'm trying to get the whole page content (source) and then parse the data I need.

I'm just wondering if you know a way to get the source code for that page again. I could use the _IECreate with _IEDocReadHTML method, but that makes my tool run slower because it has to open a hidden browser, take the source code from it, and then close that browser. If there's no solution with _INetGetSource, though, I'll do it.

image.png
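For reference, the _IECreate fallback described above could look roughly like the sketch below. It uses the standard IE.au3 UDF; the helper name _GetSourceViaIE is hypothetical, and whether Internet Explorer still renders this page correctly is an assumption.

```autoit
#include <IE.au3>

; Hypothetical helper: load a page in a hidden IE instance and return its HTML.
Func _GetSourceViaIE($sUrl)
    ; 0 = do not attach to an existing window, 0 = invisible.
    ; By default _IECreate waits until the page has finished loading.
    Local $oIE = _IECreate($sUrl, 0, 0)
    If @error Then Return SetError(1, 0, '')

    Local $sHtml = _IEDocReadHTML($oIE)
    Local $iError = @error

    _IEQuit($oIE) ; always close the hidden browser again
    If $iError Then Return SetError(2, 0, '')
    Return $sHtml
EndFunc

Local $sSource = _GetSourceViaIE('https://finance.yahoo.com/quote/AAPL')
If @error Then
    ConsoleWrite('Could not read the page, @error = ' & @error & @CRLF)
Else
    ConsoleWrite(StringLen($sSource) & ' characters of HTML received' & @CRLF)
EndIf
```

Creating and closing a browser per page is where the extra time goes. Reusing one instance with _IENavigate for all symbols would likely reduce that overhead.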

Posted

There are several questions not answered so far @BlazerV60. So I have to answer to this ...

7 hours ago, BlazerV60 said:

I'm just wondering if you know a way to get the source code for that page again.

... simply with yes, use the WebDriver or try to use UIA. Both can be found several times here at the forum by the search box.

7 hours ago, BlazerV60 said:

I'd do it the _IECreate /w _IEDocReadHTML method but that makes my tool run slower because it technically has to open a hidden browser and then take the source code from that and then close that hidden browser [...]

This would also be the case with WebDriver or UIA. But in headless mode it's not that bad (it's quick enough, I would say).
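A minimal headless sketch with the au3WebDriver UDF might look like the following. It assumes that wd_core.au3 is on the include path, that chromedriver.exe sits next to the script, and that port 9515 is free. The capabilities string is a hand-written example and may need adjusting for your Chrome and driver versions.

```autoit
#include "wd_core.au3" ; from the au3WebDriver project

; Assumptions: chromedriver.exe is next to the script and may use port 9515.
_WD_Option('Driver', 'chromedriver.exe')
_WD_Option('Port', 9515)
_WD_Option('DriverParams', '--port=9515')

; Hand-written capabilities asking Chrome to start without a visible window.
Local $sCapabilities = '{"capabilities": {"alwaysMatch": {"goog:chromeOptions": {"args": ["--headless"]}}}}'

_WD_Startup()
Local $sSession = _WD_CreateSession($sCapabilities)
If @error Then
    ConsoleWrite('Could not create a WebDriver session' & @CRLF)
    _WD_Shutdown()
    Exit 1
EndIf

_WD_Navigate($sSession, 'https://finance.yahoo.com/quote/AAPL')
Local $sHtml = _WD_GetSource($sSession) ; rendered page source
ConsoleWrite(StringLen($sHtml) & ' characters of page source' & @CRLF)

; Clean up the session and stop the driver process.
_WD_DeleteSession($sSession)
_WD_Shutdown()
```

Keeping one session open and calling _WD_Navigate once per symbol avoids paying the browser start-up cost for every page.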

Best regards
Sven


Posted
6 hours ago, SOLVE-SMART said:

There are several questions not answered so far @BlazerV60. So I have to answer to this ...

... simply with yes, use the WebDriver or try to use UIA. Both can be found several times here at the forum by the search box.

This would also be the case for using WebDriver or UIA too. But in headless mode it's not that bad (it's quick enough I would say).

Best regards
Sven

Yeah, but my tool usually looks at over 100 stocks on any given day, so 100x the extra lag from _IECreate does make things a little slower. Still, it looks like it's the only way for me to go about this.

So I guess the reason _INetGetSource stopped working on the page I referenced is that the page implemented iframes?

Posted

That's what I also thought at first glance at the DOM structure, @argumentum.

1 hour ago, BlazerV60 said:

Yeah but my tool usually looks at over 100 stocks on any given day so 100x the extra lag time on using _IECreate does make things a little slower but it looks like it's the only way for me to go about this.

I understand this, but my guess is that if you only scrape your target information instead of fetching the whole page, it shouldn't be very slow.
You can also run multiple instances of chromedriver to do the scraping in "parallel".

Once again, if you could specify which data you need from which page, we could possibly make other/better suggestions.
Besides that, give the au3WebDriver project a chance. For a quick start I refer to this post.

Best regards
Sven


Posted

Sure, I'm specifically only trying to grab the share price from the page, so right now it's showing around $166.


I'll look into the au3webdriver.
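Once the page source is available by any of the methods discussed, pulling out a single value such as the share price is a small regular-expression job. The sketch below runs against a made-up HTML fragment. The data-field attribute and its value are assumptions for illustration only, so the pattern has to be adapted to whatever the real page source contains.

```autoit
#include <StringConstants.au3>

; Hypothetical fragment standing in for the downloaded page source.
Local $sHtml = '<span data-field="regularMarketPrice">166.02</span>'

; Capture the number that follows the (assumed) marker attribute.
Local $sPattern = 'data-field="regularMarketPrice"[^>]*>\s*([0-9][0-9,]*(?:\.[0-9]+)?)\s*<'
Local $aMatch = StringRegExp($sHtml, $sPattern, $STR_REGEXPARRAYMATCH)
If @error Then
    ConsoleWrite('Price not found, the markup probably differs' & @CRLF)
Else
    ; Drop thousands separators before converting to a number.
    Local $fPrice = Number(StringReplace($aMatch[0], ',', ''))
    ConsoleWrite('Price: ' & $fPrice & @CRLF)
EndIf
```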

Posted (edited)

As a completely different approach:
This article could be helpful, though I can't say for sure. It covers using an API (Yahoo Finance) to get the information you want.
https://algotrading101.com/learn/yahoo-finance-api-guide/

💡 I just searched for "yahoo finance api" on Google and found several API ideas.
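Calling an HTTP API from AutoIt can be sketched with the WinHttp COM object that ships with Windows. The endpoint below is a placeholder and not a real quote service, and the helper names _HttpGet and _OnComError are hypothetical. Substitute the URL and any required key from whichever API you choose.

```autoit
; Catch COM errors (e.g. no network) instead of letting them abort the script.
Global $g_oComError = ObjEvent('AutoIt.Error', '_OnComError')

Func _OnComError($oError)
    ConsoleWrite('COM error: ' & $oError.description & @CRLF)
EndFunc

; Hypothetical helper: synchronous GET that returns the response body as text.
Func _HttpGet($sUrl)
    Local $oHttp = ObjCreate('WinHttp.WinHttpRequest.5.1')
    If Not IsObj($oHttp) Then Return SetError(1, 0, '')

    $oHttp.Open('GET', $sUrl, False) ; False = synchronous request
    $oHttp.SetRequestHeader('Accept', 'application/json')
    $oHttp.Send()
    If @error Then Return SetError(2, 0, '')

    ; On a non-200 answer, report the HTTP status in @extended.
    If $oHttp.Status <> 200 Then Return SetError(3, $oHttp.Status, '')
    Return $oHttp.ResponseText
EndFunc

; Placeholder URL, NOT a real quote API.
Local $sJson = _HttpGet('https://example.com/api/quote?symbol=AAPL')
If @error Then
    ConsoleWrite('Request failed, @error = ' & @error & ', @extended = ' & @extended & @CRLF)
Else
    ConsoleWrite($sJson & @CRLF)
EndIf
```

An API that returns only the price as JSON avoids downloading and parsing a full page per symbol, which matters when checking 100+ stocks.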

Best regards
Sven

Edited by SOLVE-SMART

