Jump to content

Webdriver - how to switch to another tab session and do web scraping


Recommended Posts

Hi Web Driver AutoIT experts,

 

I would like to ask your kind help if we have something similar to python's script of switching from main browser to another tab browser and apply the web scraping on that session.

I tried below but seems like I am missing something.

local $ohandle = _WD_Window($sSession,"handles")

Python Script that I would like to have same functionality with AutoIT.

browser.switch_to.window(browser.window_handles[1])

 

Appreciate for our inputs on this request.

 

Thank you

Link to comment
Share on other sites

Would doing something like this allow me to have a script scrape a site concurrently with multiple tabs? So if I have one tab hit a page that takes 30 seconds to scrape, I can hit 10 different pages in one function and it takes the same 30 seconds (taking into consideration PC load). Just making sure this isn't a qued functionality. 

Link to comment
Share on other sites

I don't know if I should start a new thread for this or if it's along the same lines so I apologize in advance if I'm hijacking. I'm not exactly scraping, so I'm not waiting for data to come back. I'm generating webp files so I just need the page to launch and go so the server can do what it needs to. So:

Load Session -> Load Tab1 -> Navigate1 -> Load Tab2 -> Navigate2 -> Load Tab3.........

Does any of that wait for a response or will it immediately go to the next function? Am I able to do what I imagine?

 

Link to comment
Share on other sites

Hi Champak,

 

That is possible as long as you can manipulate the flow of the process.

You can do a logic that it will sleep for some few seconds until the element is found before you will proceed with next step.

But if the page that you will open shall remain open then it will cause a huge memory usage that may result to hang-up of your machine.

Link to comment
Share on other sites

So it's not working the way I expect it to. I have all the timeouts set to 0. I expect the navigation to return immediately and not wait for the page to load so I can start the next navigation "immediately" so all of them would be loading pretty much at the same time. Am I doing this correctly, or is this expected and is what you meant by the multithreading issue?

#include "wd_helper.au3"
#include "wd_capabilities.au3"


Global $sDesiredCapabilities = SetupChrome(0)

_WD_Startup()
If @error <> $_WD_ERROR_Success Then MsgBox(0,"Error with chrome startup.",@error)

$sSession = _WD_CreateSession($sDesiredCapabilities)
;ConsoleWrite("Session: " & $sSession & "  -  " & @ScriptLineNumber & @CRLF)

_WD_LoadWait($sSession, 0, 0)

_WD_NewTab($sSession, Default, 0, "https://secure1.inmotionhosting.com/index/login")
;_WD_Navigate($sSession, "https://secure1.inmotionhosting.com/index/login")

_WD_NewTab($sSession, Default, 0, "https://cnn.com")
;_WD_Navigate($sSession, "https://cnn.com")

_WD_NewTab($sSession, Default, 0, "https://msn.com")
;_WD_Navigate($sSession, "https://msn.com")

_WD_DeleteSession($sSession)
_WD_Shutdown()


Func SetupChrome($Headless = 1, $Mobile = 0)

    _WD_Option('Driver', 'chromedriver.exe')
    _WD_Option('Port', 9515)
    _WD_Option('DriverParams', '--verbose --log-path="' & @ScriptDir & '\chrome.log"')
    _WD_CapabilitiesStartup()
    _WD_CapabilitiesAdd('alwaysMatch', 'chrome')
    _WD_CapabilitiesAdd('w3c', True)
    _WD_CapabilitiesAdd('args', "user-data-dir=C:\\Users\\" & @UserName & "\\AppData\\Local\\Google\\Chrome\\User Data\\Profile 1")
    If $Headless = 1 Then _WD_CapabilitiesAdd('args', '--headless')
    If $Mobile = 1 Then _WD_CapabilitiesAdd("mobileEmulation", "deviceName", 'Galaxy S9+')
    _WD_CapabilitiesAdd('excludeSwitches', 'enable-automation')
    _WD_CapabilitiesDump(@ScriptLineNumber) ; dump current Capabilities setting to console - only for testing in this demo
    Local $sDesiredCapabilities = _WD_CapabilitiesGet()
    Return $sDesiredCapabilities

EndFunc   ;==>SetupChrome

 

Edited by Champak
Link to comment
Share on other sites

The timeout in_WD_NewTab doesn't impact navigation. It's simply the amount of time that the function will loop while waiting for the new tab to appear, and this is only applicable when $sFeatures is supplied.

As mentioned above, try setting PageLoadStrategy to None --

_WD_CapabilitiesAdd('PageLoadStrategy', 'none')

This will allow _WD_Navigate to return to the calling routine before the webpage has actually loaded.

Link to comment
Share on other sites

If you check the logs, you will likely see the following --

Quote

     _WD_CapabilitiesAdd ==> Browser or feature not supported [21] : Capability name are case sensitive ( check proper case in $_WD_KEYS__*** for "PageLoadStrategy").      $key = PageLoadStrategy     $value1 = none     $value2 =

The first letter of PageLoadStrategy needs to be lowercase for it to work correctly.

P.S. Some times my attempts to assist actually lead people astray! 😄

Link to comment
Share on other sites

1 hour ago, Danp2 said:

This will allow _WD_Navigate to return to the calling routine before the webpage has actually loaded.

unless you intentionally use:

_WD_LoadWait($sSession)

 

Signature beginning:
Please remember: "AutoIt"..... *  Wondering who uses AutoIt and what it can be used for ? * Forum Rules *
ADO.au3 UDF * POP3.au3 UDF * XML.au3 UDF * IE on Windows 11 * How to ask ChatGPT for AutoIt Codefor other useful stuff click the following button:

Spoiler

Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind. 

My contribution (my own projects): * Debenu Quick PDF Library - UDF * Debenu PDF Viewer SDK - UDF * Acrobat Reader - ActiveX Viewer * UDF for PDFCreator v1.x.x * XZip - UDF * AppCompatFlags UDF * CrowdinAPI UDF * _WinMergeCompare2Files() * _JavaExceptionAdd() * _IsBeta() * Writing DPI Awareness App - workaround * _AutoIt_RequiredVersion() * Chilkatsoft.au3 UDF * TeamViewer.au3 UDF * JavaManagement UDF * VIES over SOAP * WinSCP UDF * GHAPI UDF - modest begining - comunication with GitHub REST APIErrorLog.au3 UDF - A logging Library * Include Dependency Tree (Tool for analyzing script relations) * Show_Macro_Values.au3 *

 

My contribution to others projects or UDF based on  others projects: * _sql.au3 UDF  * POP3.au3 UDF *  RTF Printer - UDF * XML.au3 UDF * ADO.au3 UDF SMTP Mailer UDF * Dual Monitor resolution detection * * 2GUI on Dual Monitor System * _SciLexer.au3 UDF * SciTE - Lexer for console pane

Useful links: * Forum Rules * Forum etiquette *  Forum Information and FAQs * How to post code on the forum * AutoIt Online Documentation * AutoIt Online Beta Documentation * SciTE4AutoIt3 getting started * Convert text blocks to AutoIt code * Games made in Autoit * Programming related sites * Polish AutoIt Tutorial * DllCall Code Generator * 

Wiki: Expand your knowledge - AutoIt Wiki * Collection of User Defined Functions * How to use HelpFile * Good coding practices in AutoIt * 

OpenOffice/LibreOffice/XLS Related: WriterDemo.au3 * XLS/MDB from scratch with ADOX

IE Related:  * How to use IE.au3  UDF with  AutoIt v3.3.14.x * Why isn't Autoit able to click a Javascript Dialog? * Clicking javascript button with no ID * IE document >> save as MHT file * IETab Switcher (by LarsJ ) * HTML Entities * _IEquerySelectorAll() (by uncommon) * IE in TaskSchedulerIE Embedded Control Versioning (use IE9+ and HTML5 in a GUI) * PDF Related:How to get reference to PDF object embeded in IE * IE on Windows 11

I encourage you to read: * Global Vars * Best Coding Practices * Please explain code used in Help file for several File functions * OOP-like approach in AutoIt * UDF-Spec Questions *  EXAMPLE: How To Catch ConsoleWrite() output to a file or to CMD *

I also encourage you to check awesome @trancexx code:  * Create COM objects from modules without any demand on user to register anything. * Another COM object registering stuffOnHungApp handlerAvoid "AutoIt Error" message box in unknown errors  * HTML editor

winhttp.au3 related : * https://www.autoitscript.com/forum/topic/206771-winhttpau3-download-problem-youre-speaking-plain-http-to-an-ssl-enabled-server-port/

"Homo sum; humani nil a me alienum puto" - Publius Terentius Afer
"Program are meant to be read by humans and only incidentally for computers and execute" - Donald Knuth, "The Art of Computer Programming"
:naughty:  :ranting:, be  :) and       \\//_.

Anticipating Errors :  "Any program that accepts data from a user must include code to validate that data before sending it to the data store. You cannot rely on the data store, ...., or even your programming language to notify you of problems. You must check every byte entered by your users, making sure that data is the correct type for its field and that required fields are not empty."

Signature last update: 2023-04-24

Link to comment
Share on other sites

14 hours ago, Danp2 said:

@mLipok Not sure why you quoted me. Were you trying to imply that my statement wasn't accurate?  :unsure:

sorry my previous statement was not accurate.

Here is revised statement:
 

 

18 hours ago, Danp2 said:

This will allow _WD_Navigate to return to the calling routine before the webpage has actually loaded.

From my experience your statement should be extended that it works how you described but unless you intentionally use:

_WD_LoadWait($sSession)


am I right ?
EDIT: Please verify my knowledge in this regards.

btw.
When I write previous statement I was tired (over worked) thus I missed some kind of interpersonal/kidness statement that should be added at start.
I was too direct, sorry.

Edited by mLipok

Signature beginning:
Please remember: "AutoIt"..... *  Wondering who uses AutoIt and what it can be used for ? * Forum Rules *
ADO.au3 UDF * POP3.au3 UDF * XML.au3 UDF * IE on Windows 11 * How to ask ChatGPT for AutoIt Codefor other useful stuff click the following button:

Spoiler

Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind. 

My contribution (my own projects): * Debenu Quick PDF Library - UDF * Debenu PDF Viewer SDK - UDF * Acrobat Reader - ActiveX Viewer * UDF for PDFCreator v1.x.x * XZip - UDF * AppCompatFlags UDF * CrowdinAPI UDF * _WinMergeCompare2Files() * _JavaExceptionAdd() * _IsBeta() * Writing DPI Awareness App - workaround * _AutoIt_RequiredVersion() * Chilkatsoft.au3 UDF * TeamViewer.au3 UDF * JavaManagement UDF * VIES over SOAP * WinSCP UDF * GHAPI UDF - modest begining - comunication with GitHub REST APIErrorLog.au3 UDF - A logging Library * Include Dependency Tree (Tool for analyzing script relations) * Show_Macro_Values.au3 *

 

My contribution to others projects or UDF based on  others projects: * _sql.au3 UDF  * POP3.au3 UDF *  RTF Printer - UDF * XML.au3 UDF * ADO.au3 UDF SMTP Mailer UDF * Dual Monitor resolution detection * * 2GUI on Dual Monitor System * _SciLexer.au3 UDF * SciTE - Lexer for console pane

Useful links: * Forum Rules * Forum etiquette *  Forum Information and FAQs * How to post code on the forum * AutoIt Online Documentation * AutoIt Online Beta Documentation * SciTE4AutoIt3 getting started * Convert text blocks to AutoIt code * Games made in Autoit * Programming related sites * Polish AutoIt Tutorial * DllCall Code Generator * 

Wiki: Expand your knowledge - AutoIt Wiki * Collection of User Defined Functions * How to use HelpFile * Good coding practices in AutoIt * 

OpenOffice/LibreOffice/XLS Related: WriterDemo.au3 * XLS/MDB from scratch with ADOX

IE Related:  * How to use IE.au3  UDF with  AutoIt v3.3.14.x * Why isn't Autoit able to click a Javascript Dialog? * Clicking javascript button with no ID * IE document >> save as MHT file * IETab Switcher (by LarsJ ) * HTML Entities * _IEquerySelectorAll() (by uncommon) * IE in TaskSchedulerIE Embedded Control Versioning (use IE9+ and HTML5 in a GUI) * PDF Related:How to get reference to PDF object embeded in IE * IE on Windows 11

I encourage you to read: * Global Vars * Best Coding Practices * Please explain code used in Help file for several File functions * OOP-like approach in AutoIt * UDF-Spec Questions *  EXAMPLE: How To Catch ConsoleWrite() output to a file or to CMD *

I also encourage you to check awesome @trancexx code:  * Create COM objects from modules without any demand on user to register anything. * Another COM object registering stuffOnHungApp handlerAvoid "AutoIt Error" message box in unknown errors  * HTML editor

winhttp.au3 related : * https://www.autoitscript.com/forum/topic/206771-winhttpau3-download-problem-youre-speaking-plain-http-to-an-ssl-enabled-server-port/

"Homo sum; humani nil a me alienum puto" - Publius Terentius Afer
"Program are meant to be read by humans and only incidentally for computers and execute" - Donald Knuth, "The Art of Computer Programming"
:naughty:  :ranting:, be  :) and       \\//_.

Anticipating Errors :  "Any program that accepts data from a user must include code to validate that data before sending it to the data store. You cannot rely on the data store, ...., or even your programming language to notify you of problems. You must check every byte entered by your users, making sure that data is the correct type for its field and that required fields are not empty."

Signature last update: 2023-04-24

Link to comment
Share on other sites

1 hour ago, mLipok said:

am I right ?

Technically, I would say you are incorrect because the call to _WD_Navigate has completed and returned the calling routine before any possibly execution of _WD_LoadWait (or any other AutoIt command) could occur.

In the current scenario, it wouldn't make sense to include _WD_LoadWait following the _WD_Navigate because the OP is wanting to launch each tab and have them loading simultaneously. IMO, it would have made more sense if you has said "Be sure not to call _WD_LoadWait following _WD_Navigate".

P.S. Thanks for the clarification and no need to apologize. This method of communication can often lead to misunderstandings, which is why I asked for further details from you. 🙂

Link to comment
Share on other sites

33 minutes ago, Danp2 said:

In the current scenario, it wouldn't make sense to include _WD_LoadWait following the _WD_Navigate because the OP is wanting to launch each tab and have them loading simultaneously. IMO, it would have made more sense if you has said "Be sure not to call _WD_LoadWait following _WD_Navigate".

I understood that @TheOne23 main intention is like this:
 

Quote

At start:

  • Create a TAB
  • Navigate to the page - but don't even check if anything has already loaded, i.e. no response from the browser
  • Repeat the previous steps as many times as you need to have pre-opened pages in separate tabs

Now just follow the next steps:

  • Go to the desired tab
  • Check out the content - starting with _WD_Navigate
  • Save the results
  • Initialize next search on the same page - but don't wait for the results
  • Follow the steps above for each next tab
  • Follow the steps above for subsequent inputs (source data)

 

@TheOne23 please confirm that my reasoning fits your needs or not ?
 

@Danp2 Is your understanding of the questioner's intention the same as mine?

 

Edited by mLipok

Signature beginning:
Please remember: "AutoIt"..... *  Wondering who uses AutoIt and what it can be used for ? * Forum Rules *
ADO.au3 UDF * POP3.au3 UDF * XML.au3 UDF * IE on Windows 11 * How to ask ChatGPT for AutoIt Codefor other useful stuff click the following button:

Spoiler

Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind. 

My contribution (my own projects): * Debenu Quick PDF Library - UDF * Debenu PDF Viewer SDK - UDF * Acrobat Reader - ActiveX Viewer * UDF for PDFCreator v1.x.x * XZip - UDF * AppCompatFlags UDF * CrowdinAPI UDF * _WinMergeCompare2Files() * _JavaExceptionAdd() * _IsBeta() * Writing DPI Awareness App - workaround * _AutoIt_RequiredVersion() * Chilkatsoft.au3 UDF * TeamViewer.au3 UDF * JavaManagement UDF * VIES over SOAP * WinSCP UDF * GHAPI UDF - modest begining - comunication with GitHub REST APIErrorLog.au3 UDF - A logging Library * Include Dependency Tree (Tool for analyzing script relations) * Show_Macro_Values.au3 *

 

My contribution to others projects or UDF based on  others projects: * _sql.au3 UDF  * POP3.au3 UDF *  RTF Printer - UDF * XML.au3 UDF * ADO.au3 UDF SMTP Mailer UDF * Dual Monitor resolution detection * * 2GUI on Dual Monitor System * _SciLexer.au3 UDF * SciTE - Lexer for console pane

Useful links: * Forum Rules * Forum etiquette *  Forum Information and FAQs * How to post code on the forum * AutoIt Online Documentation * AutoIt Online Beta Documentation * SciTE4AutoIt3 getting started * Convert text blocks to AutoIt code * Games made in Autoit * Programming related sites * Polish AutoIt Tutorial * DllCall Code Generator * 

Wiki: Expand your knowledge - AutoIt Wiki * Collection of User Defined Functions * How to use HelpFile * Good coding practices in AutoIt * 

OpenOffice/LibreOffice/XLS Related: WriterDemo.au3 * XLS/MDB from scratch with ADOX

IE Related:  * How to use IE.au3  UDF with  AutoIt v3.3.14.x * Why isn't Autoit able to click a Javascript Dialog? * Clicking javascript button with no ID * IE document >> save as MHT file * IETab Switcher (by LarsJ ) * HTML Entities * _IEquerySelectorAll() (by uncommon) * IE in TaskSchedulerIE Embedded Control Versioning (use IE9+ and HTML5 in a GUI) * PDF Related:How to get reference to PDF object embeded in IE * IE on Windows 11

I encourage you to read: * Global Vars * Best Coding Practices * Please explain code used in Help file for several File functions * OOP-like approach in AutoIt * UDF-Spec Questions *  EXAMPLE: How To Catch ConsoleWrite() output to a file or to CMD *

I also encourage you to check awesome @trancexx code:  * Create COM objects from modules without any demand on user to register anything. * Another COM object registering stuffOnHungApp handlerAvoid "AutoIt Error" message box in unknown errors  * HTML editor

winhttp.au3 related : * https://www.autoitscript.com/forum/topic/206771-winhttpau3-download-problem-youre-speaking-plain-http-to-an-ssl-enabled-server-port/

"Homo sum; humani nil a me alienum puto" - Publius Terentius Afer
"Program are meant to be read by humans and only incidentally for computers and execute" - Donald Knuth, "The Art of Computer Programming"
:naughty:  :ranting:, be  :) and       \\//_.

Anticipating Errors :  "Any program that accepts data from a user must include code to validate that data before sending it to the data store. You cannot rely on the data store, ...., or even your programming language to notify you of problems. You must check every byte entered by your users, making sure that data is the correct type for its field and that required fields are not empty."

Signature last update: 2023-04-24

Link to comment
Share on other sites

6 minutes ago, mLipok said:

I understood that @TheOne23 main intention is like this:

btw.
I have exact such solution made for my clients, to extract data from several websites in one browser in many tabs at once.
So I already have some experience with it.
But so far I've always waited for a given website content to be fully loaded in a given tab.

Edited by mLipok

Signature beginning:
Please remember: "AutoIt"..... *  Wondering who uses AutoIt and what it can be used for ? * Forum Rules *
ADO.au3 UDF * POP3.au3 UDF * XML.au3 UDF * IE on Windows 11 * How to ask ChatGPT for AutoIt Codefor other useful stuff click the following button:

Spoiler

Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind. 

My contribution (my own projects): * Debenu Quick PDF Library - UDF * Debenu PDF Viewer SDK - UDF * Acrobat Reader - ActiveX Viewer * UDF for PDFCreator v1.x.x * XZip - UDF * AppCompatFlags UDF * CrowdinAPI UDF * _WinMergeCompare2Files() * _JavaExceptionAdd() * _IsBeta() * Writing DPI Awareness App - workaround * _AutoIt_RequiredVersion() * Chilkatsoft.au3 UDF * TeamViewer.au3 UDF * JavaManagement UDF * VIES over SOAP * WinSCP UDF * GHAPI UDF - modest begining - comunication with GitHub REST APIErrorLog.au3 UDF - A logging Library * Include Dependency Tree (Tool for analyzing script relations) * Show_Macro_Values.au3 *

 

My contribution to others projects or UDF based on  others projects: * _sql.au3 UDF  * POP3.au3 UDF *  RTF Printer - UDF * XML.au3 UDF * ADO.au3 UDF SMTP Mailer UDF * Dual Monitor resolution detection * * 2GUI on Dual Monitor System * _SciLexer.au3 UDF * SciTE - Lexer for console pane

Useful links: * Forum Rules * Forum etiquette *  Forum Information and FAQs * How to post code on the forum * AutoIt Online Documentation * AutoIt Online Beta Documentation * SciTE4AutoIt3 getting started * Convert text blocks to AutoIt code * Games made in Autoit * Programming related sites * Polish AutoIt Tutorial * DllCall Code Generator * 

Wiki: Expand your knowledge - AutoIt Wiki * Collection of User Defined Functions * How to use HelpFile * Good coding practices in AutoIt * 

OpenOffice/LibreOffice/XLS Related: WriterDemo.au3 * XLS/MDB from scratch with ADOX

IE Related:  * How to use IE.au3  UDF with  AutoIt v3.3.14.x * Why isn't Autoit able to click a Javascript Dialog? * Clicking javascript button with no ID * IE document >> save as MHT file * IETab Switcher (by LarsJ ) * HTML Entities * _IEquerySelectorAll() (by uncommon) * IE in TaskSchedulerIE Embedded Control Versioning (use IE9+ and HTML5 in a GUI) * PDF Related:How to get reference to PDF object embeded in IE * IE on Windows 11

I encourage you to read: * Global Vars * Best Coding Practices * Please explain code used in Help file for several File functions * OOP-like approach in AutoIt * UDF-Spec Questions *  EXAMPLE: How To Catch ConsoleWrite() output to a file or to CMD *

I also encourage you to check awesome @trancexx code:  * Create COM objects from modules without any demand on user to register anything. * Another COM object registering stuffOnHungApp handlerAvoid "AutoIt Error" message box in unknown errors  * HTML editor

winhttp.au3 related : * https://www.autoitscript.com/forum/topic/206771-winhttpau3-download-problem-youre-speaking-plain-http-to-an-ssl-enabled-server-port/

"Homo sum; humani nil a me alienum puto" - Publius Terentius Afer
"Program are meant to be read by humans and only incidentally for computers and execute" - Donald Knuth, "The Art of Computer Programming"
:naughty:  :ranting:, be  :) and       \\//_.

Anticipating Errors :  "Any program that accepts data from a user must include code to validate that data before sending it to the data store. You cannot rely on the data store, ...., or even your programming language to notify you of problems. You must check every byte entered by your users, making sure that data is the correct type for its field and that required fields are not empty."

Signature last update: 2023-04-24

Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
 Share

  • Recently Browsing   0 members

    • No registered users viewing this page.
×
×
  • Create New...