WebDriver
The W3C WebDriver API is a platform and language-neutral interface and wire protocol allowing programs or scripts to control the behavior of a web browser.
Introduction
WebDriver API
WebDriver enables developers to create automated tests that simulate user interaction. This is different from JavaScript unit tests because WebDriver has access to functionality and information that JavaScript running in the browser doesn't, and it can more accurately simulate user events or OS-level events. WebDriver can also manage testing across multiple windows, tabs and webpages in a single test session.
WebDriver UDF
The WebDriver UDF allows you to interact with any browser that supports the W3C WebDriver specification. Supporting multiple browsers from the same code base is now possible with just a few configuration settings.
Requirements
(Last modified: 2022/01/25)
The following UDFs need to be installed, regardless of which browser you want to automate:
- JSON UDF (AutoIt) by Ward and Jos
- WinHTTP UDF (AutoIt) by trancexx or autoit-winhttp (GitHub)
- WebDriver UDF (GitHub) by Danp2
One of the following drivers needs to be installed, depending on the type and version of the browser you want to automate:
Browser | Download Link | Comments |
---|---|---|
Chrome | Follow this link to select the correct version depending on the Chrome version you run! | |
Edge | Microsoft | |
Firefox | GitHub | Firefox version ≥ 60 is recommended. Note: You must still have the Microsoft Visual Studio redistributable runtime installed on your system for the binary to run. This is a known bug in version 0.26 which the authors weren't able to fix for this release. |
Opera | GitHub | The versioning of OperaDriver matches the version of Chromium on which the Opera browser is based. |
Limitations
(Last modified: 2020/01/28)
Not every browser implements all WebDriver functions. To check the status, go to the corresponding website below:
- Chrome
- Edge
- Firefox
- Opera: "OperaChromiumDriver is a WebDriver implementation derived from ChromeDriver and adapted by Opera". That's why I think it has at least the same limitations as ChromeDriver.
Big Picture
How the browser independent and browser dependent parts fit together:
Used Terms
(Last modified: 2021/07/28)
You will come across the following terms when using WebDriver. This section tries to shed some light on them:
CDP (Chrome DevTools Protocol)
A protocol that allows tools to instrument, inspect, debug and profile Chromium, Chrome and other Blink-based browsers.
Marionette
Marionette is an automation driver for Mozilla’s Gecko engine. It can remotely control either the UI or the internal JavaScript of a Gecko platform, such as Firefox. It can control both the chrome (i.e. menus and functions) and the content (the webpage loaded inside the browsing context), giving a high level of control and the ability to replicate user actions. In addition to performing actions on the browser, Marionette can also read the properties and attributes of the DOM.
Marionette consists of two parts: a server which takes requests and executes them in Gecko (the Marionette server ships with Firefox), and a client (the Marionette client ships with the GeckoDriver exe). The client sends commands to the server and the server executes the command inside the browser.
For details please visit this site.
ShadowRoot
The ShadowRoot interface of the Shadow DOM API is the root node of a DOM sub tree that is rendered separately from a document's main DOM tree.
For details please visit this site.
Installation
(Last modified: 2022/02/02)
To automate your browser the following installation steps are needed:
- Download the files listed in section "Requirements"
- Move the UDFs to a directory where SciTE and AutoIt can find them:
  - wd_core.au3, wd_helper.au3, wd_cdp.au3, wd_capabilities.au3 from the WebDriver UDF
  - Json.au3 and BinaryCall.au3 from the JSON UDF
  - WinHttp.au3 and WinHttpConstants.au3 from the WinHTTP UDF
- Move the browser-dependent WebDriver executable to the same directory as WD_Demo.au3:
  - chromedriver.exe (Chrome)
  - geckodriver.exe (Firefox)
  - msedgedriver.exe (Edge - Chromium) or MicrosoftWebDriver.exe (Edge - EdgeHTML)
  - or select the "Update" option from the corresponding ComboBox in WD_Demo.au3
- Run WD_Demo.au3 and select "DemoNavigation" to validate the installation.
The result (for Firefox) displayed in the DOS window should be similar to the following:
1577745813519 geckodriver DEBUG Listening on 127.0.0.1:4444
1577745813744 webdriver::server DEBUG -> POST /session {"capabilities": {"alwaysMatch": {"browserName": "firefox", "acceptInsecureCerts":true}}}
1577745813746 geckodriver::capabilities DEBUG Trying to read firefox version from ini files
1577745813747 geckodriver::capabilities DEBUG Found version 71.0
1577745813757 mozrunner::runner INFO Running command: "C:\\Program Files\\Mozilla Firefox\\firefox.exe" "-marionette" "-foreground" "-no-remote" "-profile" "C:\\ ...
1577745813783 geckodriver::marionette DEBUG Waiting 60s to connect to browser on 127.0.0.1:55184
1577745817392 geckodriver::marionette DEBUG Connection to Marionette established on 127.0.0.1:55184.
1577745817464 webdriver::server DEBUG <- 200 OK {"value":{"sessionId":"925641bf-6c5d-4fe2-a985-02de9b1c7c74","capabilities":"acceptInsecureCerts":true,"browserName":"firefox", ...
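If you prefer to validate the installation from your own script instead of WD_Demo.au3, the following is a minimal sketch for Firefox. It assumes that geckodriver.exe is located in the script directory and that the UDFs can be found by AutoIt. The port and the capabilities string are taken from the log above; the URL is only an arbitrary example.

```autoit
#include "wd_core.au3"

; Configure geckodriver. Port 4444 matches the "Listening on 127.0.0.1:4444" log line.
_WD_Option('Driver', 'geckodriver.exe')
_WD_Option('DriverParams', '--log trace')
_WD_Option('Port', 4444)

; Same capabilities as in the "POST /session" request shown in the log.
Local $sCapabilities = '{"capabilities": {"alwaysMatch": {"browserName": "firefox", "acceptInsecureCerts":true}}}'

; Start the driver executable and stop if that fails.
_WD_Startup()
If @error <> $_WD_ERROR_Success Then Exit 1

; Create a session, open a page (example URL, replace as needed) and clean up again.
Local $sSession = _WD_CreateSession($sCapabilities)
If @error = $_WD_ERROR_Success Then
	_WD_Navigate($sSession, "https://www.autoitscript.com/")
	Sleep(2000)
	_WD_DeleteSession($sSession)
EndIf

_WD_Shutdown()
```

If the browser opens, loads the page and closes again, the UDFs and the driver are set up correctly.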
Function reference
(Last modified: 2021/12/09)
The functions are now documented in the CHM help file that comes with the UDF.
Capabilities
WebDriver capabilities are used to communicate the features supported by a given implementation. The local end may use capabilities to define which features it requires the remote end to satisfy when creating a new session. Likewise, the remote end uses capabilities to describe the full feature set for a session.
Details can be found at WebDriver Capabilities (sub page).
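As a rough illustration of what such a capabilities declaration looks like, here is a small sketch that builds a minimal W3C capabilities string. The helper name _BuildCapabilities is hypothetical and not part of the UDF; only the "capabilities", "alwaysMatch", "browserName" and "acceptInsecureCerts" keys come from the WebDriver specification.

```autoit
; Hypothetical helper (not part of the WebDriver UDF):
; returns a minimal W3C capabilities string for the given browser name.
Func _BuildCapabilities($sBrowserName, $bAcceptInsecureCerts = True)
	; JSON booleans are lower case, so translate the AutoIt boolean by hand.
	Local $sInsecure = $bAcceptInsecureCerts ? "true" : "false"

	; "alwaysMatch" lists the features the remote end must satisfy for the session.
	Return '{"capabilities": {"alwaysMatch": {' & _
			'"browserName": "' & $sBrowserName & '", ' & _
			'"acceptInsecureCerts": ' & $sInsecure & '}}}'
EndFunc   ;==>_BuildCapabilities

; Show the generated string; it can be passed to _WD_CreateSession().
ConsoleWrite(_BuildCapabilities("firefox") & @CRLF)
```

Browser-specific settings (for example "goog:chromeOptions" or "moz:firefoxOptions") would go inside "alwaysMatch" as well, as shown in the FAQ examples.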
Google Chrome
ChromeDriver supports "Chrome DevTools Protocol" (CDP) commands (for an explanation of the term and related links please see the Used Terms section).
Details can be found in the CHM Help file that comes with the UDF.
Troubleshooting
Debug the WebDriver setup
(Last modified: 2020/07/06)
WinHTTP UDF
Make sure that you are running at least version 1.6.4.2 (currently unreleased, but can be obtained here).
Chrome
Problem | Solution | Reference |
---|---|---|
When running WD_Demo.au3 it does not start up Chrome and does not display the DOS window for chromedriver. When you manually run the chromedriver in a DOS window you get message "[0.023][SEVERE]: CreatePlatformSocket() returned an error: An invalid argument was supplied." | This could be caused by missing execution permission for the network drive. Please ask your IT admin for "Applocker" or "application directory whitelisting". Or run the chrome driver from a local HDD and call _WD_Option to set the location of the webdriver executable. Example: _WD_Option("Driver", "C:\Local\WebDriver\chromedriver.exe") | Stackoverflow |
Firefox
Problem | Solution | Reference |
---|---|---|
When running WD_Demo.au3 it does not start up Firefox and does not display the DOS window for geckodriver. When you manually run the geckodriver in a DOS window you get message "geckodriver: error: An invalid argument was supplied. (os error 10022)" | This could be caused by missing execution permission for the network drive. Please ask your IT admin for "Applocker" or "application directory whitelisting". Or run the gecko driver from a local HDD and call _WD_Option to set the location of the webdriver executable. Example: _WD_Option("Driver", "C:\Local\WebDriver\geckodriver.exe") | Stackoverflow |
Debug your Script
FAQ
(Last modified: 2022/02/04)
A: That's described (for Firefox, but should work similar for other browsers) in this post.
Q: How can I hide the WebDriver console window?
A: The console can be completely hidden from the start by adding the following line near the beginning of your script:
$_WD_DEBUG = $_WD_DEBUG_None ; You could also use $_WD_DEBUG_Error
Q: How can I use an existing user profile instead of a new default one?
A: This is controlled by your "capabilities" declaration; each browser implements it in a different way. Here are some examples:
Chrome
$sDesiredCapabilities = '{"capabilities": {"alwaysMatch": {"goog:chromeOptions": {"w3c": true, "args":["--user-data-dir=C:\\Users\\' & @UserName & '\\AppData\\Local\\Google\\Chrome\\User Data\\", "--profile-directory=Default"]}}}}'
Edge
$sDesiredCapabilities = '{"capabilities": {"alwaysMatch": {"ms:edgeOptions": {"args": ["user-data-dir=C:\\Users\\' & @UserName & '\\AppData\\Local\\Microsoft\\Edge\\User Data\\", "profile-directory=Default"]}}}}'
Firefox
$sDesiredCapabilities = '{"capabilities":{"alwaysMatch": {"moz:firefoxOptions": {"args": ["-profile", "' & GetDefaultFFProfile() & '"],"log": {"level": "trace"}}}}}'
Func GetDefaultFFProfile()
	Local $sDefault, $sProfilePath = ''
	Local $sProfilesPath = StringReplace(@AppDataDir, '\', '/') & "/Mozilla/Firefox/"
	Local $sFilename = $sProfilesPath & "profiles.ini"
	Local $aSections = IniReadSectionNames($sFilename)
	If Not @error Then
		For $i = 1 To $aSections[0]
			$sDefault = IniRead($sFilename, $aSections[$i], 'Default', '0')
			If $sDefault = '1' Then
				$sProfilePath = $sProfilesPath & IniRead($sFilename, $aSections[$i], "Path", "")
				ExitLoop
			EndIf
		Next
	EndIf
	Return $sProfilePath
EndFunc
_WD_Option('DriverParams', '--marionette-port 2828')
Q: How can I specify the location of the browser executable?
A: This is controlled by your "capabilities" declaration. Here are some examples:
Chrome
$sDesiredCapabilities = '{"capabilities": {"alwaysMatch": {"goog:chromeOptions": {"w3c": true, "binary":"C:\\Path\\To\\Alternate\\Browser\\chrome.exe" }}}}'
Firefox
$sDesiredCapabilities = '{"desiredCapabilities":{"javascriptEnabled":true,"nativeEvents":true,"acceptInsecureCerts":true,"moz:firefoxOptions":{"binary":"C:\\Path\\To\\Alternate\\Browser\\firefox.exe"}}}'
_WD_Option('DriverParams', '--binary "C:\Program Files\Mozilla Firefox\firefox.exe" --log trace ')
Q: How can I maximize the browser window?
A: Simply call the following function:
_WD_Window($sSession, "Maximize")
Q: How can I specify the location of the WebDriver executable?
A: This is controlled by the function _WD_Option. Example:
_WD_Option("Driver", "C:\local\WebDriver\WebDriver.exe")
Q: How can I retrieve the entries of a drop-down list?
A1: Here's a simple way to do it:
$sElement = _WD_FindElement($sSession, $_WD_LOCATOR_ByXPath, "//select[@name='xxx']")
$sText = _WD_ElementAction($sSession, $sElement, 'property', 'innerText')
$aOptions = StringSplit ( $sText, @LF, $STR_NOCOUNT)
_ArrayDisplay($aOptions)
'xxx' is the name of the drop-down list.
A2: This can now be accomplished using the function _WD_ElementSelectAction:
$sElement = _WD_FindElement($sSession, $_WD_LOCATOR_ByXPath, "//select[@name='xxx']")
$aOptions = _WD_ElementSelectAction ($sSession, $sElement, 'options')
_ArrayDisplay($aOptions)
Q: How can I run the browser in headless mode?
A: This is controlled by the capabilities string that is passed to _WD_CreateSession. Example:
$sDesiredCapabilities = '{"capabilities": {"alwaysMatch": {"goog:chromeOptions": {"w3c": true, "args": ["--headless", "--allow-running-insecure-content"] }}}}'
Q: How to configure the UDF to call a user-defined Sleep function, and interact with _WD_WaitElement() and _WD_LoadWait() to make the script more responsive?
A: Use _WD_Option("Sleep") to register your own function. Example:
#include <ButtonConstants.au3>
#include <GuiComboBoxEx.au3>
#include <GUIConstantsEx.au3>
#include <MsgBoxConstants.au3>
#include <WindowsConstants.au3>
#include "wd_helper.au3"

Global $idAbortTest
Global $WD_SESSION

_Example()

Func _Example()
	SetupChrome()

	; Create a GUI with various controls.
	Local $hGUI = GUICreate("Example")
	Local $idTest = GUICtrlCreateButton("Test", 10, 370, 85, 25)
	$idAbortTest = GUICtrlCreateButton("Abort", 150, 370, 85, 25)

	; Display the GUI.
	GUISetState(@SW_SHOW, $hGUI)
	ConsoleWrite("- TESTING" & @CRLF)
	Local $sFilePath = _WriteTestHtml()

	; Loop until the user exits.
	While 1
		Switch GUIGetMsg()
			Case $idTest
				_WD_Navigate($WD_SESSION, $sFilePath)
				_WD_WaitElement($WD_SESSION, $_WD_LOCATOR_ByXPath, '//a[contains(text(),"TEST")]', 100, 30 * 1000) ; delay = 100 ms, timeout = 30 seconds
				ConsoleWrite("---> @error=" & @error & " @extended=" & @extended & _
						" : after _WD_WaitElement()" & @CRLF)
			Case $GUI_EVENT_CLOSE
				ExitLoop
		EndSwitch
	WEnd

	; Delete the previous GUI and all controls.
	GUIDelete($hGUI)

	; Close the browser session and stop the driver.
	_WD_DeleteSession($WD_SESSION)
	_WD_Shutdown()
EndFunc   ;==>_Example

; User-defined Sleep function: keeps the GUI responsive while the UDF waits.
Func _My_Sleep($iDelay)
	Local $hTimer = TimerInit() ; Begin the timer and store the handle in a variable.
	Do
		Switch GUIGetMsg()
			Case $GUI_EVENT_CLOSE
				ConsoleWrite("! USER EXIT" & @CRLF)
				Exit
			Case $idAbortTest
				; Tell the calling UDF function that the user aborted the wait.
				Return SetError($_WD_ERROR_UserAbort)
		EndSwitch
	Until TimerDiff($hTimer) > $iDelay
EndFunc   ;==>_My_Sleep

; Writes a test page whose link appears after 20 seconds and returns its file URL.
Func _WriteTestHtml($sFilePath = @ScriptDir & "\TestFile.html")
	FileDelete($sFilePath)
	Local Const $sHtml = _
			"<html lang='en'>" & @CRLF & _
			" <head>" & @CRLF & _
			" <meta charset='utf-8'>" & @CRLF & _
			" <title>TESTING</title>" & @CRLF & _
			" </head>" & @CRLF & _
			" <body>" & @CRLF & _
			" <div id='MyLink'>Waiting</div>" & @CRLF & _
			" </body>" & @CRLF & _
			" <script type='text/javascript'>" & @CRLF & _
			" setTimeout(function()" & @CRLF & _
			" {" & @CRLF & _
			" // Delayed code in here" & @CRLF & _
			" document.getElementById('MyLink').innerHTML='<a>TESTING</a>';" & @CRLF & _
			" }, 20000); // 20000 = 20 seconds" & @CRLF & _
			" </script>" & @CRLF & _
			"</html>"
	FileWrite($sFilePath, $sHtml)
	Return "file:///" & StringReplace($sFilePath, "\", "/")
EndFunc   ;==>_WriteTestHtml

Func SetupChrome()
	_WD_Startup()
	_WD_Option('Driver', 'chromedriver.exe')
	_WD_Option('Port', 9515)
	_WD_Option('HTTPTimeouts', True)
	_WD_Option('DefaultTimeout', 40001)
	_WD_Option('DriverParams', '--verbose --log-path="' & @ScriptDir & '\chrome.log"')
	_WD_Option("Sleep", _My_Sleep) ; register the user-defined Sleep function
	Local $sCapabilities = '{"capabilities": {"alwaysMatch": {"goog:chromeOptions": {"w3c": true, "excludeSwitches": [ "enable-automation"]}}}}'
	$WD_SESSION = _WD_CreateSession($sCapabilities)
	_WD_Timeouts($WD_SESSION, 40002)
EndFunc   ;==>SetupChrome
Q: How can I keep my WebDriver environment up-to-date?
WebDriver UDF: Function _WD_IsLatestRelease compares the local UDF version to the latest release on GitHub. It returns True if the local UDF version is the latest, otherwise False. If you need to update the UDF, you have to download it manually.
WebDriver Exe: Function _WD_UpdateDriver checks whether a newer version of the WebDriver executable is available and, if so, updates it.
Browser: Function _WD_GetBrowserVersion returns the version number of the specified browser. If you need to update the browser, you have to download and install it by hand.
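The three checks can be combined in a short script. This is only a sketch under some assumptions: Chrome is the browser in use, chromedriver.exe lives in the script directory, and the optional parameters of _WD_UpdateDriver are left at their defaults. Please check the CHM help file for the exact parameters and return values of your UDF version.

```autoit
#include "wd_helper.au3"

Local $iErr

; UDF: compare the local UDF version with the latest release on GitHub.
Local $bLatest = _WD_IsLatestRelease()
$iErr = @error
ConsoleWrite("UDF is latest release: " & $bLatest & " (@error=" & $iErr & ")" & @CRLF)

; Browser: report the installed Chrome version.
Local $sVersion = _WD_GetBrowserVersion('chrome')
$iErr = @error
ConsoleWrite("Chrome version: " & $sVersion & " (@error=" & $iErr & ")" & @CRLF)

; WebDriver Exe: download a newer chromedriver.exe if one is available.
Local $bUpdated = _WD_UpdateDriver('chrome')
$iErr = @error
ConsoleWrite("Driver updated: " & $bUpdated & " (@error=" & $iErr & ")" & @CRLF)
```

Storing @error in a variable right after each call avoids losing it when further functions are called.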
Q: What is a "Locator strategy"?
A: Locator strategies are a way to find elements in the HTML DOM. They tell the remote end which method to use to find an element with the provided selector. Locator strategies are used in _WD_FindElement() from the wd_core.au3 UDF and in all functions from wd_helper.au3 that rely on it.
Q: What is a Selector?
A: A selector is a string that tells the chosen "Locator strategy" how to find the element.
Q: Which "Locator strategies" can be used with the WebDriver UDF?
A: This UDF supports all locator strategies defined in the WebDriver specification. The predefined constants are listed below:
Locator strategy | How to use the "Selector" |
---|---|
$_WD_LOCATOR_ByCSSSelector | String with a CSS selector expression that describes how to find the element in the HTML DOM. For more information check the links below. |
$_WD_LOCATOR_ByXPath | String with an XPath expression that describes how to find the element in the HTML DOM. For more information check the links below. |
$_WD_LOCATOR_ByLinkText | String with the exact text of the <a> element to locate |
$_WD_LOCATOR_ByPartialLinkText | String with a part of the text of the <a> element to locate |
$_WD_LOCATOR_ByTagName | String with the exact tag name of the HTML element, for example "button" for this element: <button name="ClickMe"> |
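To make the difference between the strategies more tangible, here is a small sketch that tries each of them on the same page. The helper name _LocatorDemo is hypothetical, and the selectors are assumptions: they match the test page written by _WriteTestHtml() in the Sleep example above once its link has appeared, so adapt them to your own page.

```autoit
#include "wd_core.au3"

; Hypothetical helper: tries every locator strategy on the page loaded in $sSession.
Func _LocatorDemo($sSession)
	; Each row holds a strategy constant and a selector for it (selectors are assumptions).
	Local $aTests[5][2] = [ _
			[$_WD_LOCATOR_ByCSSSelector, "#MyLink"], _
			[$_WD_LOCATOR_ByXPath, "//div[@id='MyLink']"], _
			[$_WD_LOCATOR_ByLinkText, "TESTING"], _
			[$_WD_LOCATOR_ByPartialLinkText, "TEST"], _
			[$_WD_LOCATOR_ByTagName, "a"]]

	Local $sElement, $iErr
	For $i = 0 To UBound($aTests) - 1
		; On success _WD_FindElement returns the element ID, otherwise @error is set.
		$sElement = _WD_FindElement($sSession, $aTests[$i][0], $aTests[$i][1])
		$iErr = @error
		ConsoleWrite($aTests[$i][0] & " -> " & $aTests[$i][1] & _
				" : @error=" & $iErr & " element=" & $sElement & @CRLF)
	Next
EndFunc   ;==>_LocatorDemo
```

Call _LocatorDemo($sSession) after _WD_Navigate() with a valid session to see which selectors find an element.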
Q: Where can I find information about XPath usage?
A: https://www.w3.org/TR/1999/REC-xpath-19991116/
A: https://developer.mozilla.org/en-US/docs/Web/XPath
A: ...
Q: Where can I find information about CSS selector usage?
A: https://www.w3.org/TR/CSS21/selector.html#id-selectors
A: https://www.w3schools.com/cssref/css_selectors.asp
A: https://developer.mozilla.org/en-US/docs/Learn/CSS/Building_blocks/Selectors
Q: How can I check XPath and CSS selectors in the browser?
A: Work in progress....
Q: How can I improve my work with XPath and CSS selectors?
A: Take a look at the additional tools listed below.
Tools
The following tools will help you to automate your browser:
- ChroPath plugin: Makes finding an element by XPath, ID or CSS incredibly easy (Chrome, Firefox, Opera)
- SelectorsHub plugin: Next Gen XPath tool to generate, write and verify the XPath and cssSelectors (All browsers)
References
(Last modified: 2022/01/24)
Further information sources:
- W3C: https://www.w3.org/TR/webdriver/
- AutoIt WebDriver threads:
  - Danp2's thread in the Example Scripts forum - WebDriver UDF (W3C compliant version)
  - Danp2's closed thread in the General Help and Support forum - WebDriver UDF - Help & Support (this is part I)
  - Danp2's closed thread in the General Help and Support II forum - WebDriver UDF - Help & Support (II)
  - Danp2's active thread in the General Help and Support III forum - WebDriver UDF - Help & Support (III)
  - Water's thread in the General Help and Support forum - Discussion about this Wiki Page
  - Water's thread in the Example Scripts forum - WebDriver example scripts collection
  - Danp2's thread in the AutoIt Projects and Collaboration forum - Webdriver, Websockets, and Chrome DevTools Protocol
- WebDriver Exe documentation: