Selenium 4

A Five-Minute Guide to WebDriver APIs

Veera.
4 min read · Apr 22, 2023

Selenium 4 is a leap forward in browser automation: it brings groundbreaking changes and lays a solid foundation for future developments, such as the move towards WebDriver BiDi alongside Chrome DevTools Protocol (CDP) support.

Looking back at its evolution, Selenium started off with the JSON Wire Protocol as its standard protocol; later, up to Selenium 3.11, that protocol was used alongside the WebDriver APIs. Meanwhile, the WebDriver APIs were built to comply with the W3C standard, so later releases of Selenium settled on the WebDriver APIs as the standard protocol for automating browsers and, in parallel, deprecated the JSON Wire Protocol.

Given this, my aim in this post is to dissect the internals of Selenium 4 by exploring the WebDriver APIs and using them to interact with a browser directly.

WebDriver is an API and protocol that defines a language-neutral interface for controlling the behaviour of web browsers. Each browser is backed by a specific WebDriver implementation, called a driver. The driver is the component responsible for delegating down to the browser, and handles communication to and from Selenium and the browser.

[Figure: Selenium 4]

As you can see, the driver sits between the tests and the browser and plays a crucial role in interpreting and relaying the requests that come in from the tests to a browser instance. Once the browser has carried out a request, it sends the response back to the tests through the same driver, and the tests can then act on that response as they intend.

To try this out, let's start ChromeDriver by executing the following command, assuming that the executable is in the current directory.

# macOS
./chromedriver
# Windows
# chromedriver.exe
For Chrome, the driver listens on port 9515 by default; the default port differs from driver to driver. Once the server has started and is listening for requests from the tests (the clients), I'll send it one request after another to create a new session and to simulate some interactions on the UI.
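
Before sending any commands, it can help to confirm that the driver really is up. The WebDriver protocol defines a GET /status endpoint for this. Below is a minimal Python sketch, assuming the default ChromeDriver address used in this post; the function names are my own and purely illustrative.

```python
import json
import urllib.request

# Assumption: ChromeDriver is running locally on its default port.
DRIVER_URL = "http://localhost:9515"


def build_status_request(base_url: str = DRIVER_URL) -> urllib.request.Request:
    """Build (but do not send) a GET request for the driver's /status endpoint."""
    return urllib.request.Request(f"{base_url}/status", method="GET")


def is_ready(response_body: str) -> bool:
    """Return True when the status payload reports that the driver is ready."""
    value = json.loads(response_body).get("value") or {}
    return bool(value.get("ready", False))


def check_driver(base_url: str = DRIVER_URL) -> bool:
    """Send the status request; needs a running driver, raises URLError otherwise."""
    with urllib.request.urlopen(build_status_request(base_url), timeout=5) as resp:
        return is_ready(resp.read().decode("utf-8"))
```

Calling check_driver() while ChromeDriver is running should return True; the parsing is kept separate from the network call so it can be exercised without a driver.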
First, a POST request creates a new session. Any capability can be passed in the payload; the curl below uses a simplified payload that creates a Chrome browser session.
curl --location 'http://localhost:9515/session' \
--header 'Content-Type: application/json' \
--data '{
"capabilities": {
"alwaysMatch": {
"browserName": "chrome"
}
}
}'

Upon firing this request, the browser gets launched. Look for the sessionId property in the response: it is a 32-character alphanumeric string, and it is required in order to reuse the same session for further interactions with the newly created browser.
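
As a rough illustration of picking the sessionId out of the response, here is a small Python sketch. It assumes the W3C response shape, in which the result is wrapped in a top-level "value" object; the sample body is a made-up, trimmed example rather than real driver output.

```python
import json


def extract_session_id(response_body: str) -> str:
    """Return the sessionId from a new-session response body."""
    value = json.loads(response_body).get("value") or {}
    session_id = value.get("sessionId")
    if not session_id:
        raise ValueError("no sessionId found in the new-session response")
    return session_id


# Hypothetical response, reduced to the fields that matter here.
sample_response = json.dumps({
    "value": {
        "sessionId": "0123456789abcdef0123456789abcdef",
        "capabilities": {"browserName": "chrome"},
    }
})

session_id = extract_session_id(sample_response)
```

The extracted value is what replaces :sessionId in every request that follows.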

Next, a POST request opens a URL in the browser. The key point to notice is the path parameter :sessionId in the request URL, which has to be replaced with the sessionId value returned in the previous response.

curl --location 'http://localhost:9515/session/:sessionId/url' \
--header 'Content-Type: application/json' \
--data '{
"url": "https://github.com/5v1988/qa-wdio-js"
}'

This request opens the URL https://github.com/5v1988/qa-wdio-js in the browser tab and returns 200 OK with a null value in the response body, which carries no significance for this action.
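
Replacing placeholders such as :sessionId by hand gets tedious once several requests are chained. The following Python sketch shows one possible way to fill them in; fill_path is a hypothetical helper of mine, not part of Selenium or ChromeDriver.

```python
import re
from urllib.parse import quote


def fill_path(template: str, **params: str) -> str:
    """Replace each '/:name' placeholder with the matching keyword argument."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in params:
            raise KeyError(f"missing value for path parameter ':{name}'")
        # Percent-encode the value so it stays a single path segment.
        return "/" + quote(str(params[name]), safe="")

    # Only '/:' followed by a name is matched, so 'http://' and ':9515' are left alone.
    return re.sub(r"/:([A-Za-z]\w*)", substitute, template)


url = fill_path(
    "http://localhost:9515/session/:sessionId/url",
    sessionId="0123456789abcdef0123456789abcdef",
)
```

The same helper works for URLs that carry both :sessionId and :elementId.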

A POST request locates an element using a CSS selector; when the element is found, the response contains an element id.

curl --location 'http://localhost:9515/session/:sessionId/element' \
--header 'Content-Type: application/json' \
--data '{
"using": "css selector",
"value": "input[aria-label='\''Search'\'']"
}'

The above curl locates the search text box on the GitHub repository page and, on success, returns an element id: a hyphenated string of 32 alphanumeric characters.
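
In a W3C-compliant response, the element id is not returned under a friendly name such as elementId; it sits under the fixed key element-6066-11e4-a52e-4f735466cecf defined by the WebDriver specification. Here is a small Python sketch of reading it, using a made-up sample id.

```python
import json

# Web element identifier key defined by the W3C WebDriver specification.
W3C_ELEMENT_KEY = "element-6066-11e4-a52e-4f735466cecf"


def extract_element_id(response_body: str) -> str:
    """Return the element id from a find-element response body."""
    value = json.loads(response_body).get("value") or {}
    if W3C_ELEMENT_KEY not in value:
        raise ValueError("no element reference found in the response")
    return value[W3C_ELEMENT_KEY]


# Hypothetical response; the id itself is invented for illustration.
sample_response = json.dumps({
    "value": {W3C_ELEMENT_KEY: "3f2504e0-4f89-11d3-9a0c-0305e82c3301"}
})

element_id = extract_element_id(sample_response)
```

This id is the value that goes in place of :elementId in the click and value requests.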

A POST request clicks on the element, identified by sessionId and elementId.

curl --location 'http://localhost:9515/session/:sessionId/element/:elementId/click' \
--header 'Content-Type: application/json' \
--data '{}'

This simply clicks on the search text box, referring to it by the element id returned by the previous request.

A POST request enters a value in a text box, again using the path parameters sessionId and elementId.

curl --location 'http://localhost:9515/session/:sessionId/element/:elementId/value' \
--header 'Content-Type: application/json' \
--data '{
"text": "selenium"
}'

This curl types the text 'selenium' into the search text box, using the session id and element id returned in the earlier responses.

Another POST request locates an element; this time, however, it is located using XPath.

curl --location 'http://localhost:9515/session/:sessionId/element' \
--header 'Content-Type: application/json' \
--data-raw '{
"using": "xpath",
"value": "//a[@*='\''global_search'\'']"
}'

It returns a reference (elementId) to the link that is displayed underneath the search box as one types in it.

Another POST request clicks an element. Note the empty JSON object in the payload: the driver expects a JSON body on POST requests, even when the command itself takes no parameters.

curl --location 'http://localhost:9515/session/:sessionId/element/:elementId/click' \
--header 'Content-Type: application/json' \
--data '{}'

This curl clicks on the suggestion link underneath the search box.
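
To avoid forgetting the empty JSON body on parameterless commands such as click, the request builder can default to it. This is a minimal Python sketch using only the standard library; build_post is a name I made up for illustration, and the request is built but not sent.

```python
import json
import urllib.request
from typing import Optional


def build_post(url: str, payload: Optional[dict] = None) -> urllib.request.Request:
    """Build a POST request whose body is always a JSON object."""
    # Fall back to an empty object so the driver still receives valid JSON.
    body = json.dumps({} if payload is None else payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical ids, shown only to make the URL concrete.
click = build_post("http://localhost:9515/session/s1/element/e1/click")
typing = build_post(
    "http://localhost:9515/session/s1/element/e1/value", {"text": "selenium"}
)
```

Passing either request to urllib.request.urlopen would send it to a running driver.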

Additionally, a GET request takes a screenshot of the browser; it returns Base64-encoded PNG image data containing a screenshot of the initial viewport.

curl --location 'http://localhost:9515/session/:sessionId/screenshot'
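
Since the screenshot arrives as Base64 text inside the JSON response, it has to be decoded before it can be viewed. Here is a short Python sketch, assuming the image data sits under the top-level "value" key as in other W3C responses; the file name is an arbitrary choice.

```python
import base64
import json

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_screenshot(response_body: str) -> bytes:
    """Decode the Base64 screenshot in a response body into raw PNG bytes."""
    encoded = json.loads(response_body)["value"]
    image = base64.b64decode(encoded)
    if not image.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data does not start with the PNG signature")
    return image


def save_screenshot(response_body: str, path: str = "screenshot.png") -> None:
    """Write the decoded screenshot to disk."""
    with open(path, "wb") as handle:
        handle.write(decode_screenshot(response_body))
```

Feeding the body returned by the curl above into save_screenshot should produce a PNG file that opens in any image viewer.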

Finally, a DELETE request deletes the session created earlier. On a successful response, it closes the browser that was launched previously; the driver itself, however, keeps listening for requests until it is shut down.

curl --location --request DELETE 'http://localhost:9515/session/:sessionId'
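
If a script fails halfway through, the DELETE request may never be sent and the browser stays open. One way to guard against that is to wrap the session in a Python context manager, sketched below. It assumes the default ChromeDriver address; http_send and browser_session are illustrative names of mine, and the send function is injectable so the logic can be tried without a driver.

```python
import json
import urllib.request
from contextlib import contextmanager

# Assumption: ChromeDriver is running locally on its default port.
DRIVER_URL = "http://localhost:9515"


def http_send(method, url, payload=None):
    """Send one WebDriver command and return the 'value' of the JSON response."""
    # POST commands always carry a JSON object, even if it is empty.
    data = json.dumps(payload or {}).encode("utf-8") if method == "POST" else None
    request = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8")).get("value")


@contextmanager
def browser_session(send=http_send, base_url=DRIVER_URL):
    """Create a Chrome session, yield its base URL, and always delete it."""
    capabilities = {"capabilities": {"alwaysMatch": {"browserName": "chrome"}}}
    session_id = send("POST", f"{base_url}/session", capabilities)["sessionId"]
    session_url = f"{base_url}/session/{session_id}"
    try:
        yield session_url
    finally:
        # Runs even when the body raises, so the browser gets closed.
        send("DELETE", session_url)
```

With a driver running, a block such as `with browser_session() as session_url:` followed by `http_send("POST", f"{session_url}/url", {"url": "https://github.com/5v1988/qa-wdio-js"})` opens the page and closes the browser when the block exits.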

That's enough for now.

I believe this post gives some insight into the communication between automated tests and a browser, and I hope you find it helpful. Thank you for reading!

Veera.

Written by Veera.

I'm a Software QE professional with over 15 years of industry experience; https://www.linkedin.com/in/sw-tester
