Chapter 9: Working with the Selenium Framework

As highlighted in Chapter 3, Top Web Test Automation Frameworks, Selenium is one of the oldest test automation frameworks on the market. The framework is open source and supports many language bindings (Java, JavaScript, Python, and so on), and it is the base for many other leading frameworks in the marketplace such as WebdriverIO. Being W3C-compliant and based on the WebDriver protocol, this client-server framework allows developers to build test automation across all available browsers (desktop and mobile) and through its Grid tool, run in parallel and at scale. In this chapter, the reader will get a deep technical overview of the framework with a focus on its advanced capabilities, including support for CDP, relative locators, visual testing, cloud testing, support for Behavior-Driven Development (BDD) testing, and self-healing add-ons. The goal of the chapter is to help frontend developers enrich their test automation coverage with the more advanced capabilities of the framework, whether these are built-in features or plugins.

The chapter is designed to cover the following:

Understanding the Selenium framework and its components
The future of the Selenium framework

Technical requirements

The code files for this chapter can be found here: https://github.com/PacktPublishing/A-Frontend-Web-Developers-Guide-to-Testing.

Understanding the Selenium framework and its components

As explained in Chapter 3, Top Web Test Automation Frameworks, the Selenium framework (available at https://www.selenium.dev/) consists of three core pillars – Selenium WebDriver, Selenium IDE, and Selenium Grid (you can read more about the pillars here: https://www.selenium.dev/documentation/grid/getting_started/). In this chapter, we will focus only on WebDriver with the JavaScript language binding and on Grid, and leave Selenium IDE for Chapter 13, Complementing Code-Based Testing With Low-Code Test Automation.

Selenium WebDriver

With the release of Selenium 4, the latest release at the time of writing, the framework became fully W3C-compliant (https://www.w3.org/TR/webdriver1/). The richness of the WebDriver protocol enables developers to drive any possible action on a web application, running on all types of browsers.

To get started with Selenium WebDriver, simply install the node package through the following command:

npm install selenium-webdriver

Next, you need to install your relevant browser drivers (Chrome, Firefox, Safari, and so on) by following the Selenium documentation (https://www.selenium.dev/documentation/webdriver/getting_started/install_drivers/).

To start a browser session from your JavaScript code on an Edge browser, for example, add the following lines of code (for Firefox, simply replace 'MicrosoftEdge' in the following code with 'firefox'):

const {Builder} = require('selenium-webdriver');

const driver = new Builder().forBrowser('MicrosoftEdge').build();

Since we are looking to cover the more advanced capabilities of Selenium, let's start the Selenium Grid component. This step assumes that you have downloaded Grid from the Selenium website (https://www.selenium.dev/downloads/).

To start Selenium Grid, simply run this command:

java -jar .\selenium-server-4.1.1.jar standalone

The preceding command refers to the version of Grid that you've downloaded and the path to it.

At this stage, you have successfully installed Selenium for the JavaScript WebDriver package and Selenium Grid and run the Grid command. If everything is okay, when navigating from your local browser to http://localhost:4444, you should see something like the following screenshot:

Figure 9.1 – The local Selenium Grid home page within the local browser

What we will do now is run a simple JavaScript code that navigates to the Packt website and searches for a specific book called UI Testing with Puppeteer, and then validate that the book home page opens successfully:

const {Builder, By} = require('selenium-webdriver');

(async function helloSelenium() {
  // Ask the local Grid for an Edge session
  const driver = await new Builder()
    .forBrowser('MicrosoftEdge')
    .usingServer('http://localhost:4444/wd/hub')
    .build();
  try {
    await driver.get('https://www.packtpub.com');
    await driver.getTitle(); // => "Packt"
    // Locate the search box and the search button
    const searchBox = await driver.findElement(By.name('q'));
    const searchButton = await driver.findElement(
      By.className('magnifying-glass'));
    await searchBox.click();
    await searchBox.sendKeys('UI Testing with Puppeteer');
    await searchButton.click();
    // Print the title of the page we landed on
    const title = await driver.getTitle();
    console.log('The title is: ' + title);
  } finally {
    // Always end the session, even if a step fails
    await driver.quit();
  }
})();

To better understand the preceding code, here are some of the main steps:

Request within the local Selenium Grid ('http://localhost:4444/wd/hub') to open a WebDriver connection in an Edge browser ('MicrosoftEdge').
Specify the website URL to navigate to. In our case, we are navigating to the Packt website home page.
Define two key element locators (objects) for the test to interact with on the web page. In this case, we are defining the Packt website searchBox object and the searchButton object to click. Note that we are identifying the objects by name and className.
Input a book title in the search box (in our case, we selected 'UI Testing with Puppeteer'), and then click on the search button to navigate to that book's home page.
Lastly, output to the console the page title so that you can see that you've successfully reached the preceding book's home page.
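
The last step reads the page title right after the click, so it can run before the results page has finished loading. The following is a minimal sketch of an explicit wait, not part of the chapter's test1.js. The helper name waitForTitle is hypothetical, and the assumption that the target page title contains the searched text is ours. It relies only on driver.wait and driver.getTitle; the built-in until.titleContains condition could be passed to driver.wait instead of the custom function.

```javascript
// Hypothetical helper: polls the page title until it contains `text`.
// driver.wait() rejects with the given message if `timeoutMs` elapses first.
async function waitForTitle(driver, text, timeoutMs = 10000) {
  await driver.wait(
    async () => (await driver.getTitle()).includes(text),
    timeoutMs,
    `Title did not contain "${text}" within ${timeoutMs} ms`
  );
  return driver.getTitle();
}

// Usage inside the test, after searchButton.click():
//   const title = await waitForTitle(driver, 'Puppeteer');
//   console.log('The title is: ' + title);
```

Keeping the wait in a small helper means every test that navigates can reuse it instead of adding fixed sleeps.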

To run the preceding code from an IDE such as Visual Studio Code, simply use the following command:

node .\tests\test1.js

In this case, I named the JavaScript Selenium test test1.js, which resides in a subfolder called tests.

Assuming there are no environment issues, when running the preceding command, you will get a local Edge browser to launch and run the preceding steps in headed (with the browser UI) mode, and the following output will be shown in the IDE console:

Figure 9.2 – The IDE console output of the page title assertion from the preceding code sample

In the preceding example, we ran Grid in standalone mode with only one browser; however, we could also have run Grid with the role of hub. In that mode, the hub listens on port 4444 and distributes tests in parallel across all the registered nodes. To scale to multiple nodes with multiple browser versions when you run the preceding test1.js, each node can be started with a JSON file that holds its configuration. The following command uses the legacy Grid 3 syntax:

java -Dwebdriver.chrome.driver=chromedriver.exe -jar selenium-server-standalone.jar -role node -nodeConfig node1Config.json

To learn more about configuring JSON-based Selenium Grid nodes, please refer to the following documentation: https://www.selenium.dev/documentation/legacy/grid_3/setting_up_your_own_grid/.
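
Once a hub with several nodes is available, the same test can be fanned out to several browsers at once from the client side. The following is a rough sketch under stated assumptions: runOnBrowsers is a hypothetical helper, the driver factory is passed in rather than hardcoded, and the browser names in the usage comment only work if matching nodes are registered on your Grid.

```javascript
// Hypothetical helper: runs `testFn` once per browser name, in parallel.
// `createDriver` is injected so the same helper works against a local Grid,
// a cloud Grid, or a stub in unit tests.
async function runOnBrowsers(browserNames, createDriver, testFn) {
  const settled = await Promise.allSettled(
    browserNames.map(async (name) => {
      const driver = await createDriver(name);
      try {
        return await testFn(driver, name);
      } finally {
        await driver.quit(); // always release the Grid slot
      }
    })
  );
  // One summary entry per browser, in the order the names were given
  return settled.map((outcome, i) => ({
    browser: browserNames[i],
    passed: outcome.status === 'fulfilled',
    detail: outcome.status === 'fulfilled'
      ? outcome.value
      : String(outcome.reason),
  }));
}

// Usage against the local Grid from this chapter:
//   const {Builder} = require('selenium-webdriver');
//   const create = (name) => new Builder()
//     .forBrowser(name)
//     .usingServer('http://localhost:4444/wd/hub')
//     .build();
//   const report = await runOnBrowsers(
//     ['MicrosoftEdge', 'chrome', 'firefox'], create,
//     async (driver) => {
//       await driver.get('https://www.packtpub.com');
//       return driver.getTitle();
//     });
```

Promise.allSettled is used so that a failure in one browser does not hide the results from the others.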

Now that we have set up Selenium locally, launched Selenium Grid, and run a nice JavaScript Selenium test, let's highlight the most useful and advanced features of Selenium 4.

The advanced features of Selenium

Selenium, as mentioned throughout this book, is a powerful and mature test automation framework, and as we learned in the preceding test code, with very little programming effort, you can perform many actions on a frontend web application. Modern websites are written with powerful web application frameworks, as described in Chapter 5, Introducing the Leading Frontend Web Development Frameworks, which enable developers to enrich their websites and add complex logic and components to their pages. Knowing the full power of a test automation framework such as Selenium helps you automate these advanced components and test them with wide coverage.

Let's start covering a newly added feature of Selenium 4 called relative locators.

Selenium relative locators

In previous versions of Selenium, it was quite challenging to identify an element on a web page, especially if there were other elements that are similar or when the web page was too crowded with UI elements. For that purpose, the Selenium community implemented a new feature that allows test developers to clearly specify an element location and a name relative to other elements on that page.

If we look at the Packt website under test that we covered in the previous test code, we will see that there are three buttons on the upper viewport of that page. While each of these elements has unique text that describes them, all three buttons have the same className identifier ('subscribe_cta'). With Selenium 4, we can be very accurate in identifying each of them by utilizing relative locators:

Figure 9.3 – A Packt home page screenshot used for an example of multiple close elements

To identify the middle button in the preceding screenshot, the one with the Essential Bundles text, we can state that this button is to the right of the Enter the SALE button (locateWith is imported from the selenium-webdriver package alongside By):

const {By, locateWith} = require('selenium-webdriver');

// The first element with this class is the Enter the SALE button
let enterSale = await driver.findElement(
  By.className('subscribe_cta'));

let essentialBundles = await driver.findElement(
  locateWith(By.className('subscribe_cta')).toRightOf(enterSale));

Selenium 4 provides five relative locator options:

above
below
toLeftOf
toRightOf
near

Documentation on all of these locators is available here: https://www.selenium.dev/documentation/webdriver/elements/locators/#relative-locators.
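
To show how the five options share one calling pattern, here is a small sketch. The helper findRelative is hypothetical; the selenium parameter defaults to the real package and exists only so that the helper can be exercised with a stub. It assumes the selenium-webdriver 4 package, which exports By and locateWith.

```javascript
// The five relative locator options named above
const RELATIONS = ['above', 'below', 'toLeftOf', 'toRightOf', 'near'];

// Hypothetical helper: finds every element with `className` that stands in
// the given spatial `relation` to the `anchor` element.
async function findRelative(driver, className, relation, anchor,
                            selenium = require('selenium-webdriver')) {
  if (!RELATIONS.includes(relation)) {
    throw new TypeError(`Unknown relation: ${relation}`);
  }
  const {By, locateWith} = selenium;
  // For example: locateWith(By.className('x')).toRightOf(anchor)
  const locator = locateWith(By.className(className))[relation](anchor);
  return driver.findElements(locator);
}

// Usage with the Packt buttons from the example above:
//   const enterSale = await driver.findElement(By.className('subscribe_cta'));
//   const others = await findRelative(
//     driver, 'subscribe_cta', 'toRightOf', enterSale);
```

Because findElements is used, the helper returns all matches, which is useful when more than one sibling sits on the same side of the anchor.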

The Selenium Chrome debugging protocol

While we mentioned CDP in the context of the Playwright and Puppeteer frameworks, for Selenium, this is a new feature that was introduced with Selenium 4. With Selenium 4, frontend web application developers can utilize the DevTools interface (you can learn more here: https://www.selenium.dev/documentation/webdriver/bidirectional/chrome_devtools/) to connect with the CDP and use features such as network control, geolocation emulation, performance, accessibility, a profiler, and an application cache.

Through the newly added CDP connection option (driver.createCDPConnection('page')), the preceding capabilities can be used.

We will not expand on all of the CDP's available APIs but will give a simple example from the Selenium community of using the CDP to set a specific geographical location through a test.

In the following code snippet, we navigate to a free geolocation website (https://my-location.org) that is offered as a reference to identify, via latitude and longitude coordinates, the exact location:

await driver.get("https://my-location.org/");
const pageCdpConnection = await driver.createCDPConnection('page');
// Latitude and longitude of Tokyo, Japan
const coordinates = {
  latitude: 35.689487,
  longitude: 139.691706,
  accuracy: 100,
};
await pageCdpConnection.execute(
  "Emulation.setGeolocationOverride",
  coordinates
);

Within the preceding code sample, we navigate to the website as if we were based in Tokyo, Japan, by providing the geo-coordinates and executing the command through the CDP connection. Such a testing capability is important, since many websites are location-aware by design. Hence, based on the end user location, a specific output will be displayed and, sometimes, even with a language relevant to that location.
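
A page usually reads the location while it loads, so it can help to apply the override before navigating. The sketch below is an alternative to the snippet above, not the chapter's own method: it assumes a Chromium-based driver (Chrome or Edge), which exposes sendDevToolsCommand, and the helper name setGeolocation and its range checks are our own additions.

```javascript
// Hypothetical helper: validates coordinates, then applies the override.
// Assumes `driver` is a Chromium-based driver (Chrome or Edge) that exposes
// sendDevToolsCommand(); other drivers do not provide this method.
async function setGeolocation(driver, {latitude, longitude, accuracy = 100}) {
  if (!(latitude >= -90 && latitude <= 90)) {
    throw new RangeError(`latitude out of range: ${latitude}`);
  }
  if (!(longitude >= -180 && longitude <= 180)) {
    throw new RangeError(`longitude out of range: ${longitude}`);
  }
  await driver.sendDevToolsCommand('Emulation.setGeolocationOverride', {
    latitude,
    longitude,
    accuracy,
  });
}

// Usage: override first, then load the location-aware page.
//   await setGeolocation(driver, {latitude: 35.689487, longitude: 139.691706});
//   await driver.get('https://my-location.org/');
```

Validating the numbers on the client side gives a clearer error than a rejected DevTools command would.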

The CDP within Selenium 4 is a powerful capability with many features and is very important to know about and use within the testing suites.

Selenium multi windows and tab management

Selenium 4 also enriches the testing of complex websites with multiple tabs and windows (you can learn more here: https://www.selenium.dev/documentation/webdriver/browser/windows/). Within any traditional website, there are various menus that open new windows and tabs that should be tested in an automated fashion. Unlike frameworks such as Cypress, Selenium provides a good method of testing multiple tabs, including new windows. To perform tests on a web application when you need to switch between windows or tabs, Selenium provides APIs, including driver.getWindowHandle(), driver.getAllWindowHandles(), and driver.switchTo().newWindow('tab'), to open a new window or tab, switch between windows and tabs, close the active window or tab, and so on.
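
The following sketch strings these window calls together. The helper name withNewTab is hypothetical; it assumes a Selenium 4 driver, opens a tab, runs a caller-supplied action in it, and then restores focus to the original tab.

```javascript
// Hypothetical helper: runs `action` in a fresh tab and then returns focus
// to the tab that was active before the call.
async function withNewTab(driver, url, action) {
  const original = await driver.getWindowHandle();
  await driver.switchTo().newWindow('tab'); // opens and focuses a new tab
  try {
    await driver.get(url);
    return await action(driver);
  } finally {
    await driver.close();                     // closes the new tab
    await driver.switchTo().window(original); // back to the first tab
  }
}

// Usage:
//   const title = await withNewTab(
//     driver, 'https://www.selenium.dev/', (d) => d.getTitle());
```

Storing the original handle up front matters because, after close(), the driver has no active window until switchTo().window() is called.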

Selenium Actions APIs – support for mouse and keyboard events

With Selenium, frontend developers can utilize mouse and keyboard events to perform actions (https://www.selenium.dev/documentation/webdriver/actions_api/) on given web pages within the web application under test. The ability to send text strings to a web element such as a textbox, as well as to perform an Enter key press on a keyboard through Selenium, isn't new but is very useful:

await driver.findElement(By.name('q')).sendKeys(

'webdriver', Key.ENTER);

In addition, Selenium also offers a wide range of mouse events such as clickAndHold, doubleClick, and dragAndDrop. In the following line of code, where actions is the object returned by driver.actions(), Selenium drags the source element (sourceEle) and drops it onto the target element (targetEle):

await actions.dragAndDrop(sourceEle, targetEle).perform();
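
For context, here is a short sketch of where the actions object comes from and how several mouse events can be chained before a single perform() call. The helper name dragThenDoubleClick is hypothetical, and the elements are assumed to have been located already with findElement.

```javascript
// Hypothetical helper: drags `source` onto `target`, then double-clicks
// the target, as one chained action sequence.
async function dragThenDoubleClick(driver, source, target) {
  // driver.actions() returns a builder; nothing is sent to the browser
  // until perform() is called.
  const actions = driver.actions({async: true});
  await actions
    .dragAndDrop(source, target)
    .doubleClick(target)
    .perform();
}

// Usage:
//   const sourceEle = await driver.findElement(By.id('draggable'));
//   const targetEle = await driver.findElement(By.id('droppable'));
//   await dragThenDoubleClick(driver, sourceEle, targetEle);
```

The element IDs in the usage comment are placeholders for whatever your page under test uses.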

Self-healing scripts

While this section won't look at Artificial Intelligence (AI), Machine Learning (ML), or low code within testing, the community has built a very interesting framework on top of Selenium to stabilize test code and, whenever possible, fix the test executions to reduce the level of brittleness. This project is called Healenium (https://healenium.io/), and its value proposition is to improve Selenium test cases' stability and better handle dynamic changes to web elements. With a nice set of documentation (available at https://github.com/healenium/healenium-example-maven), code samples, and IDE plugins (such as for IntelliJ IDEA), this framework is a great add-on to any Selenium project. The following is an example of how Healenium works:

Figure 9.4 – An example of how Healenium updates broken locators within IntelliJ IDEA (source – https://github.com/healenium/healenium-example-maven/blob/master/img_4.png)

Selenium Grid in the cloud

While this is not unique to the Selenium project, teams can leverage cloud providers and run their Selenium test code in the cloud at scale and in parallel without worrying about setting up and maintaining a local Grid.

Sauce Labs, Perfecto, and BrowserStack offer a robust cloud-based Selenium Grid that covers all different browser/OS combinations across geographies, so frontend developers and testers can scale up their test executions and reduce the amount of time a test cycle takes compared to when running it locally:

Figure 9.5 – Selenium Grid in the Perfecto cloud with support for all web and mobile combinations

Various testing methods with Selenium

In this section, we are going to cover a few testing types that are supported by the Selenium framework. We will cover testing that includes BDD with Cucumber, accessibility testing, and visual testing.

BDD testing with Selenium

In this section, we will not provide a thorough deep dive into BDD. However, it is important to understand that BDD can be used with Selenium easily. In the context of agile testing practices, developers who are building software through the BDD method create the test scenarios in Gherkin, which follows the built-in keyword-driven syntax based on GIVEN, WHEN, and THEN. In the following screenshot, we can see an example of a Cucumber test scenario in Gherkin:

Figure 9.6 – A Cucumber test scenario in Gherkin with Selenium-based step definition methods

In the preceding screenshot, we can see a full Gherkin-based test scenario with the annotation name of WebDD. The scenario simply navigates to a Google search page and searches for two search keys that are provided within a table (a data-driven test). A data-driven test within Cucumber is defined using the Examples keyword; in this scenario, two data inputs for the test are listed under the Examples block.

On the right side of the preceding screenshot, there is a simple Selenium code, in Java, that through the WebDriver protocol navigates to the Google home page and clicks on the search button on the page. Basically, with Selenium and BDD (Cucumber), frontend developers and testers can build all possible testing scenarios and run them through Grid or the cloud providers.

Creating test scenarios with the underlying step definitions in Selenium and JavaScript is a great way to create test automation. BDD is all about putting developers, testers, and business-facing staff on the same page through clear product scenarios that are written in the form of a user story, with a functional test that validates it. A few years ago, I delivered a deep workshop on testing with BDD, and you can find some interesting insights in the guide I created: https://www.slideshare.net/ek121268/mastering-bdd-eran-kinsbruner-workshop-quest-2018.
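
Since the screenshot shows Java step definitions, here is a hedged JavaScript counterpart. It is a sketch, not the chapter's project: registerSearchSteps is a hypothetical function, the step texts and the q field name are illustrative, and in a real step file the first argument would be the @cucumber/cucumber module. The feature it targets is shown in the leading comment.

```javascript
// Feature this sketch is written against (Gherkin):
//   Scenario Outline: Search for a term
//     Given I open the search page
//     When I search for "<term>"
//     Then the page title contains "<term>"
//     Examples:
//       | term     |
//       | selenium |
//       | cucumber |

const ENTER = '\uE007'; // same value as Key.ENTER in selenium-webdriver

// Hypothetical registration function. `cucumber` supplies Given/When/Then/After,
// `createDriver` returns a WebDriver, and `baseUrl` is the page to open.
function registerSearchSteps(cucumber, createDriver, baseUrl) {
  const {Given, When, Then, After} = cucumber;

  Given('I open the search page', async function () {
    this.driver = await createDriver(); // `this` is the Cucumber World
    await this.driver.get(baseUrl);
  });

  When('I search for {string}', async function (term) {
    const box = await this.driver.findElement({name: 'q'});
    await box.sendKeys(term, ENTER);
  });

  Then('the page title contains {string}', async function (term) {
    const title = await this.driver.getTitle();
    if (!title.includes(term)) {
      throw new Error(`Expected title "${title}" to contain "${term}"`);
    }
  });

  After(async function () {
    if (this.driver) await this.driver.quit();
  });
}

// In a real step file:
//   registerSearchSteps(require('@cucumber/cucumber'),
//     () => new Builder().forBrowser('MicrosoftEdge').build(),
//     'https://www.google.com');
```

Regular functions are used instead of arrow functions because Cucumber binds the shared World object to this in each step.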

Visual testing with Selenium

As highlighted briefly in Chapter 7, Core Capabilities of the Leading JavaScript Test Automation Frameworks, practitioners can grab screenshots for basic visual assertions with Selenium's built-in functions, as well as utilize some of the tools and frameworks out there, which include Storybook, Galen, and Percy. To conduct advanced visual testing with Selenium, leverage AI and ML capabilities, and increase testing scale, it is also possible to integrate the Applitools Eyes SDK into Selenium across its language bindings, generate baselines, and perform visual assertions at a much higher quality. Figure 9.7 shows an Applitools visual test result, which we can further analyze:

Figure 9.7 – The Applitools visual test results within their web-based dashboard

From the preceding visual, if we focus on the Unresolved test case titled 2/2 App Window, the solution will be able to spot visual differences between that screen and the saved baseline.

Figure 9.8 shows the analysis of the differences, which allows a practitioner to either waive these differences as not an issue or report them as a regression bug:

Figure 9.8 – A deep analysis of an unresolved test case with the differences highlighted in pink

To get started with Applitools and Selenium with JavaScript language binding, follow the simple steps in the documentation: https://applitools.com/tutorials/selenium-javascript.html.

Basically, you need to create an account and obtain an Applitools API Key, and then install the node package by running the following command:

npm install @applitools/eyes-selenium --save-dev

The SDK offers a few easy APIs to visually analyze the web applications under test. Developers can use eyes.open();, eyes.check();, and eyes.close(); within their Selenium test code and report all captures onto the cloud-based dashboard. There is a ready-to-use open source project offered by Applitools on GitHub (available at https://github.com/applitools/tutorial-selenium-javascript-basic) that can be used to get started with the SDK.
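
As a rough sketch of how the three calls fit around a Selenium step, consider the following. The helper visualCheck is hypothetical, the eyes and target objects are passed in rather than created here, and the use of eyes.abort() for cleanup is an assumption that should be checked against the SDK version you install.

```javascript
// Hypothetical helper: one visual checkpoint for one page.
// `eyes` and `target` are injected; in a real test they would be an Eyes
// instance and Target.window() from @applitools/eyes-selenium.
async function visualCheck({eyes, target, driver, appName, testName, url, tag}) {
  await eyes.open(driver, appName, testName);
  try {
    await driver.get(url);
    await eyes.check(tag, target); // capture the page, compare with baseline
    return await eyes.close();     // ends the test and returns its results
  } catch (err) {
    // Assumption: eyes.abort() discards a test that did not finish cleanly.
    await eyes.abort();
    throw err;
  }
}

// Usage (names from the Applitools SDK; the API key comes from your account):
//   const {Eyes, Target} = require('@applitools/eyes-selenium');
//   const eyes = new Eyes();
//   eyes.setApiKey(process.env.APPLITOOLS_API_KEY);
//   await visualCheck({eyes, target: Target.window(), driver,
//     appName: 'Packt', testName: 'Home page', url: 'https://www.packtpub.com',
//     tag: 'Home'});
```

The first run of such a test creates the baseline; later runs are compared against it in the dashboard.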

Accessibility testing with Selenium

As highlighted in Chapter 7, Core Capabilities of the Leading JavaScript Test Automation Frameworks, all leading test automation frameworks can easily integrate with the leading axe accessibility SDK provided by Deque and create accessibility tests within their functional test code.

A very useful GitHub repository is offered by Deque with access to the accessibility engine (axe-core), code samples, and so on at https://github.com/dequelabs.

To install the engine, simply run the following command in the folder where you have the Selenium project:

npm install axe-core --save-dev

In addition to the preceding installation, you will need to specify within your JavaScript test code the path to the axe accessibility spec file (in the following code example, the file is axe.min.js):

const {Builder, By} = require('selenium-webdriver');
const fs = require('fs');

(async function helloSelenium() {
  const driver = await new Builder()
    .forBrowser('MicrosoftEdge')
    .usingServer('http://192.168.1.157:4444/wd/hub')
    .build();
  try {
    await driver.get('https://www.packtpub.com');
    // Inject the axe-core engine into the page under test
    const data = fs.readFileSync(
      'node_modules/axe-core/axe.min.js', 'utf8');
    await driver.executeScript(data);
    // Run the accessibility checks and pass the results back to Node
    const result = await driver.executeAsyncScript(
      'var callback = arguments[arguments.length - 1];' +
      'axe.run().then(results => callback(results));');
    fs.writeFileSync('tests/report.json', JSON.stringify(result));
    await driver.getTitle(); // => "Packt"
    const searchBox = await driver.findElement(By.name('q'));
    const searchButton = await driver.findElement(
      By.className('magnifying-glass'));
    await searchBox.click();
    await searchBox.sendKeys('UI Testing with Puppeteer');
    await searchButton.click();
    const title = await driver.getTitle();
    console.log('The title is: ' + title);
  } finally {
    // Always end the session, even if a step fails
    await driver.quit();
  }
})();

This spec file holds the relevant accessibility checks, based on which the test will run and report the results back to the user. If we take the earlier test1.js source code and add the lines that read axe.min.js, inject it into the page with executeScript, call axe.run(), and write the results to disk, the test will not only open the Packt website and search for the UI Testing with Puppeteer book but will also perform an accessibility check via the axe.run() method and store the entire results in a report.json file.

After the preceding test completes and a report.json file is generated, it will include a set of arrays broken down into the violations, passes, incomplete, and inapplicable results, as follows:

Figure 9.9 – A sample accessibility violation, detected by running Selenium with the axe SDK on the Packt website
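
The raw report can be long, so a short summary is often easier to act on. The sketch below is our own addition: summarizeViolations is a hypothetical helper that reads the violations array of an axe results object (each entry has id, impact, help, and nodes fields) and orders the entries from most to least severe.

```javascript
// Lower rank means more severe; unknown or missing impacts sort last.
const IMPACT_RANK = {critical: 0, serious: 1, moderate: 2, minor: 3};

function rank(impact) {
  return impact in IMPACT_RANK ? IMPACT_RANK[impact] : 4;
}

// Hypothetical helper: condenses axe results into one line per violated rule.
function summarizeViolations(axeResults) {
  const violations = axeResults.violations || [];
  return violations
    .map(({id, impact, help, nodes = []}) => ({
      id,
      impact,
      help,
      affectedNodes: nodes.length, // how many elements broke this rule
    }))
    .sort((a, b) => rank(a.impact) - rank(b.impact));
}

// Usage with the report written by the test above:
//   const fs = require('fs');
//   const results = JSON.parse(fs.readFileSync('tests/report.json', 'utf8'));
//   console.table(summarizeViolations(results));
```

A summary like this can also be used to fail a build, for example when any critical violation is present.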

The entire preceding project is also available in my GitHub repository, which you can clone and build from here: https://github.com/PacktPublishing/A-Frontend-Web-Developers-Guide-to-Testing/tree/master/Selenium_examples.

Upgrading your Selenium code to Selenium 4

This is not an advanced feature of Selenium. However, to enjoy the new features of Selenium 4, you will need to upgrade to the latest version. With any such upgrade, you will need to ensure that your test code is W3C-compliant and adheres to the new syntax of the framework. Here, we will not provide a complete guide to migrating from Selenium 3 and below to Selenium 4. If you need information on the complete set of changes to consider as part of the migration, please refer to this link: https://www.selenium.dev/documentation/webdriver/getting_started/upgrade_to_selenium_4/. One example of an affected area is the most used feature of Selenium, the findElement method, which is used to identify an object within a web application and perform any kind of action on it. Within the preceding link, you will also find a consolidated list of the changes that need to be made to your Maven and Gradle dependencies and to the desired capabilities, such as platform, browserName, and others. In addition, the guide covers the changes you need to make to your waits and timeout usage, and the deprecation of old APIs that are no longer supported.

With the preceding overview, we can wrap up the highlights of Selenium framework capabilities. Selenium is, of course, richer than just the preceding features and includes the Page Object Model (POM) design pattern (you can learn more about POM here: https://learn-automation.com/page-object-model-using-selenium-webdriver/), different waiting methods, and other useful APIs that can be learned and used outside of this chapter.
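
To give a flavor of the POM pattern mentioned above, here is a minimal sketch built from the locators used earlier in this chapter. The class name SearchPage is hypothetical, and the locators are written in the plain object form that findElement accepts, so the class does not need to import By.

```javascript
// Hypothetical page object for the Packt search flow used in this chapter.
class SearchPage {
  constructor(driver, baseUrl = 'https://www.packtpub.com') {
    this.driver = driver;
    this.baseUrl = baseUrl;
  }

  async open() {
    await this.driver.get(this.baseUrl);
    return this;
  }

  // Types `text` into the search box, submits, and returns the new title.
  async searchFor(text) {
    const box = await this.driver.findElement(SearchPage.locators.searchBox);
    await box.sendKeys(text);
    const button = await this.driver.findElement(
      SearchPage.locators.searchButton);
    await button.click();
    return this.driver.getTitle();
  }
}

// Locators live in one place, so a UI change needs a single edit.
SearchPage.locators = {
  searchBox: {name: 'q'},
  searchButton: {className: 'magnifying-glass'},
};

// Usage:
//   const page = await new SearchPage(driver).open();
//   const title = await page.searchFor('UI Testing with Puppeteer');
```

Tests written against SearchPage describe intent (open, search) and stay unchanged when the underlying element locators move.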

Now that we have covered both the Selenium project components, and its core and important features that we need to be familiar with, let's explore the future of the Selenium project and how it will stack up against other advanced frameworks and the rise of AI and low-code technologies.

The future of the Selenium framework

While Selenium 4 marks a phenomenal milestone for this framework and cross-browser testing technology, as highlighted in the book, frontend developers have other options and competitive frameworks to choose from. To remain relevant and shine as it has over the many years since it was launched, Selenium and its community need to think about the future of modern web applications such as PWAs, Flutter, and React Native.

In the age of intelligent testing and analysis, with digital apps also becoming more complex and demanding, test automation frameworks including Selenium and others must also become richer and more capable. In Selenium 4, the community launched a modified version of Selenium IDE that records all the user actions in a browser, including all the web elements, and can export the recorded script into code. Projects such as Healenium, which was covered in this chapter, should not need to exist as separate add-ons; instead, their abilities should be part of the Selenium core project.

In future releases, such tools should be able to perform more complex test creation activities, generate reports, ensure no flakiness in the resultant script, and much more. With Cypress's experimental project called Cypress Studio, the Cypress team is already aiming higher in its test recording technology.

The core pillars of a futuristic test automation framework such as Selenium should also serve all of the roles that do test automation. Developers, SDETs, and manual testers should find it easier to set up and work with the test framework. At the time of writing, Selenium isn't considered the easiest test automation framework to ramp up on compared with Cypress and Playwright.

Selenium, as a great multichannel framework that can support both web platforms and mobile ones, should continue evolving its APIs and capabilities and run its roadmap in parallel with the Appium tools so that it remains a unique offering for such application types.

To conclude this section, the hope of practitioners that use Selenium is to have a more advanced AI-based, self-healing, and very capable testing technology that can ease the ramping up, maintenance, and analysis of test runs across web and mobile apps of any kind.

Summary

In this chapter, we started by providing a recap of the Selenium project core pillars and how to get started with the basic Selenium framework. We then zoomed in and went deeper into the core features and abilities of the Selenium test automation framework. We highlighted the features with a practical example on how to get started and use these features, as well as providing some ready-to-use code samples that can help you build an advanced testing project for your web application.

We also offered a more futuristic vision for such a test automation framework, looking at desirable capabilities that practitioners are lacking today and could find useful going forward.

That concludes this chapter! In the following chapter, we will do the exact same analysis as we did for Selenium but for the Cypress test automation framework.
