Chapter 4: Interacting with a page

Thanks to Chapter 3, Navigating through a website, we now know how to open a browser and all the different options we have to launch browsers and create new pages. We also know how to navigate through other pages. We learned about HTTP responses and how they are related to a request.

This chapter is about interaction. Emulating user interaction is essential in UI testing. There is one pattern in unit testing called Arrange-Act-Assert (AAA). This pattern enforces a particular order in the test code:

  • Arrange – Prepare the context.
  • Act – Interact with the page.
  • Assert – Check the page reaction.
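As a rough illustration of that order, here is a sketch of a test body split into the three steps. The page argument stands for a Puppeteer page. The function name, the /login path, and the #email, #password, #login-button, and .welcome selectors are placeholders made up for this example, not taken from the sample site.

```javascript
// Sketch of an Arrange-Act-Assert test body. `page` is expected to behave
// like a Puppeteer Page; the URL path and all selectors are placeholders.
async function loginShowsWelcome(page, baseUrl, user, password) {
  // Arrange: put the browser in the state the test needs.
  await page.goto(`${baseUrl}/login`);

  // Act: do what the user would do.
  await page.type('#email', user);
  await page.type('#password', password);
  await page.click('#login-button');

  // Assert: check how the page reacted.
  const welcome = await page.$('.welcome');
  if (welcome === null) {
    throw new Error('Expected a welcome element after logging in');
  }
}
```

A real test would usually also wait for the navigation or for the element to appear before asserting, and would use the assertion library of the test runner instead of throwing by hand.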

In this chapter, we will learn how to find elements on a page. We will understand how the development team can improve their HTML so that you can easily find elements. But if you cannot change the page HTML, we will also look at another set of tools to find the elements we need.

Once we find an element, we will want to interact with it. Puppeteer provides two sets of APIs: One is action functions, such as click, select, or type. Then we have a set of emulation functions, such as mouse events or keyboard emulation. We will cover all those functions.

This chapter will introduce a new object we haven't mentioned yet: The element handle.

By the end of this chapter, we will have added another tool to our toolbox: The Visual Studio Code debugging tools.

We will cover the following topics in this chapter:

  • Introduction to HTML, the DOM, and CSS
  • Finding elements
  • Finding elements using XPath
  • Interacting with elements
  • Keyboard and mouse emulation
  • Interacting with multiple frames
  • Debugging tests with Visual Studio Code

By the end of this chapter, you will be able to emulate most types of user interaction. But first, we need to lay the groundwork. Let's talk about HTML, the Document Object Model (DOM), and CSS.

Technical requirements

You will find all the code of this chapter on the GitHub repository (https://github.com/PacktPublishing/UI-Testing-with-Puppeteer) under the Chapter4 directory. Remember to run npm install on that directory and then go to the Chapter4/vuejs-firebase-shopping-cart directory and run npm install again.

If you want to implement the code while following this chapter, you can start from the code you left in the Chapter3 directory.

Introduction to HTML, the DOM, and CSS

You won't be able to find elements if you don't know CSS, and you won't understand CSS if you don't understand the DOM and HTML. So, we need to start with the basics.

I bet you've heard that you can build a site with HTML, CSS, and JavaScript. You might be using different server-side technologies. Your frontend might be implemented using cool technologies such as React or Angular. But in the end, the result will be a page based on HTML, CSS, and JavaScript.

HTML is the page's content. If you go to any website, open the DevTools, and click on the Elements tab, you will see the content of the page. You will see the page's title. If it's a news site, you will see all the articles there. If you visit a blog post, you will see the text of that post.

Without CSS, an HTML page would look like text written in Notepad. CSS not only brings color and fonts, but it's also the scaffolding that gives structure to a page.

Fun fact

Firefox has a built-in tool to disable all the styles on a page. If you go to View | Page Style and click on No Style, you will see how our life would be without CSS.

The last piece is JavaScript. JavaScript brings behavior to a page. Once the browser parses the HTML and builds the DOM, it allows us to manipulate and give life to a page.

But, as I mentioned before, we need to go to the basics, to the foundations of the web. Let's begin with HTML.

HTML

HTML stands for HyperText Markup Language: HyperText because the HTML is not content per se; HTML contains the content. Markup because it uses tags to give meaning to that content. And language because, although many developers disagree and they get mad about the idea, HTML is a language.

If we read an HTML file as a data structure, we can say that HTML is a relaxed version of XML. So, to better understand HTML, we need to look at the basics of XML.

These are the basic elements of XML content:

XML Content

If you look at this figure, you already know almost everything you need to know about XML. Well, maybe I'm exaggerating. But this is the idea:

  • You have elements, which are represented as <ElementName>. In our example, we have <element> and <child-element>.
  • The element might have attributes, which are represented as AttributeName="AttributeValue". We have value="3" and value="4" in our example.
  • The element might contain other elements. You can see we have two child-element elements inside the main element.
  • An element finishes (is closed) with </ElementName>, or with /> at the end instead of >.

XML parsers are very strict with these rules. If the XML content you are trying to parse breaks just a single rule, the parser will consider the entire XML invalid. Whether it's a missing closing element or an attribute without quotes, the parser will fail to evaluate the XML content.

But we will find that browsers are not that strict when parsing HTML content. Let's take a look at the following HTML:

A broken HTML

This simple HTML will print Hello World in red in the browser.

Is this valid XML? No. As you can see, the <div> element is not closed. But is this valid HTML? Yes.

Important Note

The fact that a browser would try to render broken HTML doesn't mean that you should take that lightly. It's possible you have heard a developer say that a particular bug was due to a missing closing div. If the HTML is broken, for instance, it has a missing closing div, the browser will try to guess the best way to render that HTML. The decision the browser makes when trying to fix broken HTML could end up with the page working as expected or with the full page layout broken.

Another interesting concept is that the XML specification doesn't give meaning to the elements. The names of the elements, the attributes, and the resulting information coming from that content depend on who wrote the XML and who is reading it.

HTML is XML with meaning. In 1993, Tim Berners-Lee, who is known as the inventor of the World Wide Web, decided that the main element would be called HTML and that it would contain a BODY. He decided that images would be represented as IMG elements, paragraphs would be P elements, and so on. Over the years, browser and web developers followed and improved this convention, getting to what we today call HTML5. We, as a community, agreed on the meaning of HTML elements.

We agreed that if we add the text attribute with the value red, we will get the text in red, and so on. How many types of elements do we have in HTML? A lot! The good news is that you don't need to know all of them.

The more you know, the more productive you will be. However, these are the most common elements you will find on a page.

Document structure elements

Every HTML document is contained inside an <html> element, which has two child elements. The first one is <head>. Inside <head>, you will find metadata elements, such as <title> with the page title, and many <meta> elements carrying metadata that standard HTML has no dedicated element for. Many sites use <meta> to control how the page is shown on social media. You will also find include elements there: <link> elements, which include CSS files, and <script> elements, which include JavaScript code. Although script elements are accepted in the head, most sites add them at the bottom of the page for faster rendering.

The second element you will find is the <body> element. The page itself will be inside this element.

Text elements

Then we have the basic text elements.

<h1>, <h2>, <h3>, <h4>, <h5>, and <h6> are headings. If you have used a word processor, you might have seen that there are many levels of headings and subheadings.

<p> will denote paragraphs. Then you might find <span> elements, which help style part of the text in a paragraph.

Another type of text element is <label>. These labels are linked to an input control, such as a radio button, giving context to that control. For example, a radio button or a checkbox doesn't have text; it's just a check or a radio. You need a label to give them context:

Radio buttons with labels

This HTML has three labels. Huey gives context to the first radio option, Dewey to the second, and Louie to the last one.

The last type of text element we will look at is list elements. Lists are expressed as a parent element, <ul> for unordered lists or <ol> for ordered lists, and <li> elements. You will see lots of these in menu bars.

Action elements

There are two main action elements in HTML. The <a> anchor, also known as a link, was designed to take you to another page, but these days it's not limited to that, and it could trigger actions inside the page.

The second element is <button>, which again, although it was designed to send data to the server using an HTTP POST request, is now being used for many other kinds of actions:

Important note

The days when you would only use buttons and links to perform actions are in the past. As most HTML elements support click events, you will find pages that show elements as buttons, but in fact, those buttons are HTML elements such as DIVs.

Links and buttons at packtpub.com

Many times, you won't notice the difference between a link and a button. For instance, in the packtpub.com site, the search button is a button element, whereas the cart button is, in fact, an anchor.

Most of your automation code will involve clicking on these action elements.

Container Elements

The role of container elements is grouping elements, mostly for layout and style purposes. The most popular element is DIV. What is DIV? It can be anything: A list of items, a popup, a header, anything. It is used to create groups of elements.

One element that was the king of the container elements was TABLE. As you can infer from the name, a table represents a grid. Inside a TABLE element, you can have TR elements representing rows, TH elements representing header cells, and TD elements representing a column inside a row. I mentioned that this was the king of containers because the community has now moved on from tables to DIVs due to performance issues, the need for more complex layouts, and responsiveness issues. But you might still see some tables on sites showing information using a grid style.

HTML5 brought a new kind of container element: The Semantic Elements. The goal of these semantic HTML elements is to communicate the type of content the element contains. So, instead of using DIVs for everything, developers should start using elements such as <header> for the site header, <footer> for the footer, <nav> for the navigation options, <article> for blog posts, and so on. The purpose of these elements is to help external tools (such as screen readers, search engines, and even the browser itself) to understand the HTML content.

Input elements

The last group of elements we need to know about are the input elements. The most common input element is the multifaceted input element. Depending on the type attribute, it can be "text", "password", "checkbox", "file" (upload), and so on; the list goes on to a total of 22 types.

Then we have select elements for drop-down lists and the option element to represent the items of a drop-down list.

Of course, we shouldn't forget the <IMG> element. It's impossible to picture a site without images.

Important note

Not every input you will see these days will be one of these elements. To make inputs more user-friendly or just nicer, you will find that developers might build inputs based on many other elements. For instance, you could find a drop-down list, which instead of being a select element would be an input element, plus an arrow button, which would show a floating list on clicking it. This kind of control makes sites prettier but automation more challenging.

HTML has not only a known list of elements but also a known list of attributes. These are the most common attributes you will find:

  • id: Identifies a unique element. It's the element ID in the DOM (we will talk about the DOM in the next section).
  • class: Contains the CSS classes applied to the element. It accepts more than one CSS class separated by a space.
  • style: CSS style assigned to the element.

HTML won't limit the attributes you can add to an element. You can add any attribute you want, for instance, defaultColor="blue". One convention is using data- attributes (pronounced data dash attributes). The browser will parse these attributes and make them available in the DOM. So, although defaultColor is a valid attribute, the general convention uses data-default-color="blue" instead.
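In the DOM, each data- attribute shows up on the element's dataset property, with the data- prefix removed and the remaining dash-separated words turned into camelCase. The helper below only illustrates that naming rule in plain JavaScript; the name toDatasetKey is made up for this sketch, and in the browser you would simply read element.dataset.

```javascript
// Illustrative helper (not a browser API): maps a data- attribute name to
// the property name it gets under element.dataset.
function toDatasetKey(attributeName) {
  if (!attributeName.startsWith('data-')) {
    throw new Error(`Not a data- attribute: ${attributeName}`);
  }
  // Drop the prefix, then turn each "-x" into "X".
  return attributeName
    .slice('data-'.length)
    .replace(/-([a-z])/g, (match, letter) => letter.toUpperCase());
}

console.log(toDatasetKey('data-default-color')); // defaultColor
```

So an element written with data-default-color="blue" can be read in the page as element.dataset.defaultColor.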

The other set of attributes of interest to us is the Accessible Rich Internet Applications (ARIA) attributes. These attributes are being added to help accessibility tools, such as screen readers. Why would we be interested in those attributes? Because developers express things such as the role or the state of an element. If you find a site using ARIA, finding the selected menu item would be a matter of finding the element with role="treeitem" and aria-expanded="true".

In the past few paragraphs, the DOM has been mentioned a few times. Let's talk about the DOM.

The DOM

The DOM is the interface you can use in JavaScript to interact with the HTML. According to the MDN (https://www.hardkoded.com/ui-testing-with-puppeteer/dom), it is the data representation of the objects that comprise the structure and content of a document on the web. Why should we care about that? Because we are going to use the same tools to automate our pages.

In the previous section, we mentioned that an element might have an ID. You'll find that the search input at https://www.packtpub.com/ has the ID search, so you will be able to get that element in JavaScript using document.getElementById('search').

You might be wondering: How do I know the ID of a button? Or how do I check that the ID is valid? Remember we talked about the dev tools?

The developer tools can be opened by clicking on the three dots in the top-right corner of Chrome and then going to More Tools | Developer Tools. You can also use the Ctrl + Shift + I shortcut in Windows or Cmd + Option + I in macOS:

Developer Tools

If you right-click on any element on the page, for instance, the search button, you will find the Inspect option, which will select that element in the Elements tab. There you will be able to see all the attributes of that element:

Inspect option

Another tab you will use a lot is the Console tab, where you will be able to run JavaScript code. If you are in the Elements tab and press the Esc key, you will get the Console tab below the Elements one. From there, you will be able to test your code:

Console tab

Another set of functions that you will use a lot are document.querySelector and document.querySelectorAll. The first function returns the first element matching a CSS selector, whereas the second function returns a list of elements matching a CSS selector. So, we need to learn about some CSS selectors next.

CSS Selectors

You don't need to learn how to style a page with CSS, but you should master how CSS selectors find elements on a page. There are around 60 different selectors (https://www.w3schools.com/cssref/css_selectors.asp) we can use for finding elements. We won't cover all 60 here, but let's go through the most common selectors:

  • Select by element name:

    Selector: ElementName.

    Example: input will select <input> elements.

  • Select by class name:

    Selector: .ClassName.

    Example: .input-text will select any element that contains the input-text class.

    If you look at the search input in https://www.packtpub.com/, the class attribute is class="input-text algolia-search-input aa-input". This selector won't check whether the class attribute is equal to input-text. It has to contain it.

  • Select by ID:

    Selector: #SomeID.

    Example: #search will select the element with the search ID. In this case, it does check equality.

  • Select by attribute:

    Selector: [attribute=value].

    Example: [aria-labelledby="search"] will select the element with the aria-labelledby attribute with the value search. This is an excellent example of the use of ARIA attributes for automation.

This selector is not limited to the equality check (=). You could use only [attribute] to check whether the element contains the attribute, no matter the value. You can also use many other operators. For example, you can use *= to check whether the attribute contains a value or ^= to check whether it begins with a value.
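When selectors are built from values that come from a config file or from test data, a small helper can keep the quoting consistent. The following is only a sketch: attributeSelector is a hypothetical name, not a Puppeteer or DOM function, and it simply returns a string that can be passed to any function expecting a CSS selector.

```javascript
// Hypothetical helper: builds a CSS attribute selector as a string.
// `operator` is any CSS attribute operator, for example:
//   '='  exact value      '*=' contains the value
//   '^=' starts with it   '$=' ends with it
function attributeSelector(name, value, operator = '=') {
  if (value === undefined) {
    return `[${name}]`; // only checks that the attribute is present
  }
  // Escape backslashes and double quotes so the value stays a valid CSS string.
  const escaped = String(value).replace(/["\\]/g, '\\$&');
  return `[${name}${operator}"${escaped}"]`;
}

console.log(attributeSelector('aria-labelledby', 'search'));
// [aria-labelledby="search"]
```

The same helper covers the presence check with attributeSelector('disabled') and the contains check with attributeSelector('class', 'input', '*=').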

Combining selectors

What's great about CSS is that you can combine all these selectors. You could use input.input-text[aria-labelledby="search"] to select an input with the input-text class and the aria-labelledby attribute with the value search.

You can also look for elements nested inside other elements by writing several selectors separated by a space. Let's take, for instance, the following selector:

form .algolia-autocomplete input

If you read it backwards, it will select an input inside an element with the algolia-autocomplete class, which is inside a form element. Notice that I said an input inside an element with the algolia-autocomplete class. That doesn't need to be the direct parent of the input element.

If you want to check strictly a parent-child relationship, you can separate selectors with a > instead of a space:

.algolia-autocomplete > input

This selector will look for an input whose direct parent element is an element with the algolia-autocomplete class.

Maybe you are thinking, why do I need to know all this information? I just want to get up and running with Puppeteer! Let me tell you something: You will spend half of your time inside the developer tools, and the most frequent element in your code will be a CSS selector. The more you know about HTML, the DOM, and CSS, the more proficient you will be at browser automation.

But now it's time to go back to the Puppeteer world.

Finding elements

It's time to apply everything we have learned so far. We need to master selectors because our Puppeteer code will be mostly about finding elements and interacting with them.

Let's bring back the login page from our e-commerce app:

Login page

If we want to test the login page, we need to find these three elements: The email input, the password input, and the login button.

If we right-click on each input and click on the Inspect element menu item, we will find the following:

  • The email has the ID email.
  • The password has the ID password.
  • The login is a button element, with the btn and btn-success CSS classes, and the style="width: 100%;" style.

Puppeteer provides two functions to get elements from the page. The $(selector) function will run the document.querySelector function and return the first element matching that selector or null if no elements were found. The $$(selector) function will run the document.querySelectorAll function, returning an array of elements matching the selector or an empty array if no elements were found.

If we want to implement the login function in our LoginPageModel class using these new functions, finding the login inputs would be easy:

const emailInput = await this.page.$('#email');

const passwordInput = await this.page.$('#password');

Tip

To find the login button, you might think that you could use the .btn-success selector, and you could, but you shouldn't rely on classes that exist to style a button because they might change if the development team changes the design. Try to pick a CSS selector that will survive a design change.

Let's re-evaluate our login button. If you look for button elements, you will find that you have five buttons on that page, so the button selector won't work. But, we can see that the login button is the only button with a type="submit" attribute, so we could use the [type=submit] CSS selector to find this element.

But the [type=submit] selector is too generic. The developers might, for instance, add a new button with the submit type in the toolbar, breaking our code. However, we can see that the login button is inside a form with the ID login-form, which lets us create a more stable selector. We could look for the login button in our login function in this way:

const loginBtn = await this.page.$('#login-form [type=submit]');
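Because $ returns null when nothing matches, a typo in a selector tends to surface later as a confusing error about calling a function on null. One possible way to fail earlier, sketched below, is to check each result as soon as it comes back. The findLoginElements function is a hypothetical helper, and page is assumed to be a Puppeteer page showing the login form.

```javascript
// Sketch: collect the three login elements and fail early if one is missing.
// `page` is expected to be a Puppeteer Page; the helper name is made up.
async function findLoginElements(page) {
  const selectors = {
    emailInput: '#email',
    passwordInput: '#password',
    loginBtn: '#login-form [type=submit]',
  };

  const elements = {};
  for (const [name, selector] of Object.entries(selectors)) {
    const handle = await page.$(selector);
    if (handle === null) {
      // page.$ resolves to null when no element matches the selector.
      throw new Error(`Could not find ${name} using "${selector}"`);
    }
    elements[name] = handle;
  }
  return elements;
}
```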

Now we have everything we need to test our login page. But we are not going to interact with the login page yet. Let's go to the home page and find some more complex scenarios:

Home Page

Let's say we want to test that the Macbook Pro 13.3' Retina MF841LL/A product has 15 items left in stock, and the price is $1,199.

First, a piece of advice: It's better to code these kinds of tests down the testing pyramid. You could test the API that sends those values or the function that makes that query to the database.

But let's try to solve this as a UI test:

Product HTML

If we take a look at the HTML, there is nothing that helps us find the product on the list, and even if we were able to find the product, it would be hard to find the elements inside that div element.

Here is where the collaboration between the development team and the QA team becomes valuable. How can developers help the QA team? Using data- attributes. Your team can use a data-test- attribute to help you find the elements you need:

HTML with data-test attributes

As you can see in this HTML, it will be way easier to find elements with those new attributes. This is how we can get the values to test product ID 2:

const productId = config.productToTestId;

const productDiv = await this.page.$(`[data-test-product-id="${productId}"]`);

const stockElement = await productDiv.$('[data-test-stock]');

const priceElement = await productDiv.$('[data-test-price]');

With these four lines, we were able to find the three elements for our new test: The product container and the elements containing the stock and the price.

There are a few things to notice in this piece of code:

  • First, remember not to hardcode values in your code. That's why we are going to grab the product ID from our config file.
  • Second, notice that we are getting stockElement and priceElement using productDiv.$ instead of page.$. That means that the CSS selector you pass to that function will be processed in the element's context.

    If we'd used page.$$('[data-test-stock]'), we would get many elements because each product has a data-test-stock element, but as we use productDiv.$('[data-test-stock]'), we'll get the element inside productDiv. This is an important resource.

  • The last thing to highlight here is that our development team gave us the number of items in stock inside the data-test-stock element. This will come in handy when we need to test the stock, but notice that we don't need to use the value of the attribute, in this case, 15, to get the element. Passing the attribute as a selector will be enough.
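Putting those points together, the following sketch reads the stock of one product. It assumes, as described above, that the data-test-stock attribute carries the number of items. The readProductStock name is hypothetical, and evaluate is used here to run a small function in the browser with the DOM element as its argument.

```javascript
// Sketch: find a product by its data-test-product-id, then search inside it.
// `page` is expected to be a Puppeteer Page; the helper name is made up.
async function readProductStock(page, productId) {
  const productDiv = await page.$(`[data-test-product-id="${productId}"]`);
  if (productDiv === null) {
    throw new Error(`Product ${productId} was not found on the page`);
  }

  // The selector is resolved inside productDiv, not in the whole page.
  const stockElement = await productDiv.$('[data-test-stock]');
  if (stockElement === null) {
    throw new Error(`Product ${productId} has no data-test-stock element`);
  }

  // The callback runs in the browser and receives the DOM element.
  const stock = await stockElement.evaluate((element) =>
    element.getAttribute('data-test-stock')
  );
  return Number(stock);
}
```

A test could then compare the returned number with the expected stock taken from the config file.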

What if we don't have the chance to add these attributes? There is one more resource – trying to find those elements using XPath.

Finding elements using XPath

XPath is a language to query XML-like documents. Remember how we said that HTML was a relaxed kind of XML? This means that we could navigate through the DOM using some kind of XML query language such as XPath.

Before digging into XPath's selectors, if you want to try XPath queries, Chrome DevTools includes a set of functions you can use inside the developer tools Console tab (https://hardkoded.com/ui-testing-with-puppeteer/console). One of these functions is $x, which expects an XPath expression and returns an array of elements:

Testing XPath inside the Chrome Developer Tools

If you open the Console tab on any page, you can run $x('//*') to test the //* selector.

To better understand an XPath expression, you need to see your HTML as XML content. We are going to navigate this XML document from the very same root, the HTML element.

Select from the current node

Selector: //. This means "From the current node, bring me everything inside, no matter the position."

Example: $x('//div//a') will return, from the root, all the divs inside the document, no matter the position, and from those divs all a elements inside that div, no matter the position.

Are you confused about the "no matter the position" part? Well, let's now see the root selector.

Select from the root

Selector: /. This means "From the current node, bring me all the direct child elements."

Example: If we use $x('/div//a'), we'll get no results because there is no div as a child of the root object. The only valid root option would be $x('/HTML') because the HTML element is the only one under the main root object. But we could do something such as $x('//div/a'), which would mean "Bring me all the div elements, and from there all the a elements that are a direct child of those divs."

Select all the elements

Selector: *. This means "Bring me all the elements."

Example: When we say "all the elements," it will be based on the previous selector. $x('/*') will bring only the HTML element because that would mean "all the direct elements." But $x('//*') will bring you all the elements from the page.

Filter by attribute

Selector: [@attributeName=value].

Example: $x('//div[@class="card-body"]') will bring all the div elements where the class attribute is equal to card-body. This might look similar to the class selector in CSS, but it's not because this selector won't work if div has more than one class.
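A common workaround for elements with several classes is to pad the class attribute with spaces and look for the class name surrounded by spaces. The sketch below builds such a predicate using the standard XPath functions contains, concat, and normalize-space. The hasClass helper is a made-up name used only to keep the expression readable.

```javascript
// Hypothetical helper: XPath predicate that matches one class among several,
// similar to what the .card-body selector does in CSS.
function hasClass(className) {
  return `contains(concat(" ", normalize-space(@class), " "), " ${className} ")`;
}

const cardBodies = `//div[${hasClass('card-body')}]`;
console.log(cardBodies);
// //div[contains(concat(" ", normalize-space(@class), " "), " card-body ")]
```

The resulting string can be pasted into $x in the Console tab, where it should also match a div whose class attribute is, for example, card-body shadow.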

Up to this point, it seems just like CSS with another syntax. What's so powerful about XPath? Well, let's get to some power tools.

It turns out that what we used to filter by attribute is, in fact, an expression, also called a predicate. This gives us the chance not only to use the @attributeName option but also to check for many other things.

Filter by text

Selector: [text()=value].

Example: $x('//div[text()="Admin Panel (Testing purpose)"]') will bring all the div elements whose content is the text Admin Panel (Testing purpose). You could even make it more generic and use something like this, $x('//*[text()="Admin Panel (Testing purpose)"]'), so you wouldn't care whether it's a div or another type of element.

This function is by far one of the main reasons you would see people using XPath.

Contains a text

Selector: [contains(text(), value)].

Example: Filter by text can be tricky. The text could have some space before or after the content. If you try to select the grid button on the page using this command, $x('//*[text()= "Grid"]'), you won't get any results because the element has some spaces after and before the word. This contains function can help us when we have spaces before or after the word, or when the word is part of a larger piece of text. This is how we can use this function: $x('//*[contains(text(),"Grid")]').
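The contains function will also match longer texts that merely include the word. When you want an exact match that ignores the surrounding spaces, one option is the standard XPath function normalize-space, which trims the text and collapses repeated whitespace. The helper below is a sketch with a made-up name, and it assumes the text has no double quotes in it.

```javascript
// Hypothetical helper: XPath for any element whose text, once trimmed,
// is exactly the given value. Assumes `text` contains no double quotes.
function byTrimmedText(text) {
  return `//*[normalize-space(text())="${text}"]`;
}

console.log(byTrimmedText('Grid')); // //*[normalize-space(text())="Grid"]
```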

There are many more functions. Mozilla has a good list of all the available functions (https://www.hardkoded.com/ui-testing-with-puppeteer/xpath).

We get to do really complex queries with XPath. Let's take a look at our last example. We want all the elements with a price over $2,000:

$x('//div[@class="row"]/p[1][number(substring-after(text(), "$")) > 2000]')

Wow, let's see what we are doing there:

  • With //div[@class="row"], we grab DIVs with the row class.
  • With p[1], we take the first p element. We can use positional filters here.
  • We get the text using text().
  • As the price begins with a dollar sign, we remove it using substring-after.
  • We convert that text into a number using number.
  • So then, we can check whether that number is greater than 2,000.
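To make the steps above concrete, here is roughly the same computation in plain JavaScript. One caveat is worth keeping in mind: the XPath number function does not understand thousands separators, so the expression presumably relies on the page rendering prices as $2199 rather than $2,199. The parsePrice helper is a hypothetical name, and removing the commas is an extra step that is not part of the XPath expression.

```javascript
// Plain-JavaScript counterpart of number(substring-after(text(), "$")),
// plus one extra step (not in the XPath above): removing thousands separators.
function parsePrice(text) {
  const dollarIndex = text.indexOf('$');
  // substring-after yields an empty string when the marker is missing.
  const afterDollar = dollarIndex === -1 ? '' : text.slice(dollarIndex + 1);
  const digits = afterDollar.replace(/,/g, '').trim();
  // An empty string is not a number (Number('') would give 0 in JavaScript).
  return digits === '' ? NaN : Number(digits);
}

const prices = ['$1,199', ' $2199 ', '$2,499.50'];
const expensive = prices.filter((text) => parsePrice(text) > 2000);
// expensive keeps the last two entries; '$1,199' is filtered out.
```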

There is one more feature that makes XPath a powerful tool. Unlike CSS selectors, XPath lets you select the parent element by using two dots (..).

If we want to return the entire main div of the product with a price over $2,000, we can use the following:

$x('//div[@class="row"]/p[1][number(substring-after(text(), "$")) > 2000]/../..')

How do we use XPath expressions in Puppeteer? You already know how to do it: We have a $x function.

Let's go back to our test: We want to test that the Macbook Pro 13.3' Retina MF841LL/A has 15 items left in stock, and the price is $1,199.

What if the only way to find that product would be with the product name? We could do something like this:

const productName = config.productToTestName;

const productDiv = (await this.page.$x(`//a[text()="${productName}"]/../..`))[0];

const stockElement = (await productDiv.$x('.//h6'))[0];

const priceElement = (await productDiv.$x('.//div[@class="row"]/p[1]'))[0];

Remember that $x returns an array of elements. In this case, as we know that they will always return one element, we take the first one.
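Note that the product name contains an apostrophe. The expression still works because the text is wrapped in double quotes, but a name containing a double quote would break it, and XPath 1.0 has no escape character for string literals. A usual workaround is to pick the quote style that fits or, when both kinds of quotes appear, to glue the pieces together with concat. The xpathLiteral and productLinkXPath helpers below are a sketch with made-up names.

```javascript
// Hypothetical helper: turns any JavaScript string into an XPath 1.0
// string literal, choosing a quote style that does not clash with the text.
function xpathLiteral(text) {
  if (!text.includes('"')) {
    return `"${text}"`;
  }
  if (!text.includes("'")) {
    return `'${text}'`;
  }
  // Both kinds of quotes: split on the double quotes and join the pieces
  // with a single-quoted double quote inside concat().
  const separator = ", '\"', ";
  const pieces = text.split('"').map((piece) => `"${piece}"`);
  return `concat(${pieces.join(separator)})`;
}

// Same expression as above: the link with the product name, two levels up.
function productLinkXPath(productName) {
  return `//a[text()=${xpathLiteral(productName)}]/../..`;
}
```

The string returned by productLinkXPath can be passed to $x in place of the interpolated template.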

In the same way that we shouldn't rely on design classes for CSS selectors, we should try not to rely too much on the HTML structure in XPath selectors. We are assuming a couple of things in this code:

  • We assume that the stock is an h6 element.
  • We assume that the price will be the first p element.

If the design team decides that the stock will look better using div instead of h6, or if they wrap the price inside a div element to improve mobile navigation, your test will break.

We learned how to get elements from the page, but it's important to know that the $, $$, and $x functions don't return an element from the DOM. They return something called element handles.

Element handles are a reference to a DOM element on the page. They are a pointer that helps Puppeteer send commands to the browser, referencing an existing DOM element. They are also one of the ways we have to interact with those elements.

Interacting with Elements

Let's go back to our login test. We already have the three elements we need: The email input, the password input, and the login button. Now we need to enter the email and the password and click on the button.

Typing on input elements

The ElementHandle class has a function called type. The signature is type(text, [options]). The options object is not big this time. It only has a delay property. The delay is the number of milliseconds Puppeteer will wait between keystrokes. This is great to emulate real user interaction.

The first part of our test would look like this:

const emailInput = await this.page.$('#email');

await emailInput.type(user, {delay: 100});

const passwordInput = await this.page.$('#password');

await passwordInput.type(password, {delay: 100});

Here, we are looking for the email and password elements, and then emulating a user typing on those inputs.

Now, we need to click on the button.

Clicking on elements

The ElementHandle class also has a function called click. I bet you are already getting the pattern. The signature is click([options]). You can simply call click(), and that would do the job. But we can also use the three available options:

  • button: This is a string with three valid options: left, right, or middle.
  • clickCount: The default is 1, but you could also have an impatient user clicking the same button many times, so you can emulate the user clicking on the element four times by passing 4.
  • delay: This delay is not the time between clicks but the time (in milliseconds) between the mouse down action and mouse up.
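As a quick illustration of those three options, here are three small wrappers. They are a sketch: the function names are made up, and handle is assumed to be an element handle returned by $, $$, or $x.

```javascript
// Sketches of the click options. `handle` is expected to be a Puppeteer
// ElementHandle; the wrapper names are made up for this example.

// Two quick clicks on the same element.
async function doubleClick(handle) {
  await handle.click({ clickCount: 2 });
}

// Click with the right mouse button.
async function rightClick(handle) {
  await handle.click({ button: 'right' });
}

// Keep the button pressed for a while before releasing it.
async function longPress(handle, milliseconds) {
  await handle.click({ delay: milliseconds });
}
```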

In our case, we don't need to use these options:

const loginBtn = await this.page.$('#login-form [type=submit]');
await loginBtn.click();

With these two lines, we can finally finish our login function. We find the login button and then we click on it.

Selecting options in drop-down lists

The site now has a drop-down list, a SELECT element in HTML, to switch between the grid and the list view:

The site with a new switch option

As you might have guessed, the function to select an option is called select, and the signature is select(...values). It accepts a list of values because a select element with the multiple attribute can have several options selected at once.

The next thing we need to know about this function is that the value select expects is not the text you see in the option, but the value of the option. We can see that by inspecting the element:

Drop-down list options

In this case, we are lucky as the value is almost the same as the visible text, but it's not the same. If we want to select the Grid item, we need to use grid, instead of Grid.

If we switch the option to list mode, we can see that a list-group-item class is added to the elements:

HTML in list mode

This is how we can test this functionality:

const switchSelect = await page.$('#viewMode');
await switchSelect.select('list');
expect(await page.$$('.list-group-item')).not.to.be.empty;
await switchSelect.select('grid');
expect(await page.$$('.list-group-item')).to.be.empty;

Using await and page.$ every time we need to interact with an element requires a lot of boilerplate. Imagine if we had eight inputs to fill; that would be a lot. That's why both Page and Frame (if you are dealing with child frames) have most of the functions an element handle has, but they expect a selector as a first argument.

So, say we have this piece of code:

const switchSelect = await page.$('#viewMode');
await switchSelect.select('list');

It could be as simple as this:

await page.select('#viewMode', 'list');

You will find functions such as page.click(selector, [options]), page.type(selector, text, [options]), and many other interaction functions.
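As an illustration, the login steps from earlier in this section could be written with these shortcuts. This is a sketch rather than the book's final login function: the selectors are the ones used above, and the function takes the page as a parameter just to keep the example self-contained.

```javascript
// Sketch of the login steps using the selector-based functions on Page.
// The selectors are the ones used earlier in this section.
async function login(page, user, password) {
  // Each call finds the element and interacts with it in one step.
  await page.type('#email', user, {delay: 100});
  await page.type('#password', password, {delay: 100});
  await page.click('#login-form [type=submit]');
}
```

Compared with the element handle version, there are no intermediate variables to keep around, which helps when a form has many inputs.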

We have covered the most common user interactions. But we can go a little deeper and try to emulate how the user would interact with the page using their keyboard and mouse.

Keyboard and mouse emulation

Although you will be able to test the most common scenarios by typing or clicking on elements, there are other scenarios where you would need to emulate how the users interact with a site using the keyboard and the mouse. Let's take, for instance, a Google spreadsheet:

Google Spreadsheet

The Google spreadsheet page has a lot of keyboard and mouse interactions. You can move through the cells using your keyboard arrows or copy values by doing drag and drop with the mouse.

But it doesn't need to be that complicated. Let's say that you work in the QA team at GitHub.com, and you need to test the search box from the home page.

As GitHub.com is for developers, and developers for some weird reason hate using the mouse, the development team added many shortcuts on the site. We want to create a test to check that those shortcuts are working as expected:

GitHub.com home page

As we can see there, the shortcut to the search input is a /. So, we need to do the following:

  • Press slash.
  • Type the repo name.
  • And then press Enter.

We are going to use the Keyboard class that the Page class exposes as a property.

The first step is to press slash. To do that, we are going to use, you guessed it, the press function. The signature is press(key, [options]). The first thing we need to know about press is that it's a shortcut for two other functions: down(key, [options]) followed by up(key). With these three functions, you get an almost complete keyboard emulation.

Notice that the first argument is not text but key. You will find the full list of supported keys here: https://www.hardkoded.com/ui-testing-with-puppeteer/USKeyboardLayout. There, you will find keys such as Enter, Backspace, or Shift. The press function has two options available. The first is text: if you assign it, Puppeteer will create an input event with that value, so it works like a macro. For instance, if the key is p and the text is puppeteer, pressing p puts puppeteer in the input element. I've never found a use for that option, but it's there. The down function also has this option. The second option is delay, which is the time in milliseconds between the key down and the key up actions.

The official Puppeteer documentation (https://www.hardkoded.com/ui-testing-with-puppeteer/keyboard) has a perfect example for this:

await page.keyboard.type('Hello World!');
await page.keyboard.press('ArrowLeft');
await page.keyboard.down('Shift');
for (let i = 0; i < ' World'.length; i++) {
  await page.keyboard.press('ArrowLeft');
}
await page.keyboard.up('Shift');
await page.keyboard.press('Backspace');

Let's unpack this code:

  • It types Hello World!. The cursor is after the exclamation mark.
  • It presses the left arrow key. Remember, press is key down and key up. So now the cursor is before the exclamation mark.
  • Then, using down, it presses the Shift key, but it doesn't release the key.
  • Then, it presses the left arrow key six times, once per character in " World", so the cursor ends up right after the word "Hello". But as the Shift key is still pressed, the " World" text (including the leading space) gets selected.
  • Then, it releases the Shift key, using up.
  • And what happens when you press backspace and we have text selected? You remove the entire selection, leaving the text Hello!.
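The down, press, up pattern from that example can be wrapped in a small helper. The sketch below is one possible way to do it: pressWithModifier is a hypothetical name, and the try/finally block is a design choice of this example so that the modifier key is released even if one of the presses fails.

```javascript
// Hypothetical helper: press `key` a number of times while `modifier`
// is held down, then release the modifier.
async function pressWithModifier(page, modifier, key, times = 1) {
  await page.keyboard.down(modifier);
  try {
    for (let i = 0; i < times; i++) {
      await page.keyboard.press(key);
    }
  } finally {
    // Always release the modifier, so a failure does not leave
    // Shift or Control held down for the following steps.
    await page.keyboard.up(modifier);
  }
}

// Equivalent to the Shift + ArrowLeft loop in the example above:
// await pressWithModifier(page, 'Shift', 'ArrowLeft', ' World'.length);
```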

Now we can go and test the GitHub.com home page:

const browser = await puppeteer.launch({headless: false, defaultViewport: null});
const page = await browser.newPage();
await page.goto('https://www.github.com/');
await page.keyboard.press('Slash');
await page.keyboard.type('puppeteer');
await page.keyboard.press('Enter');

If we go back to our login example, we could test that you should be able to log in by pressing Enter instead of clicking on the login button. Or if the navigation between controls is important, you can jump from the user input to the password and then to the login button by pressing Tab.
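A keyboard-only login could look roughly like the following sketch. It makes two assumptions about the page that you would need to confirm: that Tab moves the focus from the email input to the password input, and that pressing Enter in the password input submits the form. The function name is made up for this example.

```javascript
// Sketch of a keyboard-only login. It assumes that Tab moves focus from
// the email input to the password input and that Enter submits the form.
async function loginWithKeyboard(page, user, password) {
  // Put the focus on the first input so the typing goes there.
  await page.focus('#email');
  await page.keyboard.type(user);
  // Jump to the next control, as a user would.
  await page.keyboard.press('Tab');
  await page.keyboard.type(password);
  // Submit without touching the login button.
  await page.keyboard.press('Enter');
}
```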

Do you want to play tic-tac-toe? Let's play it using the mouse.

In the Chapter4 folder, you will find a tictactoe.html file with a small tic-tac-toe game made in React:

Tic-tac-toe game

If we consider the page as a canvas, where the top-left corner of the window is the coordinate (0;0) and the bottom right is the coordinate (window width, window height), mouse interaction is about moving the mouse to an (X;Y) coordinate and clicking using one of the mouse buttons. Puppeteer offers the following functionalities.

Move the mouse using mouse.move(x, y, [options]). The only option available in this move function is steps. With steps, you can tell Puppeteer how many times you want to send mousemove events to the page. By default, it will send only one event at the end of the mouse move action.

Just as the keyboard has the down, up, and press functions, the mouse has down, up, and click.

The mouse has one extra action that the keyboard doesn't have, which is wheel. You can emulate mouse scrolling using mouse.wheel([options]). The options object has two properties: deltaX and deltaY, which can be positive or negative scroll values expressed in CSS pixels.
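Here is a small sketch that combines these functions. The names drag and scrollDown are hypothetical helpers written for this example, and the points passed to drag are plain {x, y} objects in page coordinates.

```javascript
// Hypothetical helper: drag with the left button from one point to another.
// `from` and `to` are plain objects such as {x: 10, y: 20}.
async function drag(page, from, to) {
  await page.mouse.move(from.x, from.y);
  await page.mouse.down(); // left button by default
  // The move is sent as 10 interpolated mousemove events,
  // so the page can react to the mouse while it travels.
  await page.mouse.move(to.x, to.y, {steps: 10});
  await page.mouse.up();
}

// Hypothetical helper: scroll down by a number of CSS pixels.
async function scrollDown(page, pixels) {
  await page.mouse.wheel({deltaY: pixels});
}
```

Using steps in the second move matters for components that only react to a drag when they receive several mousemove events in a row.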

Let's go back to our tic-tac-toe game. We will do a simple test: Player 1 will use the first row and player 2 will use the second row, so player 1 will win after three moves. As this is a canvas, we need to know which coordinates we need to click.

We can use the style section of the developer tools to get those coordinates. If we look at the body, we will see a 20-pixel margin that will make (20;20) the starting point:

Body margin

We also know that each square is 32 px by 32 px, so the middle of a square is 32 / 2 = 16 px away from its top-left corner. Let's test it:

const startingX = 20;
const startingY = 20;
const boxMiddle = 16;
// X turn 1
await page.mouse.click(startingX + boxMiddle, startingY + boxMiddle);
// O turn 1
await page.mouse.click(startingX + boxMiddle, startingY + boxMiddle * 3);
// X turn 2
await page.mouse.click(startingX + boxMiddle * 3, startingY + boxMiddle);
// O turn 2
await page.mouse.click(startingX + boxMiddle * 3, startingY + boxMiddle * 3);
// X turn 3
await page.mouse.click(startingX + boxMiddle * 5, startingY + boxMiddle);
expect(await page.$eval('#status', status => status.innerHTML)).to.equal('Winner: X');

So, here we know that the tic-tac-toe grid starts at the coordinate (20;20), and from there it is simple math to find the right coordinates in our canvas. The first box is clicked at the coordinate (startingX + boxMiddle; startingY + boxMiddle). Moving to the next column or row means adding one full square, which is two more half squares: startingX + boxMiddle * 3 for the second column, startingX + boxMiddle * 5 for the third, and the same with startingY for the rows. We keep clicking until we know that we have a winner.

Don't worry about the last $eval. We'll get there.
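If the multiplications by 3 and 5 feel a bit cryptic, the same math can be expressed in terms of rows and columns. The following sketch uses the measurements from above (a 20-pixel margin and 32-pixel squares); the names squareCenter and clickSquare are made up for this example.

```javascript
const BOARD_ORIGIN = {x: 20, y: 20}; // body margin, taken from the developer tools
const SQUARE_SIZE = 32;              // width and height of each square, in pixels

// Returns the page coordinates of the middle of a square.
// `row` and `column` are zero-based.
function squareCenter(row, column) {
  return {
    x: BOARD_ORIGIN.x + column * SQUARE_SIZE + SQUARE_SIZE / 2,
    y: BOARD_ORIGIN.y + row * SQUARE_SIZE + SQUARE_SIZE / 2,
  };
}

// Hypothetical helper: click in the middle of the given square.
async function clickSquare(page, row, column) {
  const {x, y} = squareCenter(row, column);
  await page.mouse.click(x, y);
}

// Possible usage: X plays the top-left square, O plays the one below it.
// await clickSquare(page, 0, 0);
// await clickSquare(page, 1, 0);
```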

But this is not just for games. Many modern UIs might require some mouse interactions, for instance, hoverable dropdowns or menus. We can see one example on the W3Schools site (https://www.w3schools.com/howto/howto_css_dropdown.asp):

Hoverable dropdown

To be able to click on any item in that dropdown, we need to hover first on the button and then click on the option:

await page.goto("https://www.w3schools.com/howto/howto_css_dropdown.asp");
const btn = await page.$(".dropbtn");
const box = await btn.boundingBox();
await page.mouse.move(box.x + (box.width / 2), box.y + (box.height / 2));
const option = (await page.$x('//*[text()="Link 2"]'))[0];
await option.click();

As you can see, we don't need to guess the Hover me button's location. The element handle provides a function called boundingBox, which returns the position (x and y) and the element's size (width and height).

Is there an easier way? Yes, we can simply use await btn.hover(), which would hover on the element. I wanted to give you a complete example because sometimes UI components are quite sensitive to the mouse position, so you need to put the mouse in a precise location to get the desired result.
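If you do go the manual route, keep in mind that boundingBox resolves to null when the element is not visible. The following sketch wraps the math in a helper with that check included; centerOf and moveToCenter are hypothetical names used only for this example.

```javascript
// Returns the center point of a bounding box ({x, y, width, height}).
function centerOf(box) {
  return {x: box.x + box.width / 2, y: box.y + box.height / 2};
}

// Hypothetical helper: move the mouse to the center of an element.
async function moveToCenter(page, selector) {
  const handle = await page.$(selector);
  if (handle === null) {
    throw new Error(`No element matches "${selector}"`);
  }
  // boundingBox resolves to null when the element is not visible.
  const box = await handle.boundingBox();
  if (box === null) {
    throw new Error(`Element "${selector}" is not visible`);
  }
  const {x, y} = centerOf(box);
  await page.mouse.move(x, y);
}

// Possible usage with the W3Schools example:
// await moveToCenter(page, '.dropbtn');
```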

Time for a bonus track. Let's talk about debugging.

Debugging tests with Visual Studio Code

Many developers consider debugging a last resort. Others would flood their code with console.log messages. I consider debugging a productivity tool.

Debugging is trying to find bugs by running an application step by step.

We have two ways of launching our tests in debug mode. The first option is creating a JavaScript debug terminal from the Terminal tab. That will create a new terminal as we did before, but in this case, Visual Studio Code will enable the debugger when you run a command from that terminal:

Debugging from the terminal

The second option is going to the Run tab and creating a launch.json file. You could also create that file manually inside the .vscode folder:

Create a launch.json from the run tab

Once we have the file, we can create a new configuration so that we can run npm run test in the terminal:

{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Test",
            "request": "launch",
            "runtimeArgs": [
                "run",
                "test"
            ],
            "runtimeExecutable": "npm",
            "skipFiles": [
                "<node_internals>/**"
            ],
            "type": "pwa-node"
        }
    ]
}

Which one is the best? Well, if you are going to work on this project for many days, creating the launch.json file is more productive; once it is created, you just need to hit F5 to be in debug mode. The terminal option is the quickest way to get started.

Once you have everything set up, it is a matter of creating breakpoints on the lines where you want the debugger to stop. From there, you can take advantage of all the tools Visual Studio Code offers:

Visual Studio Code in debugging mode

There you will find the following:

  • At the left of the line numbers, you will find the breakpoints. You can create or remove breakpoints by clicking at the left of the line number.
  • You will find the full list of breakpoints at the bottom left of the window. From there, you will be able to disable breakpoints temporarily.
  • At the top right of the window, you will find debug actions: Pause, play, step in/out, and stop buttons.
  • In the left panel, you will find two useful sections. The first is Variables, which automatically shows the values of all the variables in the current scope. The second is Watch, where you can add the variables or expressions you want to keep an eye on while running your code.

Summary

This chapter was massive. We began the chapter with a brief but complete introduction to HTML, the DOM, and CSS. These concepts are crucial to create top-notch tests. Then, we learned a lot about XPath, which is not a very popular tool, yet it is extremely powerful and will help you face scenarios where CSS selectors are not enough.

In the second part of this chapter, we went through the most common ways to interact with a page. Not only did we learn how to interact with elements but we also covered keyboard and mouse emulation.

I hope you enjoyed the tools section. Debugging with Visual Studio Code is a great tool to add to your toolbox.

In the next chapter, we are going to wait for stuff. Things take time on the web. Pages take time to load. Some actions on the page might trigger network calls. The next chapter is important because you will learn how to make your tests even more stable.
