Finding elements using XPath

XPath (the XML Path Language) is a query language used to select nodes from an XML document. All the major browsers implement the DOM Level 3 XPath specification (http://www.w3.org/TR/DOM-Level-3-XPath/), which provides access to a DOM tree.

The XPath language is based on a tree representation of the XML document and provides the ability to navigate around the tree and to select nodes using a variety of criteria.

Selenium WebDriver supports locating elements using XPath expressions, also known as XPath queries.

One of the important differences between XPath and CSS is that, with XPath, we can search elements backwards or forwards in the DOM hierarchy, while CSS works only in a forward direction. This means that using XPath we can locate a parent element using a child element and vice versa.

In this recipe, we will explore some basic XPath queries to locate elements, and then examine some advanced XPath queries.

XML documents are treated as trees of nodes. The topmost element of the tree is called the root element. When an HTML document is loaded in DOM, it provides a similar tree of nodes. Here's an example of an HTML page:

<html>
  <head>
    <title>My Book List</title>
  </head>
<body>
  <h1>My Book List</h1>
  <div>
  <table class="main-list">
   <tr>
    <td>Title</td>
    <td>Author</td>
    <td>Publication Year</td>
    <td>Price</td>
    <td>Book Page</td>
   </tr>
   <tr id="book_1">
    <td>XML Developer's Guide</td>
    <td>Gambardella, Matthew</td>
    <td>Publication Year</td>
    <td class="price">44.95</td>
    <td><div class="desc">An in-depth look at creating applications
     with XML.</div></td>
    <td><a href="/book_1.html">
     <img src="/img/book1_png/" alt="XML Developers Guide">
     </a></td>
   </tr>
  </table>
 </div>
</body>
</html>

Before we move on to using XPath, let's understand some basic XPath terminology. We will use the previous HTML document as an example:

Term

Description

Nodes

DOM represents an HTML document as trees of nodes. Here are examples of nodes from the previous HTML document:

  • html: This is the root element node
  • title: This is an element node
  • id="book_1": This is an attribute node with its value

The topmost element of the tree is called the root node or element.

Atomic Values

Atomic values are nodes with no children or parents. For example:

Gambardella, Matthew

XML Developer's Guide

44.95

Parents

Each element and attribute has one parent. For example, the body element is the parent of div. Similarly, div is the parent of the table element.

Children

Element nodes may have zero, one, or more children. For example, there are two tr elements, which are children of the table element.

Siblings

Nodes that have the same parent. For example, h1 and div are siblings, and their parent is the body element.

Ancestors

A node's parent, parent's parent, and so on. For example, ancestors of the table element are div, body and html.

Descendants

A node's children, children's children, and so on. For example, the descendants of the table element are tr, td, div, a, and img.
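The relationships above can be checked with a small sketch that uses the JDK's built-in DOM parser instead of a browser. The class name NodeRelationships and the trimmed, well-formed copy of the sample page are assumptions made only for this illustration:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class NodeRelationships {
    // Assumption: a trimmed, well-formed copy of the sample page.
    static final String PAGE =
        "<html><head><title>My Book List</title></head><body><h1>My Book List</h1>"
        + "<div><table class='main-list'>"
        + "<tr><td>Title</td><td>Author</td><td>Price</td></tr>"
        + "<tr id='book_1'><td>XML Developer's Guide</td><td>Gambardella, Matthew</td>"
        + "<td class='price'>44.95</td></tr>"
        + "</table></div></body></html>";

    // Parses PAGE and returns the first element with the given tag name.
    static Node firstElement(String tagName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(PAGE)));
            return doc.getElementsByTagName(tagName).item(0);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the sample page", e);
        }
    }

    // Walks up from the element's parent to the root element, collecting names.
    static List<String> ancestorsOf(String tagName) {
        List<String> names = new ArrayList<>();
        for (Node n = firstElement(tagName).getParentNode();
             n != null && n.getNodeType() == Node.ELEMENT_NODE;
             n = n.getParentNode()) {
            names.add(n.getNodeName());
        }
        return names;
    }

    public static void main(String[] args) {
        Node table = firstElement("table");
        System.out.println(table.getParentNode().getNodeName()); // div
        System.out.println(ancestorsOf("table"));                 // [div, body, html]
        System.out.println(table.getChildNodes().getLength());    // 2 (the two tr rows)
    }
}
```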

Selecting nodes

XPath uses path expressions to select nodes from the tree. The node is selected by following a path or steps. The most useful path expressions are listed as follows:

Expression

Description

nodename

This will select all nodes with the name "nodename". For example, table will select all the table elements.

/ (slash)

This will select element(s) relative to the root element. For example:

  • /html: This will select the root HTML element. A slash (/) is used in the beginning and it defines an absolute path.
  • /html/body/div/table: This will select every table element reached by following html, body, and div, in that order, from the root.

When a slash (/) is used at the start of an expression, it defines an absolute path. When used in the middle, it defines a parent-child relationship; for example, //div/table returns the table elements that are direct children of a div element.

// (double slash)

This will select node(s) in the document from the current node that match the selection irrespective of its position. For example:

  • //table will select all the table elements no matter where they are in the document
  • //tr//td will select all the td elements that are descendants of a tr element
  • //a//img will select all the img elements that are descendants of an "a" (anchor) element

Double slash (//) defines a descendant relationship if used in the middle; for example, /html//title returns the title element that is descendant of the html element.

. (dot)

This represents the current node.

.. (double dot)

This will select the parent of the current node. For example, //table/.. will return the div element.

@

This represents an attribute. For example:

  • //@id: This will select all the id attributes, no matter where they are in the document. To get the elements that carry an id attribute, use //*[@id]
  • //img/@alt: This will select the alt attribute of all the img elements that define one
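As a rough way to try these path expressions without a browser, the following sketch evaluates them with the JDK's javax.xml.xpath engine. The class name PathExpressions and the trimmed, well-formed copy of the sample page (with the img tag closed) are assumptions for illustration only:

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class PathExpressions {
    // Assumption: a trimmed, well-formed copy of the sample page.
    static final String PAGE =
        "<html><head><title>My Book List</title></head><body><h1>My Book List</h1><div>"
        + "<table class='main-list'>"
        + "<tr><td>Title</td><td>Author</td><td>Price</td><td>Book Page</td></tr>"
        + "<tr id='book_1'><td>XML Developer's Guide</td><td>Gambardella, Matthew</td>"
        + "<td class='price'>44.95</td>"
        + "<td><a href='/book_1.html'><img src='/img/book1.png' alt='XML Developers Guide'/></a></td>"
        + "</tr></table></div></body></html>";

    static Object eval(String expression, QName type) {
        try {
            return XPathFactory.newInstance().newXPath()
                .evaluate(expression, new InputSource(new StringReader(PAGE)), type);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath: " + expression, e);
        }
    }

    // Number of nodes selected by the expression.
    static int count(String expression) {
        return ((NodeList) eval(expression, XPathConstants.NODESET)).getLength();
    }

    // String value of the expression (of the first node, for a node set).
    static String text(String expression) {
        return (String) eval(expression, XPathConstants.STRING);
    }

    public static void main(String[] args) {
        System.out.println(count("/html/body/div/table")); // 1: absolute path through div
        System.out.println(count("/html/body/table"));     // 0: table is not a child of body
        System.out.println(count("//tr//td"));             // 8: td descendants of tr
        System.out.println(count("//a//img"));             // 1
        System.out.println(text("name(//table/..)"));      // div: parent of table
        System.out.println(text("//img/@alt"));            // XML Developers Guide
    }
}
```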

How to do it...

Let's explore some basic XPath expressions that can be used in Selenium WebDriver. Selenium WebDriver provides the By.xpath() method to locate elements using XPath expressions.

Finding elements with an absolute path

XPath absolute paths refer to the exact location of the element, considering its complete hierarchy in the DOM. Here is an example where the Username Input field is located using the absolute path. In an absolute path, each element in the hierarchy is separated by a slash (/):

WebElement userName = driver.findElement(By.xpath("/html/body/div/div/form/input"));

However, this strategy has limitations as it depends on the structure or hierarchy of the elements on a page. If this changes, the locator will fail to get the element.

Finding elements with a relative path

With a relative path, we can locate an element directly irrespective of its location in the DOM. For example, we can locate the Username Input field in the following way, assuming it is the first <input> element in the DOM:

WebElement userName = driver.findElement(By.xpath("//input"));

Finding elements using predicates

A predicate is embedded in square brackets and is used to find out specific node(s) or a node that contains a specific value.

In the previous example, the XPath query returns the first <input> element that it finds in the DOM, although several elements may match the query. If the element we want is not the first match, we can locate it by its index. An index predicate counts among siblings of the same parent, so //input[2] matches an <input> that is the second <input> child of its parent; to pick the second match in the whole document, wrap the path in parentheses, as in (//input)[2]. For example, in our login form, where both fields share the same parent, we can locate the Password field, which is the second <input> element, in the following way:

WebElement password = driver.findElement(By.xpath("//input[2]"));
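The difference between absolute paths, relative paths, and index predicates can be sketched outside the browser with the JDK's javax.xml.xpath engine. The login page markup, the extra search inputs, and the class name LoginFormPaths below are hypothetical and exist only to make the behavior visible:

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class LoginFormPaths {
    // Hypothetical login page: two inputs in the form, two more in a paragraph.
    static final String PAGE =
        "<html><body><div><div><form>"
        + "<input id='username' type='text'/>"
        + "<input id='password' type='password'/>"
        + "</form></div></div>"
        + "<p><input id='q1' type='text'/><input id='q2' type='text'/></p>"
        + "</body></html>";

    static Object eval(String expression, QName type) {
        try {
            return XPathFactory.newInstance().newXPath()
                .evaluate(expression, new InputSource(new StringReader(PAGE)), type);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath: " + expression, e);
        }
    }

    static int count(String expression) {
        return ((NodeList) eval(expression, XPathConstants.NODESET)).getLength();
    }

    // id of the first element (in document order) selected by the expression.
    static String firstId(String expression) {
        return (String) eval(expression + "/@id", XPathConstants.STRING);
    }

    public static void main(String[] args) {
        System.out.println(firstId("/html/body/div/div/form/input")); // username
        System.out.println(firstId("//input"));                        // username
        System.out.println(firstId("//input[2]"));                     // password
        // //input[2] matches the second input of EACH parent: password and q2.
        System.out.println(count("//input[2]"));                       // 2
        // (//input)[2] matches only the second input in the whole document.
        System.out.println(count("(//input)[2]"));                     // 1
    }
}
```

A WebDriver findElement call returns only the first match, which is why //input[2] appears to work on simple pages even though it can match several elements.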

Finding elements using attribute values with XPath

We can find elements using their attribute values in XPath. In the following example, the Username field is identified using the ID attribute:

WebElement userName = driver.findElement(By.xpath("//input[@id='username']"));

Here is another example where the image is located using the alt attribute:

WebElement previousButton = driver.findElement(By.xpath("//img[@alt='Previous']"));

You might come across situations where one attribute is not sufficient to locate an element and you need to combine additional attributes for a precise match. In the following example, multiple attributes are used to locate the <input> element for the Login button:

WebElement loginButton = driver.findElement(By.xpath("//input[@type='submit'][@value='Login']"));

The same result can be achieved by using the XPath and operator:

WebElement loginButton = driver.findElement(By.xpath("//input[@type='submit' and @value='Login']"));

In the following example, the XPath or operator is used to locate an element that matches either of the attributes:

WebElement loginButton = driver.findElement(By.xpath("//input[@type='submit' or @value='Login']"));
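To see how chained predicates, and, and or differ in what they match, here is a minimal sketch that counts matches with the JDK's javax.xml.xpath engine. The form markup and the class name AttributePredicates are hypothetical:

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class AttributePredicates {
    // Hypothetical form with several buttons sharing type or value.
    static final String PAGE =
        "<form>"
        + "<input id='username' type='text'/>"
        + "<input id='password' type='password'/>"
        + "<input type='submit' value='Login'/>"
        + "<input type='submit' value='Cancel'/>"
        + "<input type='button' value='Login'/>"
        + "</form>";

    // Number of elements selected by the expression.
    static int count(String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                expression, new InputSource(new StringReader(PAGE)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count("//input[@id='username']"));                     // 1
        // Chained predicates and the and operator both require every condition.
        System.out.println(count("//input[@type='submit'][@value='Login']"));     // 1
        System.out.println(count("//input[@type='submit' and @value='Login']"));  // 1
        // The or operator accepts elements that satisfy either condition.
        System.out.println(count("//input[@type='submit' or @value='Login']"));   // 3
    }
}
```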

Finding elements using attributes with XPath

This strategy is a bit different from the earlier one: here we find elements based only on whether a specific attribute is defined for them, regardless of its value. For example, we want to look up all the <img> elements that have the alt attribute specified:

List<WebElement> imagesWithAlt = driver.findElements(By.xpath("//img[@alt]"));

Here's another example, where we search for all the <img> elements for which the alt attribute is not defined. We will use the not function to check the negative condition:

List<WebElement> imagesWithoutAlt = driver.findElements(By.xpath("//img[not(@alt)]"));
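The two queries partition the images on a page, which the following sketch illustrates with the JDK's javax.xml.xpath engine. The markup and the class name AttributePresence are assumptions for this example:

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class AttributePresence {
    // Hypothetical markup: two images with an alt attribute, one without.
    static final String PAGE =
        "<div>"
        + "<img src='a.png' alt='Previous'/>"
        + "<img src='b.png' alt='Next'/>"
        + "<img src='c.png'/>"
        + "</div>";

    // Number of elements selected by the expression.
    static int count(String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                expression, new InputSource(new StringReader(PAGE)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count("//img[@alt]"));            // 2: alt is defined
        System.out.println(count("//img[not(@alt)]"));       // 1: alt is missing
        System.out.println(count("//img[@alt='Previous']")); // 1: alt has this exact value
    }
}
```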

Performing partial match on attribute values

XPath also provides a way to find elements matching partial attribute values using XPath functions. This is very useful for testing applications where attribute values are assigned dynamically and change every time a page is requested. For example, ASP.NET applications exhibit this kind of behavior, where IDs are generated dynamically.

The following table explains the use of these XPath functions:

Syntax

Example

Description

starts-with()

input[starts-with(@id,'ctrl')]

Starting with:

For example, if the ID of an element is ctrl_12, this will find and return elements with ctrl at the beginning of the ID.

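ends-with()

input[ends-with(@id,'_userName')]

Ending with:

For example, if the ID of an element is a_1_userName, this will find and return elements with _userName at the end of the ID. Note that ends-with() is an XPath 2.0 function; browsers implement XPath 1.0, so this expression typically fails in WebDriver. A common XPath 1.0 workaround is input[substring(@id, string-length(@id) - string-length('_userName') + 1) = '_userName'].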

contains()

input[contains(@id,'userName')]

Containing:

For example, if the ID for an element is panel_login_userName_textfield, this will use the userName part in the middle to match and locate the element.
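The partial-match functions can be tried with the JDK's javax.xml.xpath engine, which implements XPath 1.0. Because ends-with() belongs to XPath 2.0, this sketch emulates it with substring() and string-length(). The element IDs reuse the examples from the table; the class name PartialMatch and the endsWith helper are hypothetical:

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class PartialMatch {
    // Inputs whose IDs follow the dynamic naming examples from the table.
    static final String PAGE =
        "<form>"
        + "<input id='ctrl_12'/>"
        + "<input id='a_1_userName'/>"
        + "<input id='panel_login_userName_textfield'/>"
        + "</form>";

    // Builds an XPath 1.0 predicate body equivalent to ends-with(@attr, suffix).
    static String endsWith(String attr, String suffix) {
        return String.format(
            "substring(@%1$s, string-length(@%1$s) - string-length('%2$s') + 1) = '%2$s'",
            attr, suffix);
    }

    // Number of elements selected by the expression.
    static int count(String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                expression, new InputSource(new StringReader(PAGE)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count("//input[starts-with(@id,'ctrl')]"));           // 1
        System.out.println(count("//input[contains(@id,'userName')]"));          // 2
        System.out.println(count("//input[" + endsWith("id", "_userName") + "]")); // 1
    }
}
```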

Matching any attribute using a value

XPath can match a specified value against any attribute of an element and return the element. For example, in the following XPath query, 'username' is specified. XPath will check all the <input> elements and their attributes to see if any attribute has this value, and return the matching element.

WebElement userName = driver.findElement(By.xpath("//input[@*='username']"));

Here are more examples of using XPath predicates to find elements using their position and contents:

Expression

Description

/table/tr[1]

This will select the first tr (row) element that is the child of the table element.

/table/tr[last()]

This will select the last tr (row) element that is the child of the table element.

/table/tr[last()-1]

This will select the second last tr (row) element that is the child of the table element.

/table/tr[position()>4]

This will select all the tr (row) elements after the fourth one that are children of the table element.

//tr[td>40]

This will select all the tr (rows) elements that have one of their children td with value greater than 40.
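The positional predicates above can be checked against a small table using the JDK's javax.xml.xpath engine. The six-row table, its row IDs, and the class name PositionPredicates are assumptions made for this sketch; the table is the document root so that the /table/... expressions apply as written:

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class PositionPredicates {
    // Hypothetical table with six rows, each holding one numeric cell.
    static final String PAGE =
        "<table>"
        + "<tr id='r1'><td>10</td></tr>"
        + "<tr id='r2'><td>20</td></tr>"
        + "<tr id='r3'><td>30</td></tr>"
        + "<tr id='r4'><td>44.95</td></tr>"
        + "<tr id='r5'><td>50</td></tr>"
        + "<tr id='r6'><td>60</td></tr>"
        + "</table>";

    static Object eval(String expression, QName type) {
        try {
            return XPathFactory.newInstance().newXPath()
                .evaluate(expression, new InputSource(new StringReader(PAGE)), type);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath: " + expression, e);
        }
    }

    static int count(String expression) {
        return ((NodeList) eval(expression, XPathConstants.NODESET)).getLength();
    }

    // id of the first row (in document order) selected by the expression.
    static String firstId(String expression) {
        return (String) eval(expression + "/@id", XPathConstants.STRING);
    }

    public static void main(String[] args) {
        System.out.println(firstId("/table/tr[1]"));          // r1
        System.out.println(firstId("/table/tr[last()]"));     // r6
        System.out.println(firstId("/table/tr[last()-1]"));   // r5
        System.out.println(count("/table/tr[position()>4]")); // 2: rows r5 and r6
        System.out.println(count("//tr[td>40]"));             // 3: rows r4, r5, and r6
    }
}
```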

Selecting unknown nodes

Apart from selecting the specific nodes, XPath also provides wildcards to select a group of elements:

Wildcard

Description

Example

*

Matches any element node.

  • /table/*: This will select all child elements of a table element
  • //*: This will select all elements in the document
  • //*[@class='price']: This will select any element in the document which has an attribute named class with a specified value, that is price

@*

Matches any attribute node.

  • //td[@*]: This will select all the td elements that have any attribute

node()

Matches any node of any kind.

  • //table/node(): This will select all the child nodes of table, including text and comment nodes, not only elements
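The following sketch contrasts the wildcards using the JDK's javax.xml.xpath engine. The markup deliberately keeps line breaks between rows so that node() also picks up whitespace text nodes; both the markup and the class name Wildcards are assumptions for illustration:

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class Wildcards {
    // Hypothetical table; the \n characters become text nodes under table.
    static final String PAGE =
        "<table>\n"
        + "<tr><td class='price'>44.95</td><td>plain</td></tr>\n"
        + "<tr><td id='x'>other</td></tr>\n"
        + "</table>";

    // Number of nodes selected by the expression.
    static int count(String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                expression, new InputSource(new StringReader(PAGE)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count("/table/*"));            // 2: only the tr elements
        System.out.println(count("//*"));                 // 6: table, 2 tr, 3 td
        System.out.println(count("//*[@class='price']")); // 1
        System.out.println(count("//td[@*]"));            // 2: td cells with any attribute
        System.out.println(count("//table/node()"));      // 5: 2 tr plus 3 whitespace text nodes
    }
}
```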

Selecting several paths

Using the union | operator in XPath expressions, we can select several paths together, as shown in the following table:

Path Expression

Action

//div/p | //div/span

This will select all the p (paragraph) and span elements that are children of a div element.

//p | //span

This will select all the p (paragraph) and span elements in the document.
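A short sketch of the union operator, evaluated with the JDK's javax.xml.xpath engine; the markup and the class name UnionPaths are hypothetical:

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class UnionPaths {
    // Hypothetical markup: one p and one span inside a div, one of each outside.
    static final String PAGE =
        "<body>"
        + "<div><p>a</p><span>b</span></div>"
        + "<p>c</p><span>d</span>"
        + "</body>";

    // Number of nodes selected by the expression.
    static int count(String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                expression, new InputSource(new StringReader(PAGE)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count("//div/p | //div/span")); // 2: only children of the div
        System.out.println(count("//p | //span"));         // 4: everywhere in the document
    }
}
```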

Locating elements with XPath axes

XPath axes help to find elements based on their relationship with other elements in a document. The following screenshot shows some examples of common XPath axes used to find elements from a <table> element. The same approach can be applied to any other element structure in your application.

[Screenshot: Locating elements with XPath axes]

The following image shows a graphical representation of the HTML elements:

[Image: Locating elements with XPath axes]

Axis

Description

Example

Result

ancestor

Selects all ancestors (parent, grandparent, and so on) of the current node.

//td[text()='Product 1']/ancestor::table

This will get the table element.

descendant

Selects all descendants (children, grandchildren, and so on) of the current node.

/table/descendant::td/input

This will get the input element from the third column of the second row from the table.

following

Selects everything in the document after the closing tag of the current node.

//td[text()='Product 1']/following::tr

This will get the second row from the table.

following-sibling

Selects all siblings after the current node.

//td[text()='Product 1']/following-sibling::td

This will get the second column from the second row immediately after the column that has Product 1 as the text value.

preceding

Selects all nodes that appear before the current node in the document, except ancestors, attribute nodes, and namespace nodes.

//td[text()='$150']/preceding::tr

This will get the header row.

preceding-sibling

Selects all siblings before the current node.

//td[text()='$150']/preceding-sibling::td

This will get the first column of third row from the table.
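The axes can be explored with the JDK's javax.xml.xpath engine. Since the table from the screenshot is not available here, the product table below is a hypothetical stand-in with a header row and two product rows, so its results may differ from those in the Result column; the class name AxesDemo is also an assumption:

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class AxesDemo {
    // Hypothetical product table: a header row followed by two product rows.
    static final String PAGE =
        "<table>"
        + "<tr><td>Product</td><td>Price</td><td>Select</td></tr>"
        + "<tr><td>Product 1</td><td>$100</td><td><input id='chk1' type='checkbox'/></td></tr>"
        + "<tr><td>Product 2</td><td>$150</td><td><input id='chk2' type='checkbox'/></td></tr>"
        + "</table>";

    static Object eval(String expression, QName type) {
        try {
            return XPathFactory.newInstance().newXPath()
                .evaluate(expression, new InputSource(new StringReader(PAGE)), type);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath: " + expression, e);
        }
    }

    static int count(String expression) {
        return ((NodeList) eval(expression, XPathConstants.NODESET)).getLength();
    }

    // String value of the first node (in document order) selected by the expression.
    static String text(String expression) {
        return (String) eval(expression, XPathConstants.STRING);
    }

    public static void main(String[] args) {
        System.out.println(text("name(//td[text()='Product 1']/ancestor::table)")); // table
        System.out.println(count("/table/descendant::td/input"));                   // 2
        System.out.println(count("//td[text()='Product 1']/following::tr"));        // 1
        System.out.println(text("//td[text()='Product 1']/following-sibling::td")); // $100
        System.out.println(count("//td[text()='$150']/preceding::tr"));             // 2
        System.out.println(text("//td[text()='$150']/preceding::tr"));  // ProductPriceSelect
        System.out.println(text("//td[text()='$150']/preceding-sibling::td"));      // Product 2
    }
}
```

Note that preceding::tr matches both earlier rows; a findElement call would return only the first of them in document order, which here is the header row.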

You can find more about XPath axes at http://www.w3schools.com/xpath/xpath_axes.asp.

How it works...

XPath is a powerful language to query and process DOM trees in browsers. XPath is used to navigate through elements and attributes in a DOM tree. XPath provides various rules, functions, operators, and syntax to find the elements.

The majority of browsers support XPath, and Selenium WebDriver provides the ability to find elements using the XPath language.

Using the xpath() method of the By class, we can locate elements using XPath syntax.
