Parsing the DOM to extract pricing data

The DOM (Document Object Model) is the collection of elements that make up a web page. It includes HTML tags such as body and div, as well as the classes and IDs embedded within these tags.

Let's take a look at the DOM for our Google page:

  1. To see it, right-click on the page and click on Inspect; this works the same way in Firefox and Chrome. Doing so opens the developer tools panel, which shows the page source, as demonstrated in the following screenshot:

  2. Once the panel is open, choose the element selector in the upper left-hand corner, then click on an element to jump to it in the page source code:

  3. The element that we are concerned with is the box that contains the flight information. This can be seen in the following screenshot:

If you look closely at the element, you will notice that it is a div. This div has an attribute called class. The class value contains a long string of seemingly random numbers and letters, but you will also notice that it contains the string info-container. We can use this information to retrieve all the div elements that have flight information for each city. We'll do that in a minute, but for now, let's discuss the parsing process.
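
As a rough preview of that retrieval step, the following is a minimal sketch that collects the text inside every div whose class contains info-container. It uses only Python's standard-library html.parser, which is not necessarily the approach taken later in this chapter. The sample markup, the random-looking class prefixes, the city names, and the prices are all invented for illustration, and InfoContainerParser and extract_info_containers are hypothetical helper names.

```python
from html.parser import HTMLParser

# Hypothetical markup standing in for the real page. The class prefixes,
# cities, and prices are made up for this example.
SAMPLE_HTML = """
<body>
  <div class="xY12ab info-container"><span>London</span><span>$450</span></div>
  <div class="header">Explore destinations</div>
  <div class="Qr98zt info-container"><span>Paris</span><span>$512</span></div>
</body>
"""


class InfoContainerParser(HTMLParser):
    """Collect the text inside every div whose class contains 'info-container'."""

    def __init__(self):
        super().__init__()
        self.results = []  # one list of text fragments per matching div
        self._depth = 0    # div nesting depth inside the current match

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self._depth:
            # A div nested inside a matching div: track it so we know
            # when the outer div really closes.
            self._depth += 1
        elif "info-container" in (dict(attrs).get("class") or ""):
            self._depth = 1
            self.results.append([])

    def handle_endtag(self, tag):
        if tag == "div" and self._depth:
            self._depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self._depth and text:
            self.results[-1].append(text)


def extract_info_containers(html):
    """Return a list of text-fragment lists, one per matching div."""
    parser = InfoContainerParser()
    parser.feed(html)
    return parser.results


flights = extract_info_containers(SAMPLE_HTML)
print(flights)
```

Matching on the substring info-container rather than on the full class value is deliberate: the random-looking part of the class cannot be relied on, while the info-container part is the stable piece identified above.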
