Web scraping using lxml

In this section, we will apply most of the techniques and concepts covered in the chapters so far to implement some scraping tasks.

For the task ahead, we first select the required URLs. The site is http://books.toscrape.com/, and we target its music category at http://books.toscrape.com/catalogue/category/books/music_14/index.html. With the target URL chosen, it's time to explore the web page and identify the content that we want to extract.

We want to collect the title, price, availability, imageUrl, and rating of each individual item (that is, each <article> element) listed on the page. We will try different techniques using lxml and XPath to scrape data from single and multiple pages, and we will also use CSS selectors.
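
As a rough illustration of the kind of extraction described above, the following minimal sketch pulls the five fields out of each item with lxml and XPath. The SAMPLE markup, the class names (product_pod, star-rating, price_color, availability), and the parse_items helper are assumptions made for this example; they should be checked against the real page in DevTools before being relied on.

```python
from lxml import html

# Hypothetical markup, assumed to resemble one item on the listing page.
SAMPLE = """
<html><body>
<article class="product_pod">
  <div class="image_container">
    <a href="a-book_1/index.html"><img src="../media/a.jpg" alt="A Book" class="thumbnail"></a>
  </div>
  <p class="star-rating Three"></p>
  <h3><a href="a-book_1/index.html" title="A Book: The Full Title">A Book: The ...</a></h3>
  <div class="product_price">
    <p class="price_color">£51.77</p>
    <p class="instock availability"><i class="icon-ok"></i>
        In stock
    </p>
  </div>
</article>
</body></html>
"""


def parse_items(page_source, base_url=None):
    """Return one dict per <article> item found in page_source (hypothetical helper)."""
    tree = html.fromstring(page_source)
    if base_url:
        # Turn relative href/src values into absolute URLs.
        tree.make_links_absolute(base_url)

    items = []
    for article in tree.xpath('//article[contains(@class, "product_pod")]'):
        # The rating is assumed to be encoded as a second class name, e.g. "star-rating Three".
        rating_class = article.xpath('string(.//p[contains(@class, "star-rating")]/@class)')
        items.append({
            "title": article.xpath('string(.//h3/a/@title)'),
            "price": article.xpath('string(.//p[contains(@class, "price_color")])').strip(),
            "availability": article.xpath('normalize-space(.//p[contains(@class, "availability")])'),
            "imageUrl": article.xpath('string(.//img/@src)'),
            "rating": rating_class.replace("star-rating", "").strip(),
        })
    return items


if __name__ == "__main__":
    for item in parse_items(SAMPLE):
        print(item)
```

To try this against the live category page, the page source would first have to be downloaded (for example with the standard library's urllib.request) and passed to parse_items together with the page URL as base_url, so that the image paths come back as absolute URLs.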

For element identification with XPath, CSS selectors, and DevTools, please refer to the Using web browser developer tools for accessing web content section.
