Scraping Using pyquery – a Python Library

Starting with this chapter, we will explore scraping-related tools and techniques and deploy some scraping code. So far, we have covered web exploration, Python libraries, element identification, and traversal.

Web scraping is often a long and challenging process that requires an understanding of how the target website works. A basic ability to identify the backends or tools used to build a website helps with any scraping task; this is related to a process known as reverse engineering. For more information on such tools, please refer to Chapter 3, Using LXML, XPath, and CSS Selectors, and the Using web browser developer tools for accessing web content section. Scraping also requires tools for traversing and manipulating elements such as HTML tags, and pyquery is one of them.

In the previous chapters, we explored XPath, CSS Selectors, and LXML. In this chapter, we will look into pyquery, which offers a jQuery-like syntax for selecting and manipulating elements that can make web scraping code more concise and easier to work with.

In this chapter, you will learn about the following topics:

  • Introduction to pyquery
  • Exploring pyquery (major methods and attributes)
  • Using pyquery for web scraping