Key web scraping tasks

Pulling the required data out of a semistructured document involves a few basic tasks:

  • Searching a semistructured document: Accessing a particular element or a specific type of element in a document can be accomplished using its tag name and tag attributes, such as id, class, and so on.
  • Navigating within a semistructured document: We can move through a web document to pull different types of data in four ways: navigating down, navigating sideways, navigating up, and navigating back and forth. Each of these is covered in detail later in this chapter.
  • Modifying a semistructured document: By changing an element's tag name or attributes, we can reshape the document so that the required data is easier to extract.
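
The three tasks above can be sketched in a few lines of Python. This is only an illustration, and it rests on assumptions: it uses the standard library's xml.etree.ElementTree as a stand-in parser rather than any particular scraping library, and the markup, the id and class values, and the variable names are all hypothetical. ElementTree also requires well-formed markup, which real web pages often are not.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed snippet standing in for a scraped page.
DOC = """
<html>
  <body>
    <div id="products">
      <p class="item">Laptop</p>
      <p class="item">Phone</p>
      <p class="note">Prices exclude tax</p>
    </div>
  </body>
</html>
""".strip()

root = ET.fromstring(DOC)

# Searching: locate elements by tag name and by attributes such as id and class.
products = root.find(".//div[@id='products']")
items = [p.text for p in root.iter("p") if p.get("class") == "item"]

# Navigating down: the direct children of an element.
children = list(products)
first = children[0]

# Navigating sideways: the next sibling of the first child.
second = children[children.index(first) + 1]

# Navigating up: ElementTree keeps no parent pointers, so build a child-to-parent map.
parent_of = {child: parent for parent in root.iter() for child in parent}

# Modifying: rename a tag and change one of its attributes.
note = products.find("p[@class='note']")
note.tag = "span"
note.set("class", "footnote")

print(items)
print(second.text)
print(ET.tostring(note, encoding="unicode").strip())
```

A dedicated scraping library typically offers direct parent and sibling accessors and tolerates malformed HTML; the parent map here is a workaround specific to ElementTree.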