Creating an item

Continuing with the scraping task, the project folder contains a file named items.py, which defines the Python class QuotesItem. Scrapy generates this file automatically when the scrapy startproject Quotes command is issued. The QuotesItem class inherits from scrapy.Item, and its fields are declared using scrapy.Field. An Item (here, QuotesItem) is a container for collected values. The fields declared in the following code, including quote, tags, and so on, act as the keys to the values that we will obtain using the parse() function. Values for the same fields are extracted and collected across all the pages found.

The item is accessed like a Python dictionary, with the declared fields as keys and the extracted data as values. Declaring the fields in the item and using them in the Spider is effective, but using items.py is not compulsory. The fields are declared as shown in the following example:

import scrapy

class QuotesItem(scrapy.Item):
    # define the fields for your item here like:
    # name = scrapy.Field()

    quote = scrapy.Field()
    tags = scrapy.Field()
    author = scrapy.Field()
    author_link = scrapy.Field()

We need to import QuotesItem when the item is required inside the Spider, as seen in the following code, and then create an object and assign values to the declared fields, that is, quote, tags, author, and so on:

#inside Spider 'quotes.py'
from Quotes.items import QuotesItem
....
#inside parse()
item = QuotesItem() #create an object 'item' and access the fields declared.

item['quote'] = .......
item['tags'] = .......
item['author'] = ......
item['author_link'] = ......
......

In this section, we declared the item fields for the data that we want to retrieve from a website. In the upcoming section, we will explore different methods of data extraction and link them to the item fields.
