Downloading IMDB data and performing text tokenization

For computer vision applications, we used the torchvision library, which provides many utility functions that help in building such applications. Similarly, torchtext is a library built to work with PyTorch that simplifies many tasks in natural language processing (NLP) by providing data loaders and abstractions for text. At the time of writing, torchtext is not included in the PyTorch installation and must be installed separately. You can run the following command on your machine to install torchtext:

pip install torchtext
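
To confirm that the installation worked without importing the whole library, one option is to ask Python whether the package can be found. The following is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical and not part of torchtext or PyTorch.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be located on the import path.

    Hypothetical helper for illustration; it only looks the package up
    and does not import it.
    """
    return importlib.util.find_spec(package_name) is not None


if is_installed("torchtext"):
    print("torchtext is available")
else:
    print("torchtext not found; run: pip install torchtext")
```

The check only looks at top-level package names, so it says nothing about which version is installed.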

Once it is installed, we can start using it. torchtext provides two important modules: torchtext.data and torchtext.datasets.
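
Before turning to those modules, it may help to see what tokenization and vocabulary building amount to. The following is a plain-Python sketch of the idea, not torchtext's API: the function names `tokenize`, `build_vocab`, and `numericalize`, the `<unk>` and `<pad>` entries, and the two sample reviews are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return runs of letters, digits, and apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_vocab(token_lists, min_freq=1):
    """Map each token seen at least min_freq times to an integer index.

    Indices 0 and 1 are reserved for unknown and padding tokens
    (an assumed convention for this sketch).
    """
    counts = Counter(tok for toks in token_lists for tok in toks)
    vocab = {"<unk>": 0, "<pad>": 1}
    for tok, freq in counts.most_common():
        if freq >= min_freq:
            vocab[tok] = len(vocab)
    return vocab


def numericalize(tokens, vocab):
    """Replace each token with its index, falling back to <unk>."""
    return [vocab.get(tok, vocab["<unk>"]) for tok in tokens]


# Made-up sample reviews, used only to exercise the functions.
reviews = ["This movie was great!", "This movie was terrible..."]
tokenized = [tokenize(r) for r in reviews]
vocab = build_vocab(tokenized)
ids = numericalize(tokenize("great movie, truly"), vocab)
```

Here "truly" never appears in the sample reviews, so it maps to the `<unk>` index, which is how unseen words are typically handled at inference time.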

We can download the IMDB Movies dataset from the following link:
https://www.kaggle.com/orgesleka/imdbmovies
