MNIST – getting data

The MNIST dataset contains 60,000 images of handwritten digits from 0 to 9 for training, and 10,000 images for testing. The PyTorch torchvision library provides an MNIST dataset class, which downloads the data and provides it in a readily usable format. Let's use datasets.MNIST to pull the dataset to our local machine, and then wrap it in a DataLoader. We will use torchvision transforms to convert the data into PyTorch tensors and normalize it. The following code takes care of downloading the data, normalizing it, and wrapping it in a DataLoader:

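import torch
from torchvision import datasets, transforms

# Convert images to tensors in [0, 1], then normalize with the
# MNIST training-set mean (0.1307) and standard deviation (0.3081).
transformation = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,)),
])

# Download the data to data/ on first use.
train_dataset = datasets.MNIST('data/', train=True,
                               transform=transformation, download=True)
test_dataset = datasets.MNIST('data/', train=False,
                              transform=transformation, download=True)

# Serve both datasets in shuffled batches of 32.
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=32,
                                           shuffle=True)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=32,
                                          shuffle=True)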

The previous code gives us one DataLoader for the training dataset and one for the test dataset. Let's visualize a few images to get an understanding of what we are dealing with. The following code will help us in visualizing the MNIST images:

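import matplotlib.pyplot as plt

def plot_img(image):
    # Drop the channel dimension: (1, 28, 28) -> (28, 28).
    image = image.numpy()[0]
    mean = 0.1307
    std = 0.3081
    # Undo the normalization: x = z * std + mean.
    image = (image * std) + mean
    plt.imshow(image, cmap='gray')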
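transforms.Normalize maps each pixel value x to (x - mean) / std, so recovering displayable pixel values means multiplying by the standard deviation and then adding the mean. The following is a minimal sketch of that arithmetic in plain Python, using the same constants as the Normalize call; the helper names normalize and denormalize are hypothetical and are not part of torchvision:

```python
MEAN = 0.1307  # mean passed to transforms.Normalize
STD = 0.3081   # standard deviation passed to transforms.Normalize


def normalize(pixel):
    """Apply the same arithmetic as transforms.Normalize to one value."""
    return (pixel - MEAN) / STD


def denormalize(value):
    """Invert normalize: scale by the std, then shift by the mean."""
    return value * STD + MEAN


# Round-trip a black, a mid-gray, and a white pixel.
for pixel in (0.0, 0.5, 1.0):
    z = normalize(pixel)
    print(f"{pixel:.1f} -> {z:+.4f} -> {denormalize(z):.1f}")
```

Each round trip should return the original pixel value, which is what plot_img relies on to show the digits in their original intensity range.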

Now we can call the plot_img function to visualize our dataset. We will pull a batch of records from the DataLoader using the following code, and plot the images:

sample_data = next(iter(train_loader))
plot_img(sample_data[0][1])
plot_img(sample_data[0][2])
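It may help to see what a batch looks like before indexing into it. The sketch below is an illustration under stated assumptions rather than the book's code: it replaces MNIST with random tensors of the same shape (wrapped in a TensorDataset, so nothing is downloaded) and prints the shapes a DataLoader with batch_size=32 returns:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for MNIST (an assumption for illustration):
# 100 random 1x28x28 "images" with random labels in 0..9.
images = torch.rand(100, 1, 28, 28)
labels = torch.randint(0, 10, (100,))
fake_dataset = TensorDataset(images, labels)

loader = DataLoader(fake_dataset, batch_size=32, shuffle=True)

# A batch is a pair: element 0 holds the images, element 1 the labels.
batch = next(iter(loader))
batch_images, batch_labels = batch

print(batch_images.shape)     # batch of images: (32, 1, 28, 28)
print(batch_labels.shape)     # one label per image: (32,)
print(batch_images[1].shape)  # a single image: (1, 28, 28)
```

With the real train_loader, sample_data[0] is therefore the image tensor for the batch, and indices 1 and 2 in the calls above pick its second and third images.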

The images are visualized as follows:
