How it works...

Let's walk through this code piece by piece to understand everything we need to do in order to work with this data. As always, we start with the boilerplate items for coding in Python. The first line establishes which Python interpreter we should use, and the imports that follow bring in NumPy, Matplotlib, and the MNIST helper from the TensorFlow examples:

#!/usr/bin/env python

import numpy as np
import matplotlib.pyplot as plt
from tensorflow.examples.tutorials.mnist import input_data

Next, we grab the data from the TensorFlow examples library:

# Read from this directory
mnist = input_data.read_data_sets("MNIST_data/", one_hot=False)

There's an option for one-hot encoding in this import function. For this particular example, we simply leave one_hot set to False, so each label is a single integer from 0 to 9.

One-hot encoding: Categorical variables need a simple way to encode their values into a numerical space. One-hot encoding is the process of mapping each category to an integer and then to a binary vector that is all zeros except for a single 1 at that integer's index. One-hot encoding is simple in Python, as libraries such as scikit-learn provide one-line methods for encoding these variable types.
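To make the two-step mapping concrete, here is a minimal sketch in plain Python. The helper name one_hot and the sample labels are hypothetical and are not part of the recipe's code; the sketch assumes the labels are already integers in the range 0 to num_classes - 1, as the MNIST digits are.

```python
def one_hot(labels, num_classes):
    """Map integer class labels to one-hot binary vectors (hypothetical helper)."""
    encoded = []
    for label in labels:
        if not 0 <= label < num_classes:
            raise ValueError(f"label {label} is outside 0..{num_classes - 1}")
        vector = [0] * num_classes  # start with all zeros
        vector[label] = 1           # set a single 1 at the label's index
        encoded.append(vector)
    return encoded


# Made-up digit labels, shaped like the ones returned with one_hot=False
sample_labels = [3, 0, 9]
sample_encoded = one_hot(sample_labels, num_classes=10)

for label, vector in zip(sample_labels, sample_encoded):
    print(label, vector)
```

Each label becomes a vector of length 10, which is why the label array gains a second dimension when one-hot encoding is switched on.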

For every dataset, I recommend checking the shape of the data to ensure that you've got the data you expect:

# Look at the shape of the images training data:
print("Shape of the Image Training Data is "+str(mnist.train.images.shape))

# Look at the shape of the labels training data:
print("Shape of the Label Training Data is "+str(mnist.train.labels.shape))

This is the output you should see when running this snippet:

Shape of the Image Training Data is (55000, 784)
Shape of the Label Training Data is (55000,)    # (if One-Hot False)
Shape of the Label Training Data is (55000, 10) # (if One-Hot True)
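If you would rather have the script fail loudly than rely on reading the printed shapes, the same check can be expressed in code. The following is only a sketch: check_mnist_shapes is a hypothetical helper, not part of the recipe or of TensorFlow, and it takes plain shape tuples (such as mnist.train.images.shape) so that it can be tried without downloading the data.

```python
def check_mnist_shapes(images_shape, labels_shape, one_hot=False,
                       pixels=784, num_classes=10):
    """Raise ValueError if the shapes differ from what we expect (hypothetical helper)."""
    num_examples = images_shape[0]

    # Every image should be a flat vector of `pixels` values
    if images_shape != (num_examples, pixels):
        raise ValueError(f"unexpected image shape: {images_shape}")

    # Labels are one integer per example, or one vector per example if one-hot
    expected_labels = (num_examples, num_classes) if one_hot else (num_examples,)
    if labels_shape != expected_labels:
        raise ValueError(f"unexpected label shape: {labels_shape}")

    return num_examples


# Shapes taken from the output shown above
count = check_mnist_shapes((55000, 784), (55000,), one_hot=False)
print("Checked", count, "training examples")
```

In the recipe itself, you would pass mnist.train.images.shape and mnist.train.labels.shape in place of the literal tuples.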

Next, let's have a look at a sample image from the dataset:

# Take a random example from the dataset:
index = np.random.choice(mnist.train.images.shape[0], 1)
random_image = mnist.train.images[index]
random_label = mnist.train.labels[index]
# Reshape the flat 784-pixel vector into a 28 x 28 image
random_image = random_image.reshape([28, 28])

# Plot the Image
plt.gray()
plt.imshow(random_image)
plt.show()

This is fun because we actually get to see the content of the data. You'll notice that we needed to reshape the array from a flat vector of 784 values to a 28 x 28 image. TensorFlow keeps images in different formats depending on the technique to be used, and here each image arrives flattened. This is why I keep saying we should visualize the data to ensure we understand the information we are working with. We've now got a full setup to run the Python code. Let's use our development environment to run this file.
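The reshape step is worth a closer look. NumPy's reshape uses row-major order by default, so the first 28 values of the flat vector become the first row of the image, the next 28 become the second row, and so on. The sketch below spells that out in plain Python with a synthetic vector; to_grid and the made-up pixel values are illustrative assumptions, not part of the recipe.

```python
SIDE = 28  # MNIST images are 28 x 28 pixels


def to_grid(flat_pixels, side=SIDE):
    """Split a flat pixel list into `side` rows of `side` values (row-major)."""
    if len(flat_pixels) != side * side:
        raise ValueError(f"expected {side * side} values, got {len(flat_pixels)}")
    return [flat_pixels[r * side:(r + 1) * side] for r in range(side)]


# Synthetic stand-in for one flattened image: the value equals its flat index
flat = [float(i) for i in range(SIDE * SIDE)]
grid = to_grid(flat)

# The pixel at (row, col) in the image sits at index row * 28 + col in the vector
row, col = 5, 7
print(len(grid), len(grid[0]))
print(grid[row][col] == flat[row * SIDE + col])
```

Because the mapping is a fixed index calculation, reshaping does not change any pixel values; it only changes how they are addressed.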

In the repository, there will always be a build script and a run script. The build script, as you remember, builds the image from the Dockerfile, and the run script starts the environment. With the current setup, you need to rebuild for new changes to take effect. With the Docker run script, it's also possible to map a folder on your computer into the Docker container.
