The intuition of hyperparameter tuning

To gain a practical intuition of the need for hyperparameter tuning, let's work through the following two scenarios for predicting the accuracy of a given neural network architecture on the MNIST dataset:

  • Scenario 1: A high learning rate (0.1) and a low number of training steps (200)
  • Scenario 2: A low learning rate (0.01) and a high number of training steps (2000)

Let us create the train and test datasets in a Google Cloud environment, as follows:

  1. Download the dataset:
mkdir data
curl -O https://s3.amazonaws.com/img-datasets/mnist.pkl.gz
gzip -d mnist.pkl.gz
mv mnist.pkl data/

The preceding commands create a new folder named data, download the compressed MNIST dataset, decompress it, and move the resulting mnist.pkl file into the data folder.

  2. Open Python in the terminal and import the required packages:
from __future__ import print_function 
import tensorflow as tf
import pickle # for handling the new data source
import numpy as np
from datetime import datetime # for filename conventions
from tensorflow.python.lib.io import file_io # for better file I/O
import sys
  3. Import the MNIST dataset:
f = file_io.FileIO('data/mnist.pkl', mode='rb') # binary mode, so that pickle can read the raw bytes
data = pickle.load(f) # on Python 3, pickle.load(f, encoding='latin1') may be needed for this file
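
Note that if the S3 download is unavailable, an equivalent train/test split can be obtained through the tf.keras datasets module; this is an optional alternative, not the route used above:

# Alternative: download MNIST via tf.keras (fetches mnist.npz on first use)
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()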
  4. Extract the train and test datasets:
(x_train, y_train), (x_test, y_test) = data
# Converting the data from a 28 x 28 shape to 784 columns
x_train = x_train.reshape(60000, 784)
x_train = x_train.astype('float32')
# Scaling the train dataset
x_train /= 255
# Reshaping the test dataset
x_test = x_test.reshape(10000, 784)
x_test = x_test.astype('float32')
# Scaling the test dataset
x_test /= 255
# Specifying the type of labels
y_train = y_train.astype(np.int32)
y_test = y_test.astype(np.int32)
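
As a quick sanity check before building the model, we can confirm that the arrays now have the expected shapes and types; this optional snippet only prints diagnostics:

# Expected: (60000, 784) float32, (60000,) int32, (10000, 784) float32, (10000,) int32
print(x_train.shape, x_train.dtype)
print(y_train.shape, y_train.dtype)
print(x_test.shape, x_test.dtype)
print(y_test.shape, y_test.dtype)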
  5. Create the estimator input functions:
# Creating the estimator input functions for the train and test datasets
train_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"x2": np.array(x_train)},
    y=np.array(y_train),
    num_epochs=None,
    batch_size=1024,
    shuffle=True)
test_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"x2": np.array(x_test)},
    y=np.array(y_test),
    num_epochs=1,
    shuffle=False)
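
Here, num_epochs=None makes the train input function cycle through the data indefinitely (the length of training is then controlled by the steps argument to model.train), while num_epochs=1 gives the test input function exactly one pass over the data. For reference, a roughly equivalent train input function can be written with the tf.data API; this is a sketch, not the code used in this section:

def train_input_fn_tfdata():
    # Mirror numpy_input_fn above: shuffle, repeat indefinitely, batch by 1024
    dataset = tf.data.Dataset.from_tensor_slices(({"x2": x_train}, y_train))
    return dataset.shuffle(buffer_size=60000).repeat().batch(1024)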
  6. Specify the type of column:
feature_x = tf.feature_column.numeric_column("x2", shape=(784,))
feature_columns = [feature_x]
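
The "x2" key ties the numeric column to the dictionaries produced by the input functions, and shape=(784,) tells the estimator to treat each row as one dense 784-dimensional vector. To inspect what the model actually receives, tf.feature_column.input_layer can be applied to a small batch; a minimal sketch (the sample name is illustrative):

sample_batch = {"x2": tf.constant(x_train[:2])}
dense_inputs = tf.feature_column.input_layer(sample_batch, feature_columns)
with tf.Session() as sess:
    print(sess.run(dense_inputs).shape) # (2, 784)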
  7. Build a DNN classifier using the parameters in scenario 1; that is, the learning rate is 0.1 and the number of steps is 200:
num_hidden_units = [1000]
lr = 0.1
num_steps = 200
# Building the estimator using the DNN classifier
# This is where the learning rate hyperparameter is passed
model = tf.estimator.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=num_hidden_units,
    activation_fn=tf.nn.relu,
    n_classes=10,
    optimizer=tf.train.AdagradOptimizer(learning_rate=lr))
model.train(input_fn=train_input_fn, steps=num_steps)
# Fetching the model results
result = model.evaluate(input_fn=test_input_fn)
print('Test loss:', result['average_loss'])
print('Test accuracy:', result['accuracy'])

The test accuracy in this scenario comes out to be 96.49%.

In scenario 2, we will build another DNN classifier using different parameters; now the learning rate is 0.01 and the number of steps is 2000:

num_hidden_units = [1000]
lr = 0.01
num_steps = 2000
# Building the estimator using the DNN classifier
# This is where the learning rate hyperparameter is passed
model = tf.estimator.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=num_hidden_units,
    activation_fn=tf.nn.relu,
    n_classes=10,
    optimizer=tf.train.AdagradOptimizer(learning_rate=lr))
model.train(input_fn=train_input_fn, steps=num_steps)
# Fetching the model results
result = model.evaluate(input_fn=test_input_fn)
print('Test loss:', result['average_loss'])
print('Test accuracy:', result['accuracy'])

The accuracy on the test dataset in scenario 2 is nearly 98.2%.

The preceding two scenarios show how different values of the same hyperparameters can significantly change the final result.
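
These two scenarios are just two points in a much larger search space. A naive way to explore that space is a manual grid search over candidate values, along the lines of the following sketch (the value grids here are illustrative, not taken from this section):

for lr in [0.1, 0.05, 0.01]:
    for num_steps in [200, 1000, 2000]:
        # Build a fresh classifier for each hyperparameter combination
        model = tf.estimator.DNNClassifier(
            feature_columns=feature_columns,
            hidden_units=[1000],
            activation_fn=tf.nn.relu,
            n_classes=10,
            optimizer=tf.train.AdagradOptimizer(learning_rate=lr))
        model.train(input_fn=train_input_fn, steps=num_steps)
        result = model.evaluate(input_fn=test_input_fn)
        print('lr=%s, steps=%s, accuracy=%s' % (lr, num_steps, result['accuracy']))

Such a grid quickly becomes expensive as more hyperparameters and candidate values are added.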

This is where Google Cloud ML Engine comes in handy, as it lets us search for an optimal set of hyperparameters far more intelligently than an exhaustive grid.
