In the previous section, we built a very simple neural network with just an input and output layer. This simple neural network gave us an accuracy of 86%. Let's see if we can improve this accuracy further by building a neural network that is a little deeper than the previous version:
- Let's do this on a new notebook. Loading the dataset and data pre-processing will be the same as in the previous section:
import numpy as np
np.random.seed(42)
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD
# loading and pre-processing data
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = X_train.reshape(60000, 784).astype('float32')
X_test = X_test.reshape(10000, 784).astype('float32')
X_train /= 255
X_test /= 255
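One pre-processing step from the previous section is worth restating: because we will compile with the categorical_crossentropy loss later on, the integer labels must be one-hot encoded. If your new notebook does not already carry this step over, a minimal sketch using Keras's to_categorical utility looks like this:

from keras.utils import to_categorical

# convert integer class labels (0-9) into one-hot vectors of length 10
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)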
- The design of the neural network is slightly different from the previous version. We will add a hidden layer with 64 neurons to the network, along with the input and output layers:
model = Sequential()
model.add(Dense(64, activation='relu', input_shape=(784,)))
model.add(Dense(64, activation='relu'))
model.add(Dense(10, activation='softmax'))
- Also, we will use the relu activation function for the input and hidden layers instead of the sigmoid function we used previously.
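For reference, relu simply zeroes out negative pre-activations and passes positive values through unchanged, which avoids the saturation that slows learning with sigmoid. A minimal NumPy illustration (not part of the model code):

import numpy as np

def relu(x):
    # ReLU: max(0, x) applied element-wise
    return np.maximum(0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5])))   # [0.  0.  0.  1.5]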
- We can inspect the model design and architecture as follows:
model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_1 (Dense)              (None, 64)                50240
_________________________________________________________________
dense_2 (Dense)              (None, 64)                4160
_________________________________________________________________
dense_3 (Dense)              (None, 10)                650
=================================================================
Total params: 55,050
Trainable params: 55,050
Non-trainable params: 0
_________________________________________________________________
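The parameter counts follow directly from the layer sizes: a Dense layer with n inputs and m units has n × m weights plus m biases, giving 784 × 64 + 64 = 50,240 for the first layer, 64 × 64 + 64 = 4,160 for the hidden layer, and 64 × 10 + 10 = 650 for the output layer.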
- Next, we will configure the model to use the categorical_crossentropy cost function rather than MSE. Also, the learning rate is increased from 0.01 to 0.1:
model.compile(loss='categorical_crossentropy',
              optimizer=SGD(lr=0.1),
              metrics=['accuracy'])
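Note that in more recent Keras releases, the SGD argument is named learning_rate rather than lr; the older spelling used here matches the Keras version these examples were written against.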
- Now, we will train the model, like we did in the previous examples:
model.fit(X_train, y_train, batch_size=128, epochs=200, verbose=1,
          validation_data=(X_test, y_test))
- The model trains on 60,000 samples and validates on 10,000 samples:
Epoch 1/200
60000/60000 [==============================] - 1s - loss: 0.4785 - acc: 0.8642 - val_loss: 0.2507 - val_acc: 0.9255
Epoch 2/200
60000/60000 [==============================] - 1s - loss: 0.2245 - acc: 0.9354 - val_loss: 0.1930 - val_acc: 0.9436
...
60000/60000 [==============================] - 1s - loss: 4.8932e-04 - acc: 1.0000 - val_loss: 0.1241 - val_acc: 0.9774
<keras.callbacks.History at 0x7f3096adadd8>
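To read off the final test-set numbers programmatically rather than from the log, model.evaluate returns the loss followed by any compiled metrics:

score = model.evaluate(X_test, y_test, verbose=0)
print('Test loss:', score[0])       # should be close to the final val_loss
print('Test accuracy:', score[1])   # should be close to the final val_acc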
As you can see, the validation accuracy has risen to roughly 97.7%, a clear improvement over the 86% achieved by the model we built in the first version.
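As an aside, model.fit returns the History object printed above; capturing it in a variable lets you plot the learning curves. A minimal sketch, assuming matplotlib is installed (older Keras versions log accuracy under the keys acc/val_acc, as in the output above, while newer releases use accuracy/val_accuracy):

import matplotlib.pyplot as plt

history = model.fit(X_train, y_train, batch_size=128, epochs=200, verbose=1,
                    validation_data=(X_test, y_test))

# plot training versus validation accuracy per epoch
plt.plot(history.history['acc'], label='training accuracy')
plt.plot(history.history['val_acc'], label='validation accuracy')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.legend()
plt.show()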