Building a feedforward neural network to recognize handwritten digits, version two

In the previous section, we built a very simple neural network with just an input and output layer. This simple neural network gave us an accuracy of 86%. Let's see if we can improve this accuracy further by building a neural network that is a little deeper than the previous version:

  1. Let's do this in a new notebook. Loading the dataset and data pre-processing will be the same as in the previous section:
import numpy as np
np.random.seed(42)
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD
# loading and pre-processing data
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = X_train.reshape(60000, 784).astype('float32')
X_test = X_test.reshape(10000, 784).astype('float32')
X_train /= 255
X_test /= 255
# one-hot encode the labels for the 10-way softmax output
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)
  2. The design of the neural network is slightly different from the previous version. We will add a hidden layer with 64 neurons to the network, along with the input and output layers:
model=Sequential()
model.add(Dense(64, activation='relu', input_shape=(784,)))
model.add(Dense(64, activation='relu'))
model.add(Dense(10, activation='softmax'))
  3. Also, we will use the relu activation function for the input and hidden layers instead of the sigmoid function we used previously; a quick comparison of the two activations is sketched below.
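For context, relu(x) = max(0, x) passes positive values through unchanged and zeros out negatives, whereas sigmoid squashes every input into the range (0, 1). The following is just an illustrative NumPy sketch of the two functions (Keras applies them for us inside the Dense layers):

import numpy as np

def relu(x):
    # relu passes positives through and clips negatives to zero
    return np.maximum(0, x)

def sigmoid(x):
    # sigmoid squashes values into (0, 1)
    return 1 / (1 + np.exp(-x))

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))     # [0.  0.  0.  0.5 2. ]
print(sigmoid(x))  # approx. [0.12 0.38 0.5 0.62 0.88]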
  4. We can inspect the model design and architecture as follows:
model.summary()
_______________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 64) 50240
_______________________________________________________________
dense_2 (Dense) (None, 64) 4160
_______________________________________________________________
dense_3 (Dense) (None, 10) 650
=================================================================
Total params: 55,050
Trainable params: 55,050
Non-trainable params: 0
_________________________________________________________________
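The parameter counts shown above can be checked by hand: a Dense layer has (inputs * units) weights plus one bias per unit, so the three layers contribute 784*64 + 64 = 50,240, 64*64 + 64 = 4,160, and 64*10 + 10 = 650 parameters, which sums to 55,050. A quick sketch of that arithmetic:

# verify the parameter counts reported by model.summary()
def dense_params(n_in, n_out):
    return n_in * n_out + n_out  # weight matrix plus biases

layers = [(784, 64), (64, 64), (64, 10)]
counts = [dense_params(i, o) for i, o in layers]
print(counts)       # [50240, 4160, 650]
print(sum(counts))  # 55050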
  5. Next, we will configure the model to use the categorical_crossentropy cost function rather than MSE. Also, the learning rate is increased from 0.01 to 0.1:
model.compile(loss='categorical_crossentropy', optimizer=SGD(lr=0.1),
              metrics=['accuracy'])
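For a single sample, categorical cross-entropy is -sum(y_true * log(y_pred)) over the 10 classes; with one-hot targets this reduces to the negative log of the probability the network assigns to the correct digit. A minimal, purely illustrative NumPy sketch (not how Keras computes it internally):

import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # y_true: one-hot label, y_pred: softmax probabilities
    y_pred = np.clip(y_pred, eps, 1.0)
    return -np.sum(y_true * np.log(y_pred))

y_true = np.zeros(10); y_true[2] = 1.0  # the digit "2"
y_pred = np.array([0.05, 0.05, 0.60, 0.05, 0.05, 0.05, 0.05, 0.05, 0.03, 0.02])
print(categorical_crossentropy(y_true, y_pred))  # -log(0.60), about 0.51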
  6. Now, we will train the model, like we did in the previous examples:
model.fit(X_train, y_train, batch_size=128, epochs=200, verbose=1, validation_data=(X_test, y_test))
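As a side note on what this call does: with batch_size=128, each epoch runs ceil(60000/128) = 469 weight updates, so 200 epochs amount to 93,800 updates in total. A quick check of that arithmetic:

import math

updates_per_epoch = math.ceil(60000 / 128)
print(updates_per_epoch)        # 469
print(updates_per_epoch * 200)  # 93800 updates over 200 epochs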
  7. Train on 60,000 samples and validate on 10,000 samples:
Epoch 1/200
60000/60000 [==============================] - 1s - loss: 0.4785 - acc: 0.8642 - val_loss: 0.2507 - val_acc: 0.9255
Epoch 2/200
60000/60000 [==============================] - 1s - loss: 0.2245 - acc: 0.9354 - val_loss: 0.1930 - val_acc: 0.9436
.
.
.
60000/60000 [==============================] - 1s - loss: 4.8932e-04 - acc: 1.0000 - val_loss: 0.1241 - val_acc: 0.9774
<keras.callbacks.History at 0x7f3096adadd8>

As you can see, the validation accuracy has climbed to roughly 97.7%, a clear improvement over the 86% we obtained with the model in the first version.
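If you prefer to report the final test metrics explicitly rather than reading them off the last epoch's log, you can optionally evaluate the trained model on the test set with Keras's model.evaluate:

# optional: final loss and accuracy on the 10,000 test images
score = model.evaluate(X_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])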
