Testing the performance of the neural network

To test the accuracy of the network, we construct another dataset for testing. We evaluate this model with what we've learned so far and print the statistics. If the accuracy of the network is more than 97%, we stop there and save the model to use for the graphical user interface that we will study later on. Execute the following code:

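// Create the test-set iterator only once; the second argument (train = false) selects the MNIST test split
if (mnistTest == null) {
    mnistTest = new MnistDataSetIterator(MINI_BATCH_SIZE, false, SEED);
}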

The value of the cost function is printed during training and, if you observe it closely, it gradually decreases through the iterations. From time to time, the value spikes. This is characteristic of mini-batch gradient descent: each update is computed from a small sample of the data, so the cost estimate is noisy. The final output of the first epoch shows that the model reaches 96% accuracy after just one epoch, which is great. This means the neural network is learning fast.

In most cases, training does not go this smoothly, and we need to tune the network for a long time before we obtain the output we want. Let's look at the output of the second epoch:

We obtain an accuracy of more than 97% in just two epochs.
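The stopping rule described at the start of this section (evaluate on the test set and stop once accuracy exceeds 97%) can be sketched in plain Java. This is a minimal illustration, not the chapter's Deeplearning4j code: the class `AccuracyCheck`, the methods `accuracy` and `shouldStop`, and the sample label arrays are hypothetical names and data introduced only for this example.

```java
public class AccuracyCheck {
    // Threshold taken from the text: stop once accuracy is more than 97%.
    static final double TARGET_ACCURACY = 0.97;

    /** Returns the fraction of predictions that match the true labels. */
    static double accuracy(int[] predicted, int[] actual) {
        if (predicted.length != actual.length || predicted.length == 0) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        int correct = 0;
        for (int i = 0; i < predicted.length; i++) {
            if (predicted[i] == actual[i]) {
                correct++;
            }
        }
        return (double) correct / predicted.length;
    }

    /** True when the accuracy is strictly above the target. */
    static boolean shouldStop(double accuracy) {
        return accuracy > TARGET_ACCURACY;
    }

    public static void main(String[] args) {
        // Made-up labels for illustration only: nine of ten predictions are correct.
        int[] actual    = {3, 1, 4, 1, 5, 9, 2, 6, 5, 3};
        int[] predicted = {3, 1, 4, 1, 5, 9, 2, 6, 5, 8};
        double acc = accuracy(predicted, actual);
        System.out.println("accuracy = " + acc + ", stop = " + shouldStop(acc));
    }
}
```

In a real run, the predicted and actual labels would come from evaluating the trained network on the test iterator rather than from hand-written arrays.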

Another point worth noting is how well such a simple model performs. This is part of the reason why deep learning is taking off: good results are easy to obtain, and the tools are easy to work with.

As mentioned before, let's look at a case of divergence by increasing the learning rate to 0.6:

private static final double LEARNING_RATE = 0.6;

/**
 * https://en.wikipedia.org/wiki/Random_seed
 */
private static final int SEED = 123;
private static final int IMAGE_WIDTH = 28;
private static final int IMAGE_HEIGHT = 28;

If we now run the network, we will observe that the cost function no longer decreases: over one epoch it stays almost the same or rises, despite 3,000 iterations. The accuracy is also greatly affected. The final accuracy of the model is approximately 10%, which is no better than guessing at random among the ten digits and a clear indication that this learning rate is too high.
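To see why a large learning rate can prevent the cost from decreasing, consider a minimal sketch on a toy problem. It does not reproduce the network above: the one-parameter loss f(w) = 2w², the class `LearningRateDemo`, and its methods are assumptions chosen only so that a rate of 0.01 converges and a rate of 0.6 overshoots. Each step multiplies w by (1 - 4 × rate), so the rate 0.6 gives a factor of -1.4 and the weight grows in magnitude on every update.

```java
public class LearningRateDemo {
    /** Toy loss f(w) = 2 * w^2 (an assumption for illustration, not the network's cost). */
    static double loss(double w) {
        return 2.0 * w * w;
    }

    /** Gradient of the toy loss: f'(w) = 4 * w. */
    static double gradient(double w) {
        return 4.0 * w;
    }

    /** Runs plain gradient descent from w = 1.0 and returns the loss after the last step. */
    static double finalLoss(double learningRate, int steps) {
        double w = 1.0;
        for (int i = 0; i < steps; i++) {
            w -= learningRate * gradient(w); // standard update: w <- w - rate * gradient
        }
        return loss(w);
    }

    public static void main(String[] args) {
        System.out.println("initial loss = " + loss(1.0));
        for (double rate : new double[] {0.01, 0.6}) {
            System.out.println("learning rate " + rate + " -> loss after 50 steps = " + finalLoss(rate, 50));
        }
    }
}
```

With the smaller rate, the loss shrinks below its starting value; with the larger rate, each step overshoots the minimum and the loss grows. A real network's cost surface is far more complicated, but the same overshooting mechanism is a common cause of the behavior described above.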

Let's run the application for various digits and see how it works. Let's begin with the number 3:

The output is accurate. Run this for any of the digits from zero to nine and check whether your model is working accurately.

Also, keep in mind that the model is not perfect yet; we shall improve it with CNN architectures in the next chapter. They offer state-of-the-art techniques and high accuracy, and we should be able to achieve an accuracy of 99%.
