Similar to what we did at the beginning of this chapter, we will draw the digit images. This time, we pick out the mislabeled ones, because those are the ones we're concerned about. We will again pick 10 images and arrange them in a grid of subplots. For each subplot, we write the true label in green at the bottom (as the xlabel) and the falsely predicted label in red at the top (as the title):
import matplotlib.pyplot as plt

nrows, ncols = 2, 5
plt.figure(figsize=(6, 3))
for i in range(ncols * nrows):
    # Index of the i-th mislabeled test sample (computed earlier)
    j = mislabeled_indices[i]
    ax = plt.subplot(nrows, ncols, i + 1)
    ax.imshow(X_test[j].reshape(8, 8), cmap='gray_r')
    plt.xticks([])  # hide the tick marks on both axes
    plt.yticks([])
    plt.title(y_test1[j], color='red')    # predicted (wrong) label on top
    plt.xlabel(y_test[j], color='green')  # true label at the bottom
plt.show()
Let's see how the images look. Does the handwriting look more like the true label or the falsely predicted one to you?
We can observe that, for some images, it is quite difficult to identify the true label at the 8x8 resolution even with the naked eye, such as the number 4 in the middle of the bottom row. However, the leftmost number 4 in the same row should be legible enough for humans to recognize. From here, we can estimate the maximum possible improvement in accuracy that additional training and model optimization could achieve. This will guide our decision on whether it is worthwhile to expend further effort on improving our model, or what kind of training data to obtain or generate next for better results.
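For instance, a rough back-of-the-envelope estimate of that accuracy ceiling might look as follows; the count of images that are illegible even to a human is a hypothetical number you would obtain by inspecting the grid above:

n_test = len(y_test)
n_mislabeled = len(mislabeled_indices)
n_illegible = 2  # hypothetical count of humanly illegible digits from the grid

current_accuracy = 1 - n_mislabeled / n_test
ceiling_accuracy = 1 - n_illegible / n_test  # no model can fix illegible inputs
print(f"Current accuracy: {current_accuracy:.3f}")
print(f"Rough accuracy ceiling: {ceiling_accuracy:.3f}")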
Meanwhile, notice that the training and testing datasets generally contain samples from different distributions. It is left as an exercise for you to repeat the process by downloading the actual training dataset from UCI ML, by using the larger MNIST dataset (for example, by downloading it via Keras), or even by scraping or creating your own dataset.
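As a starting point for that exercise, here is a minimal sketch of loading MNIST via Keras, assuming TensorFlow is installed; note that these images are 28x28 rather than the 8x8 we have worked with so far:

from tensorflow.keras.datasets import mnist

# Downloads MNIST on first call and caches it locally
(X_train_mnist, y_train_mnist), (X_test_mnist, y_test_mnist) = mnist.load_data()
print(X_train_mnist.shape)  # (60000, 28, 28) -- 28x28 grayscale images
print(X_test_mnist.shape)   # (10000, 28, 28)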