In analyzing images, the first step is to convert colors into numerical values. Matplotlib provides APIs to read and display an image as a matrix of RGB values.
The following is a quick example that reads an image into a NumPy array with plt.imread('image_path') and displays it with plt.imshow(image_ndarray). Make sure that the Pillow package is installed so that image formats other than PNG can be handled:
import matplotlib.pyplot as plt
# Source image downloaded under CC0 license: Free for personal and commercial use. No attribution required.
# Source image address: https://pixabay.com/en/rose-pink-blossom-bloom-flowers-693155/
img = plt.imread('ch04.img/mpldev_ch04_rose.jpg')
plt.imshow(img)
plt.show()
Here is the original image displayed with the preceding code:
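Before transforming the image, it can help to confirm what plt.imread() returned. The following is a small sketch, not part of the original example, that prints the array's shape and data type; for an RGB JPEG we would expect a (height, width, 3) array of uint8 values:
import matplotlib.pyplot as plt

img = plt.imread('ch04.img/mpldev_ch04_rose.jpg')
# for an RGB JPEG, this should print a (height, width, 3) shape and a uint8 dtype
print(img.shape)
print(img.dtype)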
After showing the original image, we will transform it by changing the color values in the image matrix. We will create a high-contrast image by setting each RGB value to either 0 or 255 (the maximum), using 160 as the threshold. Here is how to do so:
# create a copy because the image object from `plt.imread()` is read-only
imgcopy = img.copy()
imgcopy[img < 160] = 0
imgcopy[img >= 160] = 255
plt.imshow(imgcopy)
plt.show()
Here is the result of the transformation. By artificially increasing the contrast, we have created a pop art image!
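The same kind of array arithmetic supports other transformations. As a further sketch that goes beyond the original example, the three color channels can be collapsed into a single grayscale channel using the standard luminance weights and displayed with a gray colormap:
import matplotlib.pyplot as plt

img = plt.imread('ch04.img/mpldev_ch04_rose.jpg')
# weighted sum of the R, G, and B channels with the standard luminance weights
gray = 0.299 * img[..., 0] + 0.587 * img[..., 1] + 0.114 * img[..., 2]
plt.imshow(gray, cmap='gray')
plt.show()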
To demonstrate a more practical use of Matplotlib's image-processing features, we will turn to MNIST. MNIST is a famous dataset of handwritten digits that is often used in tutorials on machine learning algorithms. Here, we will not go into the details of machine learning, but rather recreate the scenario of visually inspecting the dataset during the exploratory data analysis stage.
We can download the entire MNIST dataset from the official site at http://yann.lecun.com/exdb/mnist/. To ease our discussion and introduce a useful package for Python machine learning, we load the data from Keras, a high-level API that facilitates neural network implementation. The MNIST dataset from the Keras package contains 70,000 images, arranged as tuples of images and corresponding labels to facilitate model training and testing when building neural networks.
Let's first import the package:
from keras.datasets import mnist
The data is loaded only when load_data() is called. Because Keras is intended for training, the data is returned in tuples of training and testing datasets, each containing the actual image color values and labels, named X and y by convention here:
(X_train,y_train),(X_test,y_test) = mnist.load_data()
When initially called, load_data() may take some time to download the MNIST dataset from the online database.
We can inspect the dimensions of the data as follows:
for d in X_train, y_train, X_test, y_test:
    print(d.shape)
Here is the output:
(60000, 28, 28)
(60000,)
(10000, 28, 28)
(10000,)
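Beyond the shapes, it can also help to check the value range of the pixels and the set of labels. This is a small sketch, not from the original text; for the Keras copy of MNIST, the pixels should be uint8 values between 0 and 255, and the labels should be the digits 0 through 9:
import numpy as np

# pixel values should be 8-bit integers in the range 0-255
print(X_train.dtype, X_train.min(), X_train.max())
# labels should cover the ten digits 0-9
print(np.unique(y_train))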
Finally, let's take one of the images in the X_train set and plot it with plt.imshow(), using the reversed grayscale colormap so the digit appears dark on a white background:
plt.imshow(X_train[123], cmap='gray_r')
plt.show()
From the following figure, we can easily read the digit seven with the naked eye. When solving real image recognition problems, we may sample some misclassified images and consider strategies to optimize our training algorithms:
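To take this inspection a step further, a small grid of samples can be plotted together with their labels. The following is a sketch only, using plt.subplots in a way the original text does not show; the grid layout and the number of samples are arbitrary choices:
import matplotlib.pyplot as plt
from keras.datasets import mnist

(X_train, y_train), (X_test, y_test) = mnist.load_data()

# show the first eight training digits in a 2x4 grid, titled with their labels
fig, axes = plt.subplots(2, 4, figsize=(8, 4))
for ax, image, label in zip(axes.ravel(), X_train, y_train):
    ax.imshow(image, cmap='gray_r')
    ax.set_title(str(label))
    ax.axis('off')
plt.show()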