Implementing Principal Component Analysis (PCA) in OpenCV

One of the most common dimensionality reduction techniques is called PCA.

Similar to the 2D and 3D examples shown earlier, we can think of an image as a point in a high-dimensional space. If we flatten a 2D grayscale image of height m and width n by stacking all of its columns, we get a (feature) vector of length m x n. The value of the ith element in this vector is the grayscale value of the ith pixel in the image. Now, imagine we would like to represent every possible 2D grayscale image with these exact dimensions. How many images would that give us?

Since grayscale pixels typically take values between 0 and 255, there are a total of 256 raised to the power of m x n such images. Chances are that this number is huge! So, naturally, we ask ourselves whether there could be a smaller, more compact representation (using fewer than m x n features) that describes all of these images equally well. After all, we have just seen that raw grayscale values are not the most informative measures of content. So, maybe there is a better way of representing all possible images than simply looking at all of their grayscale values.
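
To make the flattening step concrete, here is a small sketch using a made-up 2 x 3 image (the pixel values are arbitrary and serve only as an illustration):

import numpy as np

# A toy 2 x 3 grayscale "image" with arbitrary pixel values
img = np.array([[ 10,  20,  30],
                [100, 110, 120]], dtype=np.uint8)

# Stacking the columns (column-major order) yields a feature vector
# of length m * n = 6
feature_vec = img.flatten(order='F')
print(feature_vec)      # [ 10 100  20 110  30 120]

# Number of distinct grayscale images of this (tiny) size
print(256 ** img.size)  # 281474976710656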

This is where PCA comes in. Consider a dataset from which we extract exactly two features. These features could be the grayscale values of pixels at two positions, x and y, but they could also be more complex than that. If we plot the dataset along these two feature axes, the data might lie within some multivariate Gaussian:

In [1]: import numpy as np
... mean = [20, 20]
... cov = [[12, 8], [8, 18]]
... np.random.seed(42)
... x, y = np.random.multivariate_normal(mean, cov, 1000).T

Here, T refers to transpose. We can plot this data using matplotlib:

In [2]: import matplotlib.pyplot as plt
... plt.style.use('ggplot')
... %matplotlib inline
In [3]: plt.figure(figsize=(10, 6))
... plt.plot(x, y, 'o', zorder=1)
... plt.axis([0, 40, 0, 40])
... plt.xlabel('feature 1')
... plt.ylabel('feature 2')
Out[3]: <matplotlib.text.Text at 0x125e0c9af60>

This will produce the following plot:

What PCA does is rotate all of the data points until the data lies aligned with the two axes that explain most of the spread of the data. Let's have a look at what that means.

In OpenCV, performing PCA is as simple as calling cv2.PCACompute. However, first, we have to stack the feature vectors, x and y, into a single feature matrix, X:

In [4]: X = np.vstack((x, y)).T

Then, we can compute PCA on the feature matrix, X. The second argument of cv2.PCACompute is the mean; passing an empty array, np.array([]), tells OpenCV to compute the mean from the data itself:

In [5]: import cv2
... mu, eig = cv2.PCACompute(X, np.array([]))
... eig
Out[5]: array([[ 0.57128392,  0.82075251],
               [ 0.82075251, -0.57128392]])

The function returns two values: the mean that is subtracted before the projection (mu) and the eigenvectors of the covariance matrix (eig). These eigenvectors point in the directions that PCA considers the most informative. If we plot them on top of our data using matplotlib, we find that they are aligned with the spread of the data:

In [6]: plt.figure(figsize=(10, 6))
... plt.plot(x, y, 'o', zorder=1)
... plt.quiver(mean[0], mean[1], eig[:, 0], eig[:, 1], zorder=3, scale=0.2, units='xy')

We also add some text labels to the eigenvectors:

... plt.text(mean[0] + 5 * eig[0, 0], mean[1] + 5 * eig[0, 1],
...          'u1', zorder=5, fontsize=16,
...          bbox=dict(facecolor='white', alpha=0.6))
... plt.text(mean[0] + 7 * eig[1, 0], mean[1] + 4 * eig[1, 1],
...          'u2', zorder=5, fontsize=16,
...          bbox=dict(facecolor='white', alpha=0.6))
... plt.axis([0, 40, 0, 40])
... plt.xlabel('feature 1')
... plt.ylabel('feature 2')
Out[6]: <matplotlib.text.Text at 0x1f3499f5860>

This will produce the following diagram:

Interestingly, the first eigenvector (labeled u1 in the preceding diagram) points in the direction where the spread of the data is maximal. This is called the first principal component of the data. The second principal component is u2, which indicates the axis along which the second-most variation in the data can be observed.
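
As a quick sanity check (a sketch reusing X and eig from above), we can verify that the spread of the data really is larger along u1 than along u2:

# Variance of the data projected onto each principal component
var_u1 = np.var(X @ eig[0])   # spread along u1
var_u2 = np.var(X @ eig[1])   # spread along u2
print(var_u1 > var_u2)        # should print True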

Hence, what PCA is telling us is that our predetermined x and y axes were not really that meaningful for describing the data we had chosen. Because the spread of the chosen data is angled at roughly 45 degrees, it would make more sense to choose u1 and u2 as axes instead of x and y.

To prove this point, we can rotate the data by using cv2.PCAProject:

In [7]: X2 = cv2.PCAProject(X, mu, eig)
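
In effect, this projection is just a mean subtraction followed by a multiplication with the transposed eigenvector matrix. A minimal NumPy sketch (reusing X, mu, and eig from above) should reproduce the result of cv2.PCAProject:

# Center the data, then project it onto the eigenvector rows
X2_numpy = (X - mu) @ eig.T
print(np.allclose(X2, X2_numpy))   # expected to print True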

By doing this, the data should be rotated so that the two axes of maximal spread are aligned with the x and y axes. We can convince ourselves of this fact by plotting the new data matrix, X2:

In [8]: plt.plot(X2[:, 0], X2[:, 1], 'o')
... plt.xlabel('first principal component')
... plt.ylabel('second principal component')
... plt.axis([-20, 20, -10, 10])

This leads to the following diagram:

In this diagram, we can see that the data is spread out the most along the x axis, which corresponds to the first principal component. Hence, we are convinced that the projection was successful.
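
Since the projection is just a shift followed by a rotation, it can also be undone. As a final sketch, cv2.PCABackProject should map the projected points back onto the original data:

# Undo the projection: rotate back and add the mean again
X3 = cv2.PCABackProject(X2, mu, eig)
print(np.allclose(X3, X))   # expected to print True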
