Preprocessing the dataset

Before we can pass the dataset to the classifier, we need to preprocess it following the best practices from Chapter 4, Representing Data and Engineering Features.

Specifically, we first subtract the mean image from every example, so that each pixel (that is, a column in X) is centered around zero across the dataset:

In [5]: n_samples, n_features = X.shape[:2]
... X -= X.mean(axis=0)

We repeat this procedure for every image to make sure the feature values of every data point (that is, a row in X) are centered around zero:

In [6]: X -= X.mean(axis=1).reshape(n_samples, -1)
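As a quick sanity check, the two centering steps can be tried on a small synthetic array. The following is a minimal sketch rather than the code used on the face data: the random matrix X_demo and its shape are assumptions made purely for illustration. It suggests that after both steps every column and every row has a mean of approximately zero, so the second step does not undo the first.

```python
import numpy as np

# Hypothetical stand-in for the face data: 6 "images" with 16 "pixels" each.
rng = np.random.default_rng(0)
X_demo = rng.random((6, 16)).astype(np.float32)

n_samples, n_features = X_demo.shape[:2]

# Step 1: subtract the mean image, so each pixel is centered across samples.
X_demo -= X_demo.mean(axis=0)

# Step 2: subtract each image's own mean, so each row is centered around zero.
# reshape(n_samples, -1) yields shape (n_samples, 1) for row-wise broadcasting.
X_demo -= X_demo.mean(axis=1).reshape(n_samples, -1)

col_means = X_demo.mean(axis=0)  # one value per pixel
row_means = X_demo.mean(axis=1)  # one value per image
print(np.abs(col_means).max(), np.abs(row_means).max())
```

The reshape turns the row means into a column vector so that NumPy broadcasting subtracts one value per row; X.mean(axis=1, keepdims=True) is an equivalent way to obtain that shape.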

The preprocessed data can be visualized using the following code:

In [7]: for p, i in enumerate(idx_rand):
...     plt.subplot(2, 4, p + 1)
...     plt.imshow(X[i, :].reshape((64, 64)), cmap='gray')
...     plt.axis('off')

This displays the images selected by idx_rand in a 2 x 4 grid of grayscale plots.
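For readers who want to try the plotting loop in isolation, here is a self-contained sketch. It is not the book's listing: X_demo is random data standing in for the faces, idx_demo is a hypothetical replacement for idx_rand (assumed here to hold eight indices), and the Agg backend is chosen only so the script runs without a display.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen rendering; omit in a notebook
import matplotlib.pyplot as plt

# Hypothetical data: 20 random "images" of 64 x 64 pixels, flattened to rows.
rng = np.random.default_rng(42)
X_demo = rng.random((20, 64 * 64)).astype(np.float32)

# Apply the same two centering steps as in the text.
X_demo -= X_demo.mean(axis=0)
X_demo -= X_demo.mean(axis=1, keepdims=True)

# Hypothetical stand-in for idx_rand: eight distinct row indices.
idx_demo = rng.choice(X_demo.shape[0], size=8, replace=False)

fig = plt.figure(figsize=(8, 4))
for p, i in enumerate(idx_demo):
    ax = fig.add_subplot(2, 4, p + 1)  # 2 rows, 4 columns, 1-based position
    ax.imshow(X_demo[i, :].reshape((64, 64)), cmap='gray')
    ax.axis('off')
```

Because no vmin or vmax is given, imshow scales the gray colormap to each image's own minimum and maximum, so the negative values introduced by centering are displayed without any extra rescaling.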
