Convolutional neural networks

Convolutional neural networks are a class of neural network that resolve the high-dimensionality problem we alluded to in the previous section, and, as a result, excel at image-classification tasks. It turns out that image pixels in a given image region are highly correlated—they tell us similar information about that specific image region. Accordingly, using convolutional neural networks, we can scan regions of an image and summarize that region in lower-dimensional space. As we'll see, these lower-dimensional representations, called feature maps, tell us many interesting things about the presence of all sorts of shapes—from the simplest lines, shadows, loops, and swirls, to very abstract, complex forms specific to our data, in our case, cat ears, cat faces, or tortillas—and do this in fewer dimensions than the original image.

After using convolutional neural networks to extract these lower-dimensional features from our images, we'll pass the output of the convolutional neural network into a network suitable for the classification or regression task we want to perform. In our case, when modeling the Zalando research dataset, the output of our convolutional neural network will be passed into a fully-connected neural network for multi-class classification.

But how does this work? There are several key components we'll discuss with respect to convolutional neural networks on grayscale images, and these are all important for building our understanding.

Table of Contents for Convolutional neural networks

Create new playlist

Sign In

Sign Up

Table of Contents for
Convolutional neural networks