Autoencoders

The first generative model we will look at is the autoencoder. An autoencoder is a simple neural network composed of two parts: an encoder and a decoder. The encoder compresses the input into a lower-dimensional representation, and the decoder then tries to reconstruct the input from that representation. This smaller representation goes by many names, such as the latent space, hidden space, an embedding, or a coding.

If the autoencoder is able to reproduce its input, then, in theory, the latent space should encode all the important information needed to represent the original data, with the advantage of being lower-dimensional than the input. The encoder can be thought of as a way of compressing the input data, while the decoder is a way of uncompressing it. We can see what a simple autoencoder looks like in the following illustration; the latent space, or coding, is the part in the middle labeled z.
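To make this concrete, here is a minimal sketch of such an autoencoder. It assumes PyTorch and a flattened 784-dimensional input (for example, a 28x28 pixel image); the layer widths and the 32-dimensional latent space are illustrative choices rather than values taken from the text:

```python
from torch import nn

class Autoencoder(nn.Module):
    """Minimal fully connected autoencoder: input -> latent code z -> reconstruction."""

    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        # Encoder: compress the input down to the latent space z
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 256),
            nn.ReLU(),
            nn.Linear(256, latent_dim),
        )
        # Decoder: try to reconstruct the original input from z
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256),
            nn.ReLU(),
            nn.Linear(256, input_dim),
            nn.Sigmoid(),  # assumes inputs are scaled to [0, 1]
        )

    def forward(self, x):
        z = self.encoder(x)      # the latent code / embedding
        return self.decoder(z)   # the reconstruction of x
```

Calling only the encoder gives you the coding z for an input, while calling the full model gives the reconstruction.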

Traditionally, the layers that make up an autoencoder are just fully connected layers, but autoencoders can be extended to images as well by using convolutional layers. Just as before, the encoder compresses the input image into a smaller representation, and the decoder tries its best to recover it. The difference is that the encoder is now a CNN that compresses the data into a feature vector, rather than a network of fully connected layers, and the decoder uses transposed convolution layers to recreate the image from the encoding.

An example of an autoencoder working on images is given here. For the decoder part of the network, transposed convolution layers are used to upsample the compressed representation back to an image of the original size.
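Here is a sketch of how such a convolutional autoencoder might look, again assuming PyTorch and single-channel 28x28 inputs; the channel counts and layer shapes are illustrative assumptions, not values from the text:

```python
from torch import nn

class ConvAutoencoder(nn.Module):
    """Convolutional autoencoder: Conv2d layers encode, ConvTranspose2d layers decode."""

    def __init__(self):
        super().__init__()
        # Encoder: strided convolutions halve the spatial size at each step
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, stride=2, padding=1),   # 28x28 -> 14x14
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),  # 14x14 -> 7x7
            nn.ReLU(),
        )
        # Decoder: transposed convolutions upsample back to the input size
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, kernel_size=3, stride=2,
                               padding=1, output_padding=1),        # 7x7 -> 14x14
            nn.ReLU(),
            nn.ConvTranspose2d(16, 1, kernel_size=3, stride=2,
                               padding=1, output_padding=1),        # 14x14 -> 28x28
            nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))
```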

With any autoencoder, the loss function will guide both the encoder and decoder to reconstruct the input. A common loss to use is the L2 loss between the output of the autoencoder and the input of the network. One question we should ask ourselves now is, "Is it a good idea to use L2 loss to compare images?" If you take the following images, even though they look quite different, they all have the same L2 distance from each other:

This shows that a loss like L2 can't always be relied upon to compare images, so keep this limitation in mind when working with it.
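Putting the pieces together, a minimal training step (again a sketch assuming PyTorch, using the Autoencoder class from the earlier sketch) simply measures the L2 error between the network's output and its own input:

```python
import torch
from torch import nn

model = Autoencoder()        # either of the sketches above works here
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()     # L2 (mean squared error) reconstruction loss

def train_step(batch):
    """One optimization step: the target of the loss is the input itself."""
    optimizer.zero_grad()
    reconstruction = model(batch)
    loss = criterion(reconstruction, batch)
    loss.backward()
    optimizer.step()
    return loss.item()
```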
