Denoising autoencoders

Autoencoders can be used to determine under-complete representations of a dataset; however, Vincent et al. (in Vincent P., Larochelle H., Lajoie I., Bengio Y., Manzagol P., Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion, Journal of Machine Learning Research 11/2010) proposed using them not to learn the exact representation of a sample in order to rebuild it from a low-dimensional code, but rather to denoise input samples. This is not a brand-new idea, because, for example, Hopfield networks (proposed a few decades ago) had the same purpose, but their limitations in terms of capacity led researchers to look for different methods. Nowadays, deep autoencoders can easily manage high-dimensional data (such as images), despite the consequent space requirements, which is why many people are now reconsidering the idea of teaching a network how to rebuild a sample image starting from a corrupted one.

Formally, there are not many differences between denoising autoencoders and standard autoencoders. However, in this case, the encoder must work with noisy samples:
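A sketch of the idea (the original formula is not reproduced here, so the following notation is an assumption of mine): given a clean sample $x_i$ and a corrupted version $\tilde{x}_i$, the encoder $e(\cdot)$ receives the corrupted input, while the decoder $d(\cdot)$ is still trained to output the clean sample:

$$z_i = e(\tilde{x}_i) \qquad \text{with} \qquad d(z_i) \approx x_i$$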

The decoder's cost function remains the same. If the noise is sampled for each batch, repeating the process for a sufficiently large number of iterations allows the autoencoder to learn how to rebuild the original image when some fragments are missing or corrupted. To achieve this goal, the authors suggested different possible kinds of noise. The most common choice is to sample Gaussian noise, which has some helpful mathematical properties and is coherent with many real noise processes:
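For example (again, my own notation rather than the original formula), additive Gaussian corruption can be written as:

$$\tilde{x}_i = x_i + n_i \qquad \text{with} \qquad n_i \sim N(0, \sigma^2 I)$$

where the standard deviation $\sigma$ controls how aggressive the corruption is.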

Another possibility is to employ an input dropout layer, zeroing some random elements:
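A hedged sketch of this variant (notation mine): each element of the input is multiplied by an independent Bernoulli mask, so that a fraction $p$ of the values is set to zero:

$$\tilde{x}_i = x_i \odot m_i \qquad \text{with} \qquad m_{ij} \sim \text{Bernoulli}(1 - p)$$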

This choice is clearly more drastic, and the rate must be properly tuned. A very large number of dropped pixels can irreversibly delete many pieces of information, and the reconstruction can become more difficult and rigid (our purpose is to extend the autoencoder's ability to other samples drawn from the same distribution). Alternatively, it's possible to mix Gaussian noise and dropout noise, switching between them with a fixed probability. Clearly, the models must be more complex than standard autoencoders, because now they have to cope with missing information; the same concept applies to the code length: a very under-complete code wouldn't be able to provide all the elements needed to reconstruct the original image in the most accurate way. I suggest testing all the possibilities, in particular when the noise is constrained by external conditions (for example, old photos or messages transmitted through channels affected by specific noise processes). If the model must also be employed for never-before-seen samples, it's extremely important to select samples that represent the true distribution, using data augmentation techniques (limited to operations compatible with the specific problem) whenever the number of elements is not enough to reach the desired level of accuracy.
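To make the procedure concrete, the following is a minimal sketch of a denoising autoencoder written with TensorFlow/Keras; the layer sizes, the noise level, and the use of MNIST are illustrative assumptions of mine and are not taken from the text above:

```python
# Minimal denoising autoencoder sketch with TensorFlow/Keras.
# Layer sizes, noise level, and the MNIST dataset are illustrative choices,
# not taken from the original text.
import tensorflow as tf
from tensorflow.keras import layers, Model

# Load and normalize MNIST images, flattened to 784-dimensional vectors
(x_train, _), (x_test, _) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
x_test = x_test.reshape(-1, 784).astype("float32") / 255.0

inputs = layers.Input(shape=(784,))

# Corruption: Gaussian noise, resampled at every training step
# (swap in layers.Dropout(rate) here to zero random input elements instead)
corrupted = layers.GaussianNoise(stddev=0.3)(inputs)

# Encoder: corrupted input -> low-dimensional code
code = layers.Dense(256, activation="relu")(corrupted)
code = layers.Dense(64, activation="relu")(code)

# Decoder: code -> reconstruction of the *clean* input
hidden = layers.Dense(256, activation="relu")(code)
outputs = layers.Dense(784, activation="sigmoid")(hidden)

autoencoder = Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")

# Targets are the clean images, so the network learns to undo the corruption
autoencoder.fit(x_train, x_train,
                epochs=10, batch_size=128,
                validation_data=(x_test, x_test))
```

Because the GaussianNoise layer draws fresh noise at every training step, this sketch matches the per-batch sampling described above; replacing it with an input Dropout layer reproduces the element-zeroing variant.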
