Dropout

Another regularization technique we will look at is called dropout. Introduced in 2012 by G. E. Hinton and his colleagues, dropout is a simple method of regularization that gives very good results. The idea behind dropout is that, at each training iteration, every neuron in a layer is randomly switched off (its output set to zero) with some probability, usually 50%.

This switching on and off forces the network to learn the same concepts as usual, but via multiple different paths. After training, all neurons are kept on, and the network behaves like an ensemble of the many sub-networks seen during training, averaging their results and thus improving generalization. Dropout also forces the weights to be distributed across the whole network and keeps them low, somewhat as weight regularization does.
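To make the mechanics concrete, here is a minimal NumPy sketch of inverted dropout, the variant TensorFlow uses, where the surviving activations are scaled up so their expected value is unchanged; the function and variable names here are our own:

import numpy as np

def dropout(activations, rate=0.5, training=True):
    # Inverted dropout: zero out each unit with probability `rate`
    # during training and scale the survivors by 1/keep_prob so the
    # expected activation is unchanged; at test time, pass through.
    if not training:
        return activations
    keep_prob = 1.0 - rate
    # Keep each unit independently with probability keep_prob.
    mask = np.random.rand(*activations.shape) < keep_prob
    return activations * mask / keep_prob

# Roughly half the units are zeroed (and the rest doubled) per call:
x = np.ones((1, 8))
print(dropout(x, rate=0.5))        # e.g. [[2. 0. 2. 2. 0. 0. 2. 2.]]
print(dropout(x, training=False))  # [[1. 1. 1. 1. 1. 1. 1. 1.]]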

Another way to understand the concept is to imagine a team where several people share similar knowledge: each of them has their own ideas on how to solve a particular problem, and an ensemble of those experiences provides a better way to solve the issue.

In the following graph, we display the model's test error. It is clear that, with dropout, the error on the test set decreases. Remember that, as with all regularization, using dropout will make your training loss higher than it would be with no regularization, but at the end of the day we are only interested in the test error being lower (better generalization).

In general, dropout is applied only to fully connected layers, but it can also be applied to convolutional or pooling layers; if you do this, you should use a lower p (the probability of dropping a unit), closer to 0.2. Also, the dropout layer should be placed after the activation layer, as shown in the sketch below.
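As an illustration, here is a sketch of a hypothetical convolutional block with dropout placed after the activation and a lower rate of 0.2; inputs and is_training are assumed to be defined elsewhere in the model:

# Convolutional layer with a ReLU activation.
conv1 = tf.layers.conv2d(inputs, filters=32, kernel_size=3,
                         activation=tf.nn.relu)
# Dropout after the activation, with a lower rate for conv layers.
conv1 = tf.layers.dropout(conv1, rate=0.2, training=is_training)
pool1 = tf.layers.max_pooling2d(conv1, pool_size=2, strides=2)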

To use dropout in your TensorFlow model, we call tf.layers.dropout() on the output of the layer we wish to apply dropout to. We must also specify the dropout rate we want to use and, importantly, a Boolean telling TensorFlow whether our model is training or not. Remember that when we use our model at test time we turn dropout off, and this Boolean will do that for us. Hence, our code with dropout will look as follows:

# Fully connected layer
fc1 = tf.layers.dense(fc1, 1024)
# Apply dropout (if is_training is False, dropout is not applied)
fc1 = tf.layers.dropout(fc1, rate=dropout, training=is_training)
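To make this switching explicit, here is a sketch of how that Boolean might be fed at training and test time; is_training, train_op, accuracy, x, y, and the batch variables are hypothetical names assumed from the rest of the model:

# A Boolean placeholder we flip between training and evaluation.
is_training = tf.placeholder(tf.bool, name="is_training")

# ... build the model, passing is_training to every dropout layer ...

# Training step: dropout is active.
sess.run(train_op, feed_dict={x: batch_x, y: batch_y, is_training: True})

# Evaluation step: dropout is switched off and all units are kept.
sess.run(accuracy, feed_dict={x: test_x, y: test_y, is_training: False})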