I've previously mentioned Soumith Chintala's GAN hacks Git repository (https://github.com/soumith/ganhacks), which is an excellent place to start when you're trying to make your GAN stable. Now that we've talked about how difficult it can be to train a stable GAN, let's talk about some of the safe choices you can find there that will likely help you succeed. While there are quite a few hacks out there, here are my top recommendations that haven't already been covered in the chapter:
- Batch norm: When using batch normalization, construct separate minibatches for real and fake data, and make the updates separately.
- Leaky ReLU: Leaky ReLU is a variation of the ReLU activation function. Recall that the ReLU function is $f(x) = \max(0, x)$.
Leaky ReLU, however, is formulated as:
$$f(x) = \begin{cases} x & \text{if } x > 0 \\ \alpha x & \text{otherwise} \end{cases}$$
Here, $\alpha$ is a small positive constant. Leaky ReLU allows a very small, non-zero gradient when the unit isn't active. This combats vanishing gradients, which are a persistent problem when we stack many layers on top of each other, as we do when we combine the discriminator and generator.
- Use dropout in the generator: This will provide noise and protect from mode collapse.
- Use soft labels: Use labels between 0.7 and 1 for real examples and between 0 and 0.3 for fake examples. This noise helps keep information flowing from the discriminator to the generator.
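The soft labels hack, together with the separate real and fake minibatches from the batch norm hack, can be sketched in a few lines. This is only a minimal sketch, not a definitive implementation: the helper name `soft_labels`, the batch size of 64, and the seed are hypothetical choices made for illustration, and I'm assuming NumPy is available.

```python
import numpy as np


def soft_labels(batch_size, real, rng):
    """Return noisy discriminator targets with shape (batch_size, 1).

    Real examples get labels drawn from [0.7, 1.0) and fake examples get
    labels drawn from [0.0, 0.3), following the ranges given above.
    """
    low, high = (0.7, 1.0) if real else (0.0, 0.3)
    return rng.uniform(low, high, size=(batch_size, 1))


# Hypothetical batch size and seed, chosen only for this illustration.
rng = np.random.default_rng(42)
batch_size = 64

# One label array per minibatch, so that the real and fake batches can be
# used in two separate discriminator updates.
real_labels = soft_labels(batch_size, real=True, rng=rng)
fake_labels = soft_labels(batch_size, real=False, rng=rng)

print(real_labels.shape, fake_labels.shape)
```

In a Keras training loop, I'd expect these arrays to feed two separate discriminator updates, for example one `train_on_batch` call on the real minibatch and another on the fake minibatch, assuming a compiled model named `discriminator`.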
There are quite a few other GAN hacks available that we cover elsewhere in this chapter; however, I consider these few hacks to be the most important when implementing a successful GAN.
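To make the Leaky ReLU hack concrete, here is a small plain-Python sketch that compares the two activations and their gradients for an inactive and an active unit. The slope `alpha=0.01` is an assumed illustrative value, not one taken from this chapter or from any particular library's default, and the function names are hypothetical.

```python
def relu(x):
    """Standard ReLU: zero for non-positive inputs."""
    return max(0.0, x)


def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: a small slope alpha replaces the flat region."""
    return x if x > 0 else alpha * x


def relu_grad(x):
    """Gradient of ReLU: exactly zero when the unit is inactive."""
    return 1.0 if x > 0 else 0.0


def leaky_relu_grad(x, alpha=0.01):
    """Gradient of Leaky ReLU: alpha, not zero, when the unit is inactive."""
    return 1.0 if x > 0 else alpha


# Compare an inactive unit (x = -2.0) with an active one (x = 0.5).
for x in (-2.0, 0.5):
    print(x, relu(x), leaky_relu(x), relu_grad(x), leaky_relu_grad(x))
```

For the inactive unit, the ReLU gradient is zero, so nothing flows back through it, while the Leaky ReLU gradient is the small constant `alpha`. In practice, you would use your framework's built-in Leaky ReLU layer instead of hand-written functions like these.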