Model Initialization

As we add more and more layers to our models, they become harder to train using backpropagation. The gradients that are passed back through the model to update the weights become smaller and smaller the deeper we go, so the layers nearest the input barely learn at all. This is known as the vanishing gradient problem.
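To see why the gradients shrink, consider a toy illustration (my own sketch, not from the text): with a saturating activation such as the sigmoid, whose derivative never exceeds 0.25, the chain rule multiplies one such factor per layer, so the gradient reaching the first layer decays exponentially with depth.

```python
# A toy sketch: the gradient arriving at an early layer is a product of
# per-layer factors. If each factor is at most 0.25 (the worst-case
# sigmoid derivative), the product shrinks exponentially with depth.
per_layer_factor = 0.25  # assumed upper bound on each layer's derivative

for depth in (1, 5, 10, 20):
    gradient_scale = per_layer_factor ** depth
    print(f"depth {depth:2d}: gradient scaled by {gradient_scale:.2e}")
```

At a depth of 20 the scale factor is already around 1e-12, which is why the earliest layers receive essentially no update signal.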

As a result, an important consideration before we start training our models is how we initialize the weights. A bad initialization can make the model very slow to converge, or prevent it from converging at all.

Although we do not know exactly what values our weights will end up with after training, one might reasonably expect that about half of them will be positive and half will be negative.
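One way to respect this symmetry is to sample the initial weights from a distribution centered on zero. The snippet below is a minimal sketch of this idea for a plain dense layer in NumPy; the layer sizes and standard deviation are illustrative assumptions, not values from the text.

```python
import numpy as np

# A minimal sketch: draw the weights of a dense layer from a zero-mean
# Gaussian, so that roughly half the initial weights are positive and
# half are negative.
n_in, n_out = 256, 128   # hypothetical layer dimensions
stddev = 0.05            # hypothetical standard deviation

weights = np.random.normal(loc=0.0, scale=stddev, size=(n_in, n_out))

# Sanity check: the fraction of positive weights should be close to 0.5.
print(f"fraction positive: {(weights > 0).mean():.3f}")
```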
