How it works...

Weights were initialized using Xavier initialization, which ensures that our weights are neither too large nor too small; both cases impede the learning of the network. In Xavier initialization, the weights are assigned values from a distribution with zero mean and variance 2/(nin+nout), where nin and nout are the number of inputs and outputs of the layer, respectively. To learn more about Xavier initialization, you can refer to Glorot and Bengio's 2010 paper; details are given in the See also section.
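The scheme above can be sketched in a few lines of NumPy. This is a minimal illustration, not the framework's built-in initializer; the function name `xavier_init` and the choice of a Gaussian (rather than the equally common uniform variant) are assumptions for the example:

```python
import numpy as np

def xavier_init(n_in, n_out, seed=0):
    """Sample a (n_in, n_out) weight matrix with Xavier initialization.

    Weights are drawn from a zero-mean Gaussian with variance
    2 / (n_in + n_out). A uniform distribution over
    [-sqrt(6/(n_in+n_out)), +sqrt(6/(n_in+n_out))] is another
    common variant with the same variance.
    """
    rng = np.random.default_rng(seed)
    std = np.sqrt(2.0 / (n_in + n_out))
    return rng.normal(loc=0.0, scale=std, size=(n_in, n_out))

# For a large layer, the empirical variance should be close to
# the target value 2 / (n_in + n_out).
W = xavier_init(n_in=256, n_out=128)
print(W.mean(), W.var())
```

Because the variance shrinks as the layer gets wider, activations and gradients keep a comparable scale from layer to layer, which is exactly what prevents them from blowing up or vanishing during training.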
