Adam

Adam (the contraction of Adaptive Moment Estimation) is an algorithm proposed by Kingma and Ba (in Adam: A Method for Stochastic Optimization, Kingma D. P., Ba J., arXiv:1412.6980 [cs.LG]) to further improve the performance of RMSProp. The algorithm determines an adaptive learning rate by computing the exponentially weighted averages of both the gradient and its square for every parameter:

In the aforementioned paper, the authors suggest to unbias the two estimations (which concern the first and second moment) by dividing them by 1 - μi, so the new moving averages become as follows:

The weight update rule for Adam is as follows:

Analyzing the previous expression, it is possible to understand why this algorithm is often called RMSProp with momentum. In fact, the term g(•) acts just like the standard momentum, computing the moving average of the gradient for each parameter (with all the advantages of this procedure), while the denominator acts as an adaptive term with the same exact semantics of RMSProp. For this reason, Adam is very often one of the most widely employed algorithms, even if, in many complex tasks, its performances are comparable to a standard RMSProp. The choice must be made considering the extra complexity due to the presence of two forgetting factors. In general, the default values (0.9) are acceptable, but sometimes it's better to perform an analysis of several scenarios before deciding on a specific configuration. Another important element to remember is that all momentum based methods can lead to instabilities (oscillations) when training some deep architectures. That's why RMSProp is very diffused in almost any research paper; however, don't consider this statement as a limitation, because Adam has shown outstanding performances in many tasks. It's helpful to remember that, whenever the training process seems unstable also with low learning rates, it's preferable to employ methods that are not based on momentum (the inertial term, in fact, can slow down the fast modifications necessary to avoid oscillations).

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.15.18.198