Reviews: Adaptive Methods for Nonconvex Optimization

Oct-7-2024, 18:49:15 GMT – Neural Information Processing Systems

Bounds are given for the expected gradient of an ergodic average of the iterates produced by the algorithms when applied to an L-smooth function, and these bounds converge to zero as the number of iterations grows. The authors report several numerical results showing that their algorithm achieves state-of-the-art performance on different problems. They also achieve this performance with little tuning, unlike classical SGD. One motivation for the work is a paper [27] showing that a recent adaptive algorithm, ADAM, can fail to converge even on simple convex problems when the batch size is kept fixed.
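For readers unfamiliar with the adaptive update discussed in this review, the following is a minimal sketch of the standard ADAM step with exact (noise-free) gradients. It is not the algorithm proposed in the reviewed paper. The test function `f(x) = x^4/4 - x^2/2`, the helper names `adam_minimize` and `grad_f`, the hyperparameter values, and the choice to track the mean squared gradient along the iterates are all assumptions made for illustration.

```python
import math


def grad_f(x):
    """Gradient of the hypothetical nonconvex test function f(x) = x**4/4 - x**2/2."""
    return x ** 3 - x


def adam_minimize(grad, x0, steps=2000, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """Run the standard ADAM update on a scalar parameter.

    Returns the final iterate and the mean squared gradient over all iterates,
    an illustrative stand-in for the averaged quantity that convergence bounds
    of this kind typically control.
    """
    x, m, v = x0, 0.0, 0.0
    sq_grad_sum = 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        sq_grad_sum += g * g
        # Exponential moving averages of the gradient and the squared gradient.
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        # Bias correction for the zero initialization of m and v.
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        # Step scaled by the running gradient magnitude.
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x, sq_grad_sum / steps


x_final, avg_sq_grad = adam_minimize(grad_f, x0=2.0)
print(x_final, avg_sq_grad)
```

In this deterministic setting the iterate settles near a stationary point, so the averaged squared gradient shrinks as the number of steps grows. The failure mode attributed to [27] concerns stochastic gradients with a fixed batch size, which this sketch does not reproduce.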

adaptive method, algorithm, nonconvex optimization, (8 more...)
