Reviews: How To Make the Gradients Small Stochastically: Even Faster Convex and Nonconvex SGD
–Neural Information Processing Systems
This work studies convergence rates of the gradient norm for convex composite objectives by combining Nesterov's acceleration techniques for gradient descent with SGD. The authors propose three approaches that differ from one another only slightly, and they provide convergence rates for each. My comments on this work are as follows:

1. It is indeed important to study convergence rates of gradients, especially for non-convex problems. The authors motivate the reader by mentioning this, but they assume convexity in their problem set-up.
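To make the setting under review concrete, here is a minimal sketch (not the paper's algorithm) of stochastic updates with Nesterov-style momentum on a convex composite objective, tracking how small the gradient mapping becomes. All hyperparameters and the problem instance are illustrative assumptions:

```python
import numpy as np

# Illustrative sketch only: proximal SGD with Nesterov-style momentum on a
# convex composite objective F(x) = f(x) + psi(x), where
#   f(x)   = (1 / (2n)) * ||A x - b||^2   (smooth convex part, sampled row-wise)
#   psi(x) = lam * ||x||_1                (nonsmooth convex part, handled via prox)
# This is NOT the reviewed paper's method; it only illustrates the setting the
# review describes: measuring how fast gradients (here, the gradient mapping)
# become small under stochastic updates.

rng = np.random.default_rng(0)
n, d = 200, 20                      # hypothetical problem size
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true
lam, eta, beta = 0.01, 1e-3, 0.9    # hypothetical hyperparameters

def prox_l1(v, t):
    """Soft-thresholding: proximal operator of t * lam * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t * lam, 0.0)

x = np.zeros(d)
v = np.zeros(d)                     # momentum buffer
for it in range(5000):
    i = rng.integers(n)                      # sample one row: stochastic gradient
    lookahead = x + beta * v                 # Nesterov look-ahead point
    g = A[i] * (A[i] @ lookahead - b[i])     # unbiased estimate of grad f
    v = beta * v - eta * g
    x = prox_l1(x + v, eta)                  # proximal step for the l1 term

# Gradient mapping norm: a standard stationarity measure for composite problems
full_g = A.T @ (A @ x - b) / n
grad_map = (x - prox_l1(x - eta * full_g, eta)) / eta
print(f"||gradient mapping|| = {np.linalg.norm(grad_map):.4f}")
```

The gradient-mapping norm reported at the end is the natural analogue of the gradient norm for composite problems, which is the quantity whose convergence rate the paper's analysis targets.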
Oct-7-2024, 21:00:02 GMT