Export Reviews, Discussions, Author Feedback and Meta-Reviews – Neural Information Processing Systems
Summary: This paper analyzes normalized gradient descent (NGD) and its stochastic variant, in what is the first effort to explore the efficacy and properties of stochastic normalized gradient descent (SNGD). To establish the benefits of NGD on non-convex optimization problems, the paper introduces a new property, local quasi-convexity, and proves convergence to a global minimum under it. In particular, it proves that NGD finds an \epsilon-optimal minimum for locally quasi-convex functions within O(1/\epsilon^2) iterations. In addition, the paper introduces a new setting, stochastic optimization of locally quasi-convex functions, in which the gradient is estimated using a minibatch of examples. Empirically, the paper reports results from training deep neural networks, comparing against state-of-the-art methods: minibatch SGD and Nesterov's accelerated gradient method.
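For concreteness, here is a minimal sketch of the SNGD update the summary describes: each step normalizes a minibatch gradient estimate to unit length before taking a fixed-size step, so only the gradient's direction matters. All names, the toy objective, and the hyperparameters below are illustrative assumptions, not the authors' code.

```python
import numpy as np

def sngd(grad_fn, x0, lr=0.05, batch_size=100, iters=1000, rng=None):
    """Sketch of Stochastic Normalized Gradient Descent (SNGD).

    grad_fn(x, rng, batch_size) is assumed to return a minibatch
    gradient estimate at x (a hypothetical interface for illustration).
    """
    rng = rng or np.random.default_rng(0)
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        g = grad_fn(x, rng, batch_size)   # minibatch gradient estimate
        norm = np.linalg.norm(g)
        if norm == 0.0:                   # stationary point: nothing to normalize
            break
        x = x - lr * (g / norm)           # step along the normalized direction
    return x

# Toy usage: noisy gradients of f(x) = ||x||, a simple quasi-convex objective.
def noisy_grad(x, rng, batch_size):
    g = x / (np.linalg.norm(x) + 1e-12)
    noise = rng.normal(scale=1.0 / np.sqrt(batch_size), size=x.shape)
    return g + noise

x_final = sngd(noisy_grad, x0=np.ones(5))
```

Because the step length is fixed at lr regardless of gradient magnitude, the method is insensitive to plateaus and cliffs, which is the intuition behind its O(1/\epsilon^2) guarantee for locally quasi-convex functions.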