Escaping Saddle-Point Faster under Interpolation-like Conditions

Neural Information Processing Systems 

One of the fundamental aspects of over-parametrized models is that they are capable of interpolating the training data. We show that, under interpolation-like assumptions satisfied by the stochastic gradients in an over-parametrization setting, the first-order oracle complexity of the Perturbed Stochastic Gradient Descent (PSGD) algorithm to reach an ε-local-minimizer matches the corresponding deterministic rate of O(1/ε²).
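For intuition, the following is a minimal sketch of the perturbed-gradient idea behind PSGD: take stochastic gradient steps and, whenever the observed gradient is small, inject isotropic noise so the iterate can escape strict saddle points. The function name `perturbed_sgd` and the hyperparameters (`lr`, `noise_radius`, `grad_tol`, `perturb_interval`) are illustrative assumptions, not the paper's tuned constants.

```python
import numpy as np

def perturbed_sgd(grad_fn, x0, lr=0.01, noise_radius=0.1, grad_tol=1e-3,
                  perturb_interval=50, n_steps=10_000, rng=None):
    """Sketch of Perturbed SGD (hypothetical helper, not the paper's code):
    plain SGD steps, plus a uniform-ball perturbation whenever the observed
    gradient is small, so that strict saddle points can be escaped."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float).copy()
    last_perturb = -perturb_interval
    for t in range(n_steps):
        g = grad_fn(x)  # stochastic first-order oracle
        if np.linalg.norm(g) <= grad_tol and t - last_perturb >= perturb_interval:
            # A near-zero gradient may indicate a saddle point: add noise
            # drawn uniformly from a ball of radius noise_radius.
            direction = rng.normal(size=x.shape)
            direction /= np.linalg.norm(direction)
            x = x + noise_radius * rng.random() ** (1.0 / x.size) * direction
            last_perturb = t
        x = x - lr * g
    return x
```

For example, on the saddle f(x, y) = x² − y², plain gradient descent started exactly on the x-axis never leaves it, whereas the injected noise gives the iterate a component along the negative-curvature direction and lets it escape.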
