Goto

Collaborating Authors

 initialization stability


Algorithmic Instabilities ofAccelerated Gradient Descent

Neural Information Processing Systems

We disprove this conjecture and show, for two notions of algorithmic stability (including uniform stability), that the stability of Nesterov's accelerated method in fact deteriorates exponentially fast with the number of gradient steps.


AlgorithmicInstabilities ofAcceleratedGradientDescent

Neural Information Processing Systems

We disprove this conjecture and show,fortwonotions ofalgorithmic stability (including uniform stability), that the stability of Nesterov's accelerated method in fact deteriorates exponentiallyfast withthenumberofgradientsteps.


The Instability of Accelerated Gradient Descent

arXiv.org Machine Learning

Algorithmic stability has emerged over the last two decades as a central tool in generalization analysis of learning algorithms. While the classical approach in generalization theory originating in the PAC learning framework appeal to uniform convergence arguments, more recent progress on stochastic convex optimization models, starting with the pioneering work of Bousquet and Elisseeff (2002) and Shalev-Shwartz et al. (2009), has relied on stability analysis for deriving tight generalization results for convex risk minimizing algorithms. Perhaps the most common form of algorithmic stability is the so called uniform stability (Bousquet and Elisseeff, 2002). Roughly, the uniform stability of a learning algorithm is the worst-case change in its output model, in terms of its loss on an arbitrary example, when replacing a single sample in the data set used for training. Bousquet and Elisseeff (2002) initially used uniform stability to argue about the generalization of empirical risk minimization with strongly convex losses.