MaSS: an Accelerated Stochastic Method for Over-parametrized Learning

Liu, Chaoyue, Belkin, Mikhail

arXiv.org Machine Learning 

Stochastic gradient based methods are dominant in optimization for most large-scale machine learning problems, due to their computational simplicity and their compatibility with modern parallel hardware such as GPUs. In most cases these methods use over-parametrized models allowing for interpolation, i.e., perfect fitting of the training data. While we do not yet have a full understanding of why these solutions generalize well (as a wealth of empirical evidence indicates, e.g., [22, 2]), we are beginning to recognize their desirable properties for optimization, particularly in the SGD setting [11]. In this paper, we leverage the power of the interpolated setting to propose MaSS (Momentum-added Stochastic Solver), a stochastic momentum method for efficient training of over-parametrized models. See pseudo code in Appendix A. The algorithm keeps two variables (weights), w and u.
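To illustrate the two-variable structure mentioned above, here is a minimal sketch of a Nesterov-style stochastic momentum update that maintains both w and u, with an added compensation step as we understand MaSS. The update form, the hyperparameter values (eta1, eta2, gamma), and the toy problem are illustrative assumptions, not the paper's tuned settings; the authoritative pseudo code is in Appendix A of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Over-parametrized least squares: more parameters (20) than samples (10),
# so an interpolating solution (zero training loss) exists.
n, d = 10, 20
X = rng.standard_normal((n, d))
y = X @ rng.standard_normal(d)

def stochastic_grad(w, batch):
    """Minibatch gradient of the squared loss 0.5 * ||X w - y||^2 / n."""
    Xb, yb = X[batch], y[batch]
    return Xb.T @ (Xb @ w - yb) / len(batch)

def mass_style_step(w, u, eta1=0.01, eta2=0.001, gamma=0.9, batch_size=2):
    """One MaSS-style update keeping two weight variables w and u.

    Hyperparameters here are illustrative placeholders, not the paper's values.
    """
    batch = rng.choice(n, size=batch_size, replace=False)
    g = stochastic_grad(u, batch)       # stochastic gradient evaluated at u
    w_next = u - eta1 * g               # descent step taken from u
    # Nesterov-style extrapolation plus a small compensation term in g:
    u_next = (1 + gamma) * w_next - gamma * w + eta2 * g
    return w_next, u_next

w = u = np.zeros(d)
for _ in range(2000):
    w, u = mass_style_step(w, u)

loss = 0.5 * np.mean((X @ w - y) ** 2)
```

On this interpolating toy problem, the training loss decreases from its initial value as w approaches a perfect fit of the training data.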
