MaSS: an Accelerated Stochastic Method for Over-parametrized Learning

Liu, Chaoyue, Belkin, Mikhail

arXiv.org Machine Learning 

Stochastic gradient based methods are dominant in optimization for most large-scale machine learning problems, due to their computational simplicity and their compatibility with modern parallel hardware such as GPUs. In most cases these methods use over-parametrized models allowing for interpolation, i.e., perfect fitting of the training data. While we do not yet have a full understanding of why these solutions generalize well (as a wealth of empirical evidence indicates, e.g., [22, 2]), we are beginning to recognize their desirable properties for optimization, particularly in the SGD setting [11]. In this paper, we leverage the power of the interpolated setting to propose MaSS (Momentum-added Stochastic Solver), a stochastic momentum method for efficient training of over-parametrized models. See pseudo code in Appendix A. The algorithm keeps two variables (weights), w and u.
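To illustrate the two-variable structure mentioned above, here is a minimal sketch of a Nesterov-style stochastic momentum update that maintains both w and u, with an added compensation step as we understand MaSS. The update form, the hyperparameter values (eta1, eta2, gamma), and the toy problem are illustrative assumptions, not the paper's tuned settings; the authoritative pseudo code is in Appendix A of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Over-parametrized least squares: more parameters (20) than samples (10),
# so an interpolating solution (zero training loss) exists.
n, d = 10, 20
X = rng.standard_normal((n, d))
y = X @ rng.standard_normal(d)

def stochastic_grad(w, batch):
    """Minibatch gradient of the squared loss 0.5 * ||X w - y||^2 / n."""
    Xb, yb = X[batch], y[batch]
    return Xb.T @ (Xb @ w - yb) / len(batch)

def mass_style_step(w, u, eta1=0.01, eta2=0.001, gamma=0.9, batch_size=2):
    """One MaSS-style update keeping two weight variables w and u.

    Hyperparameters here are illustrative placeholders, not the paper's values.
    """
    batch = rng.choice(n, size=batch_size, replace=False)
    g = stochastic_grad(u, batch)       # stochastic gradient evaluated at u
    w_next = u - eta1 * g               # descent step taken from u
    # Nesterov-style extrapolation plus a small compensation term in g:
    u_next = (1 + gamma) * w_next - gamma * w + eta2 * g
    return w_next, u_next

w = u = np.zeros(d)
for _ in range(2000):
    w, u = mass_style_step(w, u)

loss = 0.5 * np.mean((X @ w - y) ** 2)
```

On this interpolating toy problem, the training loss decreases from its initial value as w approaches a perfect fit of the training data.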
