On the Influence of Momentum Acceleration on Online Learning

Oct-12-2016–arXiv.org Machine Learning

The article examines in some detail the convergence rate and mean-square-error performance of momentum stochastic gradient methods in the constant step-size and slow adaptation regime. The results establish that momentum methods are equivalent to the standard stochastic gradient method with a re-scaled (larger) step-size value. The size of the re-scaling is determined by the value of the momentum parameter. The equivalence result is established for all time instants and not only in steady-state. The analysis is carried out for general strongly convex and smooth risk functions, and is not limited to quadratic risks. One notable conclusion is that the well-known benefits of momentum constructions for deterministic optimization problems do not necessarily carry over to the adaptive online setting when small constant step-sizes are used to enable continuous adaptation and learning in the presence of persistent gradient noise. From simulations, the equivalence between momentum and standard stochastic gradient methods is also observed for non-differentiable and non-convex problems.
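The equivalence claim can be illustrated with a small simulation. The sketch below is not taken from the article: it assumes a heavy-ball form of momentum, a scalar least-mean-squares risk, and a re-scaling factor of 1/(1 - beta) for the step-size, none of which the abstract states explicitly. The function names, parameter values, and data model are all hypothetical choices made for illustration.

```python
import random


def squared_errors(mu, beta, steps=2000, seed=0, w_true=2.0, noise_std=0.1):
    """Run heavy-ball momentum SGD on a scalar least-squares risk.

    Setting beta = 0 gives the standard stochastic gradient method.
    Returns the squared error (w - w_true)^2 at every iteration.
    """
    rng = random.Random(seed)
    w, w_prev = 0.0, 0.0
    errors = []
    for _ in range(steps):
        # Streaming data: d = w_true * x + noise (assumed data model).
        x = rng.gauss(0.0, 1.0)
        d = w_true * x + rng.gauss(0.0, noise_std)
        # Stochastic gradient of the instantaneous loss 0.5 * (d - x * w)^2.
        grad = -x * (d - x * w)
        # Heavy-ball update with constant step-size mu and momentum beta.
        w_next = w - mu * grad + beta * (w - w_prev)
        w_prev, w = w, w_next
        errors.append((w - w_true) ** 2)
    return errors


def steady_state_mse(mu, beta, runs=20, steps=2000, tail=1000):
    """Average the squared error over the last `tail` iterations and `runs` seeds."""
    total = 0.0
    for seed in range(runs):
        errors = squared_errors(mu, beta, steps=steps, seed=seed)
        total += sum(errors[-tail:]) / tail
    return total / runs


if __name__ == "__main__":
    mu, beta = 0.001, 0.9
    # Assumed re-scaling: the standard method uses the larger step mu / (1 - beta).
    momentum_mse = steady_state_mse(mu, beta)
    sgd_mse = steady_state_mse(mu / (1.0 - beta), 0.0)
    print("momentum, step mu:          ", momentum_mse)
    print("standard, step mu/(1-beta): ", sgd_mse)
```

Under these assumptions the two printed steady-state errors should be of comparable size, consistent with the abstract's statement that momentum behaves like the standard method with a larger step-size. Using the same seeds for both runs feeds both recursions the same data, which makes the comparison less noisy.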
