Stochastic Gradient Descent, Weighted Sampling, and the Randomized Kaczmarz algorithm
Deanna Needell, Rachel Ward, Nati Srebro
Neural Information Processing Systems
We improve a recent guarantee of Bach and Moulines on the linear convergence of SGD for smooth and strongly convex objectives, reducing a quadratic dependence on the strong convexity to a linear dependence. Furthermore, we show how reweighting the sampling distribution (i.e. importance sampling) is necessary in order to further improve convergence, and obtain a linear dependence on average smoothness, dominating previous results; more broadly, we discuss how importance sampling for SGD can also improve convergence in other scenarios. Our results are based on a connection we make between SGD and the randomized Kaczmarz algorithm, which allows us to transfer ideas between the separate bodies of literature studying each of the two methods.
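The SGD-Kaczmarz connection described above can be sketched concretely. The snippet below is a minimal illustration (not the paper's implementation): for the least-squares objective, SGD that samples row i with probability proportional to ||a_i||^2 (importance sampling) and takes the corresponding exact-projection step size is precisely the randomized Kaczmarz iteration, which converges linearly on a consistent system. All names and parameter choices here are illustrative assumptions.

```python
import numpy as np

def randomized_kaczmarz(A, b, steps=2000, seed=0):
    """SGD on f(x) = (1/2n) * ||Ax - b||^2 with importance sampling.

    Row i is drawn with probability p_i = ||a_i||^2 / ||A||_F^2; combined
    with the step size 1/||a_i||^2, the stochastic gradient step becomes
    the randomized Kaczmarz update: project x onto {x : a_i x = b_i}.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    row_norms = np.sum(A**2, axis=1)
    p = row_norms / row_norms.sum()  # importance-sampling distribution
    x = np.zeros(d)
    for _ in range(steps):
        i = rng.choice(n, p=p)
        # Kaczmarz projection step = reweighted stochastic gradient step
        x += (b[i] - A[i] @ x) / row_norms[i] * A[i]
    return x

# Consistent overdetermined system: iterates converge linearly to x_true.
rng = np.random.default_rng(1)
A = rng.standard_normal((50, 5))
x_true = rng.standard_normal(5)
b = A @ x_true
x_hat = randomized_kaczmarz(A, b)
```

Uniform sampling also converges here, but its rate degrades when row norms are highly non-uniform; the reweighted distribution yields the dependence on average (rather than worst-case) smoothness discussed in the abstract.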
Dec-31-2014