Stochastic optimization and sparse statistical recovery: An optimal algorithm for high dimensions

Agarwal, Alekh, Negahban, Sahand, Wainwright, Martin J.

Jul-18-2012–arXiv.org Machine Learning

Stochastic optimization algorithms have many desirable features for large-scale machine learning, and accordingly have been the focus of renewed and intensive study in the last several years (e.g., see the papers [26, 4, 10, 30] and references therein). The empirical efficiency of these methods is backed by strong theoretical guarantees that provide sharp bounds on their convergence rates. These convergence rates are known to depend on the structure of the underlying objective function, with faster rates being possible for objective functions that are smooth and/or (strongly) convex, or for optima that have desirable features such as sparsity. More precisely, for an objective function that is strongly convex, stochastic gradient descent run for T iterations in d dimensions enjoys a convergence rate ranging from O(1/T), when feature vectors are extremely sparse, to O(d/T), when feature vectors are dense [11, 19, 12]. Such results are of significant interest, because the strong convexity condition is satisfied for many common machine learning problems, including boosting, least squares regression, support vector machines and generalized linear models, among other examples. A complementary type of condition is that of sparsity, either exact or approximate, in the optimal solution.
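To make the O(d/T) rate for strongly convex objectives concrete, the following is a minimal sketch on a toy problem, not the algorithm proposed in the paper. It assumes the objective f(x) = (mu/2) E||x - z||^2, where z is the optimum plus Gaussian noise, and uses the classical step size 1/(mu t). The function names `sgd_strongly_convex` and `squared_error`, and the parameters `mu`, `sigma` and `seed`, are hypothetical and chosen only for illustration.

```python
import random


def sgd_strongly_convex(x_star, T, mu=1.0, sigma=1.0, seed=0):
    """Run SGD on the toy objective f(x) = (mu / 2) * E||x - z||^2.

    Here z = x_star + Gaussian noise (an assumption of this sketch), so the
    objective is mu-strongly convex with minimizer x_star.
    Returns the final iterate and the list of sampled z vectors.
    """
    rng = random.Random(seed)
    d = len(x_star)
    x = [0.0] * d  # arbitrary starting point
    samples = []
    for t in range(1, T + 1):
        # Draw one noisy observation of the optimum.
        z = [x_star[j] + rng.gauss(0.0, sigma) for j in range(d)]
        samples.append(z)
        # Classical step size for a mu-strongly convex objective.
        eta = 1.0 / (mu * t)
        # The stochastic gradient of (mu / 2) * ||x - z||^2 is mu * (x - z).
        x = [x[j] - eta * mu * (x[j] - z[j]) for j in range(d)]
    return x, samples


def squared_error(x, x_star):
    """Squared Euclidean distance between the iterate and the optimum."""
    return sum((a - b) ** 2 for a, b in zip(x, x_star))


if __name__ == "__main__":
    x_star = [1.0, -2.0, 0.5, 0.0, 3.0]
    for T in (100, 1000, 10000):
        x, _ = sgd_strongly_convex(x_star, T)
        print(T, squared_error(x, x_star))
```

In this toy setting the update with step size 1/(mu t) reduces to a running average of the samples, so the expected squared error is d * sigma^2 / T. The error therefore grows linearly with the dimension d and decays as 1/T.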



