Non-strongly-convex smooth stochastic approximation with convergence rate O(1/n)

Mar-13-2024, 17:51:20 GMT–Neural Information Processing Systems

We consider the stochastic approximation problem where a convex function has to be minimized, given only the knowledge of unbiased estimates of its gradients at certain points, a framework which includes machine learning methods based on the minimization of the empirical risk. We focus on problems without strong convexity, for which all previously known algorithms achieve a convergence rate for function values of O(1/√n) after n iterations. We consider and analyze two algorithms that achieve a rate of O(1/n) for classical supervised learning problems. For least-squares regression, we show that averaged stochastic gradient descent with constant step-size achieves the desired rate. For logistic regression, this is achieved by a simple novel stochastic gradient algorithm that (a) constructs successive local quadratic approximations of the loss functions, while (b) preserving the same running-time complexity as stochastic gradient descent. For these algorithms, we provide a non-asymptotic analysis of the generalization error (in expectation, and also in high probability for least-squares), and run extensive experiments showing that they often outperform existing approaches.
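As a rough illustration of the least-squares case described in the abstract, the sketch below runs stochastic gradient descent with a constant step-size in a single pass over the data and returns the running average of the iterates. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `averaged_sgd_least_squares`, the synthetic Gaussian data, and the step-size choice 1/(4R²), with R² the largest squared input norm, are all assumptions made for this example and are not given in the abstract.

```python
import numpy as np


def averaged_sgd_least_squares(X, y, step_size):
    """One pass of constant-step-size SGD on the squared loss.

    Returns the average of the iterates produced after each update.
    (Hypothetical helper written for illustration only.)
    """
    n, d = X.shape
    theta = np.zeros(d)      # current iterate, started at zero
    theta_bar = np.zeros(d)  # running average of the iterates
    for k in range(n):
        x_k, y_k = X[k], y[k]
        # Gradient of 0.5 * (x_k . theta - y_k)^2 at the current iterate;
        # for i.i.d. samples this is an unbiased estimate of the risk gradient.
        grad = (x_k @ theta - y_k) * x_k
        theta = theta - step_size * grad
        # Incremental mean: after this line theta_bar averages k + 1 iterates.
        theta_bar += (theta - theta_bar) / (k + 1)
    return theta_bar


# Synthetic example (assumed setup): Gaussian inputs, linear model plus noise.
rng = np.random.default_rng(0)
n, d = 5000, 5
theta_star = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ theta_star + 0.1 * rng.normal(size=n)

# Assumed step-size: 1 / (4 R^2), with R^2 the largest squared input norm.
R2 = np.max(np.sum(X ** 2, axis=1))
theta_bar = averaged_sgd_least_squares(X, y, step_size=1.0 / (4.0 * R2))

relative_error = np.linalg.norm(theta_bar - theta_star) / np.linalg.norm(theta_star)
print("relative parameter error:", relative_error)
```

The step-size stays fixed throughout rather than decaying, and the averaged iterate, not the last one, is returned as the estimate. This mirrors the combination of averaging and constant step-size that the abstract names for least-squares regression.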

Country:
- Europe
  - United Kingdom > England
    - West Sussex (0.04)
  - France > Île-de-France
    - Paris > Paris (0.04)

Genre:
- Research Report > New Finding (0.50)

Industry:
- Education (0.48)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)