Linear Convergence with Condition Number Independent Access of Full Gradients
Neural Information Processing Systems
For smooth and strongly convex optimization problems, the optimal iteration complexity of gradient-based algorithms is O(√κ log(1/ε)), where κ is the condition number. When the optimization problem is ill-conditioned, we need to evaluate a large number of full gradients, which could be computationally expensive. In this paper, we propose to remove the dependence on the condition number by allowing the algorithm to access stochastic gradients of the objective function. To this end, we present a novel algorithm named Epoch Mixed Gradient Descent (EMGD) that is able to utilize two kinds of gradients. A distinctive step in EMGD is the mixed gradient descent, where we use a combination of the full and stochastic gradients to update the intermediate solution.
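The "mixed gradient" idea described above can be sketched as follows: once per epoch, one expensive full gradient is computed at an anchor point, and the cheap inner updates combine that full gradient with a stochastic correction. This is a minimal illustrative sketch on ridge regression, not the paper's exact update rule; the step size, epoch lengths, and the averaging of inner iterates are assumptions for the example.

```python
import numpy as np

def emgd_style_sketch(X, y, lam=0.1, epochs=10, inner_steps=200, eta=0.05, seed=0):
    """Illustrative epoch-based mixed gradient descent for ridge regression.

    Each epoch evaluates a single full gradient at the anchor w_bar; the
    inner loop then uses a mixed gradient (full gradient at the anchor plus
    a stochastic correction), so full-gradient evaluations stay cheap.
    This is a hypothetical instantiation for illustration only.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape

    def full_grad(w):
        # gradient of (1/2n)||Xw - y||^2 + (lam/2)||w||^2
        return X.T @ (X @ w - y) / n + lam * w

    def stoch_grad(w, i):
        # gradient of the i-th component function
        return X[i] * (X[i] @ w - y[i]) + lam * w

    w_bar = np.zeros(d)
    for _ in range(epochs):
        g_full = full_grad(w_bar)          # one expensive full gradient per epoch
        w = w_bar.copy()
        iterates = []
        for _ in range(inner_steps):
            i = rng.integers(n)
            # mixed gradient: full gradient at the anchor + stochastic correction
            g = g_full + stoch_grad(w, i) - stoch_grad(w_bar, i)
            w = w - eta * g
            iterates.append(w.copy())
        w_bar = np.mean(iterates, axis=0)  # average inner iterates as new anchor
    return w_bar
```

Because the stochastic correction vanishes in expectation at the anchor, the mixed gradient is an unbiased estimate of the full gradient with variance that shrinks as the iterates approach the anchor, which is what allows per-iteration cost to stay at a single stochastic gradient.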