Adaptive Gradient Methods with Local Guarantees

Lu, Zhou; Xia, Wenhan; Arora, Sanjeev; Hazan, Elad

Jan-25-2023–arXiv.org Artificial Intelligence

Adaptive gradient methods are the method of choice for optimization in machine learning and are used to train the largest deep models. In this paper we study the problem of learning a local preconditioner, one that can change as the data changes along the optimization trajectory. We propose an adaptive gradient method with provable adaptive regret guarantees relative to the best local preconditioner. To derive this guarantee, we prove a new adaptive regret bound in online learning that improves upon previous adaptive online learning methods. We demonstrate the robustness of our method in automatically choosing the optimal learning rate schedule for popular benchmarking tasks in the vision and language domains. Without manual tuning of a learning rate schedule, our method can, in a single run, achieve task accuracy that is stable and comparable to that of a fine-tuned optimizer.
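For readers unfamiliar with the term, the role of a preconditioner in an adaptive gradient method can be illustrated with standard diagonal AdaGrad. The sketch below is not the method proposed in the paper. The function names, the toy quadratic objective, and the hyperparameters (`lr`, `eps`, `steps`) are all assumptions chosen for illustration.

```python
import numpy as np

def adagrad_step(x, grad, accum, lr=0.5, eps=1e-8):
    """One diagonal AdaGrad update (standard method, NOT the paper's algorithm)."""
    accum = accum + grad ** 2                # running sum of squared gradients
    precond = 1.0 / (np.sqrt(accum) + eps)   # diagonal preconditioner, per coordinate
    return x - lr * precond * grad, accum

def run_adagrad(grad_fn, x0, steps=200, lr=0.5):
    """Run AdaGrad from x0 for a fixed number of steps (illustrative helper)."""
    x = np.asarray(x0, dtype=float)
    accum = np.zeros_like(x)
    for _ in range(steps):
        x, accum = adagrad_step(x, grad_fn(x), accum, lr=lr)
    return x

# Hypothetical toy objective: badly scaled quadratic f(x) = 0.5 * sum(scales * x**2).
scales = np.array([100.0, 1.0])
quad_grad = lambda x: scales * x

x_final = run_adagrad(quad_grad, x0=[1.0, 1.0])
print(x_final)
```

Here the preconditioner is built from the entire gradient history, so it rescales each coordinate by the same accumulated statistics throughout the run. The abstract's "local" preconditioner is, by contrast, allowed to change along the optimization trajectory.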

algorithm, artificial intelligence, machine learning, (16 more...)


Genre:
- Research Report (0.50)

Industry:
- Education > Educational Setting > Online (0.87)

Technology:
- Information Technology
  - Artificial Intelligence > Machine Learning
    - Neural Networks > Deep Learning (0.47)
    - Statistical Learning (0.46)
  - Enterprise Applications > Human Resources
    - Learning Management (0.55)
