Almost sure convergence rates of stochastic gradient methods under gradient domination

Simon Weissmann, Sara Klein, Waïss Azizian, Leif Döring

arXiv.org Artificial Intelligence 

First-order methods for minimizing an objective function f have played a central role in the success of machine learning, and this has been accompanied by growing interest in convergence guarantees for stochastic gradient methods in various settings. To ensure convergence to the global optimum, some form of convexity assumption on the objective function is typically required. In machine learning problems, however, the standard (strong) convexity assumption is almost never satisfied. It is well known that convergence towards global optima can still be achieved under a weaker assumption, the gradient domination property, often referred to as the Polyak-Łojasiewicz (PL) inequality [45]. In reinforcement learning as well, multiple results have shown that, under specific parametrizations, the objective function of policy gradient methods satisfies a weak form of gradient domination, so that these methods provably converge towards the global optimum [11, 21, 33, 34]. Improving the understanding of convergence rates and optimal step size choices for stochastic first-order methods is therefore of significant interest to the machine learning and reinforcement learning communities.
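In its standard form (the notation here is generic rather than taken from the paper), the PL inequality lower bounds the squared gradient norm by the suboptimality gap: for some constant \mu > 0,

\[
  \tfrac{1}{2}\,\bigl\|\nabla f(x)\bigr\|^{2} \;\ge\; \mu\,\bigl(f(x) - f^{*}\bigr) \qquad \text{for all } x,
\]

where f^{*} denotes the global minimum of f. Strong convexity with parameter \mu implies this inequality, but not conversely: PL functions need not be convex, which is what makes the condition useful in machine learning.

As a minimal illustration, and not the paper's setup, the following sketch runs SGD with polynomially decaying step sizes gamma_t = c / t^alpha on the quadratic objective f(x) = (mu/2) ||x||^2, which satisfies the PL inequality with constant mu; the constants mu, c, alpha and the noise level sigma are hypothetical choices.

    import numpy as np

    # Toy sketch: SGD with step sizes gamma_t = c / t**alpha on
    # f(x) = 0.5 * mu * ||x||^2, a PL function with constant mu and
    # global minimum f* = 0. Gradients carry zero-mean Gaussian noise.
    rng = np.random.default_rng(0)
    mu, c, alpha, sigma = 1.0, 0.5, 0.75, 0.1  # hypothetical constants

    x = rng.normal(size=5)                     # initial iterate
    for t in range(1, 10_001):
        grad = mu * x                          # exact gradient of f
        noisy_grad = grad + sigma * rng.normal(size=x.shape)
        x = x - (c / t**alpha) * noisy_grad    # SGD update

    print("final suboptimality f(x) - f* =", 0.5 * mu * float(x @ x))

Step sizes of this form with alpha in (1/2, 1] are square-summable but not summable, the classical Robbins-Monro conditions under which stochastic gradient iterates can be expected to converge almost surely.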
