AITopics | pl-inequality

Collaborating Authors

pl-inequality

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Optimal Asymptotic Rates for (Stochastic) Gradient Descent under the Local PL-Condition: A Geometric Approach

Kassing, Sebastian, Kruse, Thomas

arXiv.org Machine LearningMay-15-2026

Stochastic gradient descent (SGD) has been studied extensively over the past decades due to its simplicity and broad applicability in machine learning. In this work, we analyze the local behavior of gradient descent and stochastic gradient descent for minimizing $C^2$-functions that satisfy the Polyak-Lojasiewicz (PL) inequality and under a multiplicative gradient noise model motivated by overparameterized neural networks. Using a geometric interpretation of the PL-condition, we prove a simple yet surprising fact: in this possibly non-convex setting, the asymptotic convergence rate of (S)GD matches the rate obtained for strongly convex quadratics.

artificial intelligence, machine learning, projn, (17 more...)

arXiv.org Machine Learning

2605.14663

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Add feedback

8abfe8ac9ec214d68541fcb888c0b4c3-AuthorFeedback.pdf

Neural Information Processing SystemsFeb-9-2026, 07:16:00 GMT

lemma 4, matrix, pl-inequality, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.35)

Add feedback

8abfe8ac9ec214d68541fcb888c0b4c3-AuthorFeedback.pdf

Neural Information Processing SystemsAug-15-2025, 01:21:44 GMT

lemma 4, matrix, pl-inequality, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.48)

Add feedback

Polyak's Heavy Ball Method Achieves Accelerated Local Rate of Convergence under Polyak-Lojasiewicz Inequality

Kassing, Sebastian, Weissmann, Simon

arXiv.org Artificial IntelligenceOct-22-2024

In this work, we consider the convergence of Polyak's heavy ball method, both in continuous and discrete time, on a non-convex objective function. We recover the convergence rates derived in [Pol64] for strongly convex objective functions, assuming only validity of the Polyak-Lojasiewicz inequality. In continuous time our result holds for all initializations, whereas in the discrete time setting we conduct a local analysis around the global minima. Our results demonstrate that the heavy ball method does, in fact, accelerate on the class of objective functions satisfying the Polyak-Lojasiewicz inequality. This holds even in the discrete time setting, provided the method reaches a neighborhood of the global minima. Instead of the usually employed Lyapunov-type arguments, our approach leverages a new differential geometric perspective of the Polyak-Lojasiewicz inequality proposed in [RB24].

artificial intelligence, convergence, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2410.16849

Country:

North America > United States (0.04)
Europe > Montenegro (0.04)
Europe > Germany > Saxony > Dresden (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.74)

Industry: Leisure & Entertainment > Sports > Tennis (0.86)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Continuous-time Riemannian SGD and SVRG Flows on Wasserstein Probabilistic Space

Yi, Mingyang, Wang, Bohan

arXiv.org Artificial IntelligenceJan-25-2024

Recently, optimization on the Riemannian manifold has provided new insights to the optimization community. In this regard, the manifold taken as the probability measure metric space equipped with the second-order Wasserstein distance is of particular interest, since optimization on it can be linked to practical sampling processes. In general, the oracle (continuous) optimization method on Wasserstein space is Riemannian gradient flow (i.e., Langevin dynamics when minimizing KL divergence). In this paper, we aim to enrich the continuous optimization methods in the Wasserstein space by extending the gradient flow into the stochastic gradient descent (SGD) flow and stochastic variance reduction gradient (SVRG) flow. The two flows on Euclidean space are standard stochastic optimization methods, while their Riemannian counterparts are not explored yet. By leveraging the structures in Wasserstein space, we construct a stochastic differential equation (SDE) to approximate the discrete dynamics of desired stochastic methods in the corresponded random vector space. Then, the flows of probability measures are naturally obtained by applying Fokker-Planck equation to such SDE. Furthermore, the convergence rates of the proposed Riemannian stochastic flows are proven, and they match the results in Euclidean space.

continuous-time riemannian sgd, convergence rate, riemannian sgd and svrg flow, (11 more...)

arXiv.org Artificial Intelligence

2401.1353

Country:

North America > United States > Massachusetts (0.04)
Europe > Denmark (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.54)

Add feedback

Generalized Gradient Flows with Provable Fixed-Time Convergence and Fast Evasion of Non-Degenerate Saddle Points

Baranwal, Mayank, Budhraja, Param, Raj, Vishal, Hota, Ashish R.

arXiv.org Machine LearningOct-22-2023

Gradient-based first-order convex optimization algorithms find widespread applicability in a variety of domains, including machine learning tasks. Motivated by the recent advances in fixed-time stability theory of continuous-time dynamical systems, we introduce a generalized framework for designing accelerated optimization algorithms with strongest convergence guarantees that further extend to a subclass of non-convex functions. In particular, we introduce the GenFlow algorithm and its momentum variant that provably converge to the optimal solution of objective functions satisfying the Polyak-{\L}ojasiewicz (PL) inequality in a fixed time. Moreover, for functions that admit non-degenerate saddle-points, we show that for the proposed GenFlow algorithm, the time required to evade these saddle-points is uniformly bounded for all initial conditions. Finally, for strongly convex-strongly concave minimax problems whose optimal solution is a saddle point, a similar scheme is shown to arrive at the optimal solution again in a fixed time. The superior convergence properties of our algorithm are validated experimentally on a variety of benchmark datasets.

artificial intelligence, genflow, machine learning, (18 more...)

arXiv.org Machine Learning

2212.03765

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(5 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Convergence rates for momentum stochastic gradient descent with noise of machine learning type

Gess, Benjamin, Kassing, Sebastian

arXiv.org Artificial IntelligenceFeb-7-2023

We consider the momentum stochastic gradient descent scheme (MSGD) and its continuous-in-time counterpart in the context of non-convex optimization. We show almost sure exponential convergence of the objective function value for target functions that are Lipschitz continuous and satisfy the Polyak-Lojasiewicz inequality on the relevant domain, and under assumptions on the stochastic noise that are motivated by overparameterized supervised learning applications.

artificial intelligence, exp, machine learning, (10 more...)

arXiv.org Artificial Intelligence

2302.0355

Country:

Europe > Russia (0.04)
Asia > Russia (0.04)
North America > United States (0.04)
(3 more...)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Add feedback