AITopics | order method

Collaborating Authors

order method

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Efficiently Escaping Saddle Points under Generalized Smoothness via Self-Bounding Regularity

Neural Information Processing SystemsJun-10-2026, 19:08:25 GMT

We study the optimization of non-convex functions that are not necessarily smooth (gradient and/or Hessian are Lipschitz) using first order methods. Smoothness is a restrictive assumption in machine learning in both theory and practice, motivating significant recent work on finding first order stationary points of functions satisfying generalizations of smoothness with first order methods. We develop a novel framework that lets us systematically study the convergence of a large class of first-order optimization algorithms (which we call decrease procedures) under generalizations of smoothness. We instantiate our framework to analyze the convergence of first order optimization algorithms to first and order stationary points under generalizations of smoothness. As a consequence, we establish the first convergence guarantees for first order methods to second order stationary points under generalizations of smoothness. We demonstrate that several canonical examples fall under our framework, and highlight practical implications.

artificial intelligence, machine learning, proceedings, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.41)

Add feedback

The Limit Points of (Optimistic) Gradient Descent in Min-Max Optimization

Neural Information Processing SystemsMar-16-2026, 18:25:37 GMT

Motivated by applications in Optimization, Game Theory, and the training of Generative Adversarial Networks, the convergence properties of first order methods in min-max problems have received extensive study. It has been recognized that they may cycle, and there is no good understanding of their limit points when they do not. When they converge, do they converge to local min-max solutions? We characterize the limit points of two basic first order methods, namely Gradient Descent/Ascent (GDA) and Optimistic Gradient Descent Ascent (OGDA). We show that both dynamics avoid unstable critical points for almost all initializations. Moreover, for small step sizes and under mild assumptions, the set of OGDA-stable critical points is a superset of GDA-stable critical points, which is a superset of local min-max solutions (strict in some cases). The connecting thread is that the behavior of these dynamics can be studied from a dynamical systems perspective.

artificial intelligence, machine learning, proceedings, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Fast Efficient Hyperparameter Tuning for Policy Gradient Methods

Supratik Paul, Vitaly Kurin, Shimon Whiteson

Neural Information Processing SystemsFeb-12-2026, 14:56:41 GMT

Neural Information Processing Systems http://nips.cc/

hoof, hyperparameter, optimization, (11 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
North America > Canada (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.96)

Add feedback

60495b4e033e9f60b32a6607b587aadd-AuthorFeedback.pdf

Neural Information Processing SystemsFeb-12-2026, 08:39:11 GMT

algorithm, condition number, order method, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.32)

Add feedback

Efficiently avoiding saddle points with zero order methods: No gradients required

Emmanouil-Vasileios Vlatakis-Gkaragkounis, Lampros Flokas, Georgios Piliouras

Neural Information Processing SystemsFeb-11-2026, 12:36:43 GMT

Neural Information Processing Systems http://nips.cc/

gradient descent, order method, saddle point, (13 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.05)
North America > Canada > Quebec > Montreal (0.05)
Europe > Spain > Canary Islands (0.04)
(14 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.33)

Add feedback

dea9ddb25cbf2352cf4dec30222a02a5-AuthorFeedback.pdf

Neural Information Processing SystemsFeb-10-2026, 18:44:11 GMT

Thus our heuristic relaxation does give valid upper bounds of the exact Lipschitz constant.

artificial intelligence, lipschitz constant, machine learning, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.32)

Add feedback

A Computationally Efficient Sparsified Online Newton Method Devvrit

Neural Information Processing SystemsFeb-7-2026, 17:03:54 GMT

Sparsified Online Newton (SONew) method, a memory-efficient second-order algorithm that yields a sparsified yet effective preconditioner.

large language model, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Travis County > Austin (0.04)
Europe > Russia (0.04)
Asia > Russia (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)

Add feedback

First Order Methods with Markovian Noise: from Acceleration to Variational Inequalities

Neural Information Processing SystemsDec-26-2025, 07:42:39 GMT

This paper delves into stochastic optimization problems that involve Markovian noise. We present a unified approach for the theoretical analysis of first-order gradient methods for stochastic optimization and variational inequalities. Our approach covers scenarios for both non-convex and strongly convex minimization problems. To achieve an optimal (linear) dependence on the mixing time of the underlying noise sequence, we use the randomized batching scheme, which is based on the multilevel Monte Carlo method. Moreover, our technique allows us to eliminate the limiting assumptions of previous research on Markov noise, such as the need for a bounded domain and uniformly bounded stochastic gradients. Our extension to variational inequalities under Markovian noise is original. Additionally, we provide lower bounds that match the oracle complexity of our method in the case of strongly convex optimization problems.

markovian noise, name change, order method, (3 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.85)

Add feedback

First-order methods almost always avoid saddle points: The case of vanishing step-sizes

Neural Information Processing SystemsDec-25-2025, 07:12:44 GMT

In a series of papers [Lee et al 2016], [Panageas and Piliouras 2017], [Lee et al 2019], it was established that some of the most commonly used first order methods almost surely (under random initializations) and with step-size being small enough, avoid strict saddle points, as long as the objective function $f$ is $C^2$ and has Lipschitz gradient. The key observation was that first order methods can be studied from a dynamical systems perspective, in which instantiations of Center-Stable manifold theorem allow for a global analysis. The results of the aforementioned papers were limited to the case where the step-size $\alpha$ is constant, i.e., does not depend on time (and typically bounded from the inverse of the Lipschitz constant of the gradient of $f$). It remains an open question whether or not the results still hold when the step-size is time dependent and vanishes with time. In this paper, we resolve this question on the affirmative for gradient descent, mirror descent, manifold descent and proximal point. The main technical challenge is that the induced (from each first order method) dynamical system is time non-homogeneous and the stable manifold theorem is not applicable in its classic form. By exploiting the dynamical systems structure of the aforementioned first order methods, we are able to prove a stable manifold theorem that is applicable to time non-homogeneous dynamical systems and generalize the results in [Lee et al 2019] for time dependent step-sizes.

artificial intelligence, machine learning, proceedings, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.39)

Add feedback

Filters

Collaborating Authors

order method

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Efficiently Escaping Saddle Points under Generalized Smoothness via Self-Bounding Regularity

0b43289db08ed60edc6451cb2132e203-Paper-Conference.pdf

The Limit Points of (Optimistic) Gradient Descent in Min-Max Optimization

Fast Efficient Hyperparameter Tuning for Policy Gradient Methods

60495b4e033e9f60b32a6607b587aadd-AuthorFeedback.pdf

Efficiently avoiding saddle points with zero order methods: No gradients required

dea9ddb25cbf2352cf4dec30222a02a5-AuthorFeedback.pdf

A Computationally Efficient Sparsified Online Newton Method Devvrit

First Order Methods with Markovian Noise: from Acceleration to Variational Inequalities

First-order methods almost always avoid saddle points: The case of vanishing step-sizes