AITopics | rmsprop

Collaborating Authors

rmsprop

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Some Optimizers are More Equal: Understanding the Role of Optimizers in Group Fairness

Neural Information Processing SystemsJun-14-2026, 23:22:26 GMT

We study whether and how the choice of optimization algorithm can impact group fairness in deep neural networks. Through stochastic differential equation analysis of optimization dynamics in an analytically tractable setup, we demonstrate that the choice of optimization algorithm indeed influences fairness outcomes, particularly under severe imbalance. Furthermore, we show that when comparing two categories of optimizers, adaptive methods and stochastic methods, RMSProp (from the adaptive category) has a higher likelihood of converging to fairer minima than SGD (from the stochastic category). Building on this insight, we derive two new theoretical guarantees showing that, under appropriate conditions, RMSProp exhibits fairer parameter updates and improved fairness in a single optimization step compared to SGD.

machine learning, natural language, rmsprop, (20 more...)

Neural Information Processing Systems

Country: Europe > United Kingdom (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(2 more...)

Add feedback

A Rod Flow Model for Adam at the Edge of Stability

Regis, Eric, Chewi, Sinho

arXiv.org Machine LearningMay-11-2026

Neural networks are trained by minimizing loss functions with gradient-based optimizers. Cohen et al. [2021] observed that full-batch gradient descent operates at the edge of stability (EoS): the largest eigenvalue of the Hessian, called the sharpness, first rises (a phase called progressive sharpening) and then hovers at the stability threshold 2/η where η is the learning rate. Cohen et al. [2022] extended this picture to momentum methods and adaptive gradient methods, showing that each optimizer exhibits its own edge of stability. Rather than hovering at 2/η, the relevant quantity--the preconditioned sharpness--hovers at a hyperparameter-dependent threshold that depends on the optimizer (Table 2). In practice, the dominant optimizer in machine learning is Adam [Kingma and Ba, 2015], which differs from gradient descent in two respects.

artificial intelligence, equation, machine learning, (15 more...)

arXiv.org Machine Learning

2605.06821

Country: North America (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.70)

Add feedback

Resetting the Optimizer in Deep RL: An Empirical Study

Neural Information Processing SystemsApr-30-2026, 02:41:49 GMT

We focus on the task of approximating the optimal value function in deep reinforcement learning. This iterative process is comprised of solving a sequence of optimization problems where the loss function changes per iteration. The common approach to solving this sequence of problems is to employ modern variants of the stochastic gradient descent algorithm such as Adam. These optimizers maintain their own internal parameters such as estimates of the first-order and the secondorder moments of the gradient, and update them over time. Therefore, information obtained in previous iterations is used to solve the optimization problem in the current iteration. We demonstrate that this can contaminate the moment estimates because the optimization landscape can change arbitrarily from one iteration to the next one. To hedge against this negative effect, a simple idea is to reset the internal parameters of the optimizer when starting a new iteration. We empirically investigate this resetting idea by employing various optimizers in conjunction with the Rainbow algorithm. We demonstrate that this simple modification significantly improves the performance of deep RL on the Atari benchmark.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country: North America (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)

Add feedback

eddea82ad2755b24c4e168c5fc2ebd40-Paper.pdf

Neural Information Processing SystemsApr-27-2026, 16:54:32 GMT

acprop, artificial intelligence, machine learning, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

eddea82ad2755b24c4e168c5fc2ebd40-Paper.pdf

Neural Information Processing SystemsFeb-11-2026, 19:07:15 GMT

acprop, arxiv preprint arxiv, optimizer, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois (0.04)
Europe > Russia (0.04)
Asia > Russia (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

76c073d8a82d9ddaf993300be03ac70f-Paper.pdf

Neural Information Processing SystemsFeb-9-2026, 10:23:37 GMT

We prove the O(t 1/2) rate of convergence for the squared norm of the gradient of Moreau envelope, which is the standard stationarity measure for this class of problems. It matches the known rates that adaptive algorithms enjoy for the specific case ofunconstrained smoothnonconvexstochastic optimization.

algorithm, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)

Add feedback

Non-asymptoticAnalysisofBiasedAdaptive StochasticApproximation

Neural Information Processing SystemsFeb-8-2026, 13:36:34 GMT

While these algorithms havebeen extensively studied, both theoretically and practically, see, e.g., [10], many questions remain open. In particular, most results are based onthecase where theestimatord V isunbiased.

artificial intelligence, justification, machine learning, (17 more...)

Neural Information Processing Systems

Country: