
Neural Information Processing Systems

Here we study structure-preserving discretizations for a certain class of dissipative (conformal) Hamiltonian systems, allowing us to analyze the symplectic structure of both Nesterov and heavy ball, and providing several new insights into these methods. Moreover, we propose a new algorithm based on a dissipative relativistic system that normalizes the momentum and may result in more stable/faster optimization.
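The paper's algorithm is derived from a discretized dissipative relativistic Hamiltonian system; as a rough illustration only, the sketch below combines heavy-ball-style dissipative momentum with a relativistic normalization of the step, which is the qualitative mechanism the abstract describes. The function name, parameter names, and the exact form of the normalization are assumptions, not the paper's definition.

```python
import numpy as np

def relativistic_gd(grad, x0, steps=500, lr=0.01, mu=0.9, delta=1.0):
    """Sketch of a dissipative relativistic update (hypothetical form):
    heavy-ball momentum whose displacement is normalized so the
    effective step length stays bounded even for large momenta."""
    x = np.asarray(x0, dtype=float)
    p = np.zeros_like(x)
    for _ in range(steps):
        p = mu * p - lr * grad(x)                      # dissipative momentum accumulation
        x = x + p / np.sqrt(delta * np.dot(p, p) + 1)  # relativistic normalization of the step
    return x

# Usage: an ill-conditioned quadratic, grad f(x) = diag(1, 50) @ x
grad = lambda x: np.array([1.0, 50.0]) * x
x_star = relativistic_gd(grad, [1.0, 1.0])
```

Near the optimum the momentum is small, the denominator approaches 1, and the update reduces to ordinary heavy ball; far from it, the normalization caps the step, which is the intuition behind the claimed stability benefit.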





Algorithmic Instabilities of Accelerated Gradient Descent

Neural Information Processing Systems

We disprove this conjecture and show, for two notions of algorithmic stability (including uniform stability), that the stability of Nesterov's accelerated method in fact deteriorates exponentially fast with the number of gradient steps.


Robust Gradient Descent via Heavy-Ball Momentum with Predictive Extrapolation

Ali, Sarwan

arXiv.org Artificial Intelligence

Accelerated gradient methods like Nesterov's Accelerated Gradient (NAG) achieve faster convergence on well-conditioned problems but often diverge on ill-conditioned or non-convex landscapes due to aggressive momentum accumulation. We propose Heavy-Ball Synthetic Gradient Extrapolation (HB-SGE), a robust first-order method that combines heavy-ball momentum with predictive gradient extrapolation. Unlike classical momentum methods that accumulate historical gradients, HB-SGE estimates future gradient directions using local Taylor approximations, providing adaptive acceleration while maintaining stability. We prove convergence guarantees for strongly convex functions and demonstrate empirically that HB-SGE prevents divergence on problems where NAG and standard momentum fail. On ill-conditioned quadratics (condition number κ = 50), HB-SGE converges in 119 iterations while both SGD and NAG diverge. On the non-convex Rosenbrock function, HB-SGE achieves convergence in 2,718 iterations where classical momentum methods diverge within 10 steps. While NAG remains faster on well-conditioned problems, HB-SGE provides a robust alternative with speedup over SGD across diverse landscapes, requiring only O(d) memory overhead and the same hyperparameters as standard momentum.
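The abstract describes predicting future gradients via local Taylor approximations and feeding the prediction into a heavy-ball update. A minimal sketch of that idea, assuming the simplest first-order extrapolation (the predicted gradient is the current gradient plus its last finite difference); the function name, step sizes, and the exact extrapolation rule are illustrative assumptions, not the paper's HB-SGE definition:

```python
import numpy as np

def hb_sge_sketch(grad, x0, steps=2000, lr=0.005, mu=0.9):
    """Heavy-ball momentum driven by a linearly extrapolated gradient
    (first-order Taylor-style prediction from the last two gradients).
    Hypothetical parameterization for illustration only."""
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    g_prev = grad(x)                  # no history yet: first prediction is the plain gradient
    for _ in range(steps):
        g = grad(x)
        g_pred = g + (g - g_prev)     # predict the next gradient by linear extrapolation
        v = mu * v - lr * g_pred      # heavy-ball momentum on the predicted gradient
        x = x + v
        g_prev = g
    return x

# Usage: the κ = 50 ill-conditioned quadratic from the abstract, grad f(x) = diag(1, 50) @ x
grad = lambda x: np.array([1.0, 50.0]) * x
x_opt = hb_sge_sketch(grad, [1.0, 1.0])
```

The extrapolation term only needs the previous gradient, which matches the claimed O(d) memory overhead; on a quadratic it effectively anticipates the curvature-induced change in the gradient before the momentum step is taken.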


Mitigating the Noise Shift for Denoising Generative Models via Noise Awareness Guidance

Zhong, Jincheng, Jiang, Boyuan, Tao, Xin, Wan, Pengfei, Gai, Kun, Long, Mingsheng

arXiv.org Artificial Intelligence

Existing denoising generative models rely on solving discretized reverse-time SDEs or ODEs. In this paper, we identify a long-overlooked yet pervasive issue in this family of models: a misalignment between the pre-defined noise level and the actual noise level encoded in intermediate states during sampling. We refer to this misalignment as noise shift. Through empirical analysis, we demonstrate that noise shift is widespread in modern diffusion models and exhibits a systematic bias, leading to sub-optimal generation due to both out-of-distribution generalization and inaccurate denoising updates. To address this problem, we propose Noise Awareness Guidance (NAG), a simple yet effective correction method that explicitly steers sampling trajectories to remain consistent with the pre-defined noise schedule. We further introduce a classifier-free variant of NAG, which jointly trains a noise-conditional and a noise-unconditional model via noise-condition dropout, thereby eliminating the need for external classifiers. Extensive experiments, including ImageNet generation and various supervised fine-tuning tasks, show that NAG consistently mitigates noise shift and substantially improves the generation quality of mainstream diffusion models.
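The classifier-free variant of NAG is described as jointly training noise-conditional and noise-unconditional models via noise-condition dropout and using both at sampling time. As a rough illustration, and assuming the combination follows the standard classifier-free-guidance form applied to the noise condition (the function names, the null-condition encoding, and the guidance weight are assumptions, not the paper's formulation):

```python
import numpy as np

def nag_guidance(eps_cond, eps_uncond, w=1.5):
    """Combine noise-conditional and noise-unconditional denoiser outputs
    in classifier-free-guidance style: steer the prediction toward the
    noise-aware (conditional) branch. The weight w is a hypothetical knob."""
    return eps_uncond + w * (eps_cond - eps_uncond)

def maybe_drop_noise_condition(sigma, p_drop=0.1, null_value=-1.0, rng=None):
    """Training-time noise-condition dropout: with probability p_drop,
    replace the noise-level input by a null token so a single network
    learns both the conditional and unconditional predictions."""
    rng = rng if rng is not None else np.random.default_rng()
    return null_value if rng.random() < p_drop else sigma

# Sanity check: w = 0 recovers the unconditional branch, w = 1 the conditional one
eps_c = np.array([1.0, 0.0])
eps_u = np.array([0.0, 0.0])
```

One network with condition dropout replaces an external classifier, mirroring how classifier-free guidance removed the classifier from classifier guidance; here the "condition" is the noise level itself rather than a class label.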