log 1
The Implicit Bias of Adam and Muon on Smooth Homogeneous Neural Networks
We study the implicit bias of momentum-based optimizers on homogeneous models. We first extend existing results on the implicit bias of steepest descent in homogeneous models to normalized steepest descent with an optional learning rate schedule. We then show that for smooth homogeneous models, momentum steepest descent algorithms like Muon (spectral norm), MomentumGD ($\ell_2$ norm), and Signum ($\ell_\infty$ norm) are approximate steepest descent trajectories under a decaying learning rate schedule, proving that these algorithms too have a bias towards KKT points of the corresponding margin maximization problem. We extend the analysis to Adam (without the stability constant), which maximizes the $\ell_\infty$ margin, and to Muon-Signum and Muon-Adam, which maximize a hybrid norm. Our experiments corroborate the theory and show that the identity of the margin maximized depends on the choice of optimizer. Overall, our results extend earlier lines of work on steepest descent in homogeneous models and momentum-based optimizers in linear models.
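To make the notion of norm-dependent steepest descent concrete, here is a minimal illustrative sketch (not the paper's experiments or algorithms): on a toy separable linear problem with exponential loss, the normalized steepest descent direction w.r.t. the $\ell_2$ norm is $-g/\|g\|_2$ (normalized GD), while w.r.t. the $\ell_\infty$ norm it is $-\mathrm{sign}(g)$ (sign descent, i.e. Signum without momentum). Each trajectory is biased toward a large margin normalized by its own norm. All names and hyperparameters below are assumptions for illustration.

```python
import numpy as np

def loss_grad(w, X, y):
    """Gradient of the mean exponential loss exp(-y * Xw) for a linear model."""
    margins = y * (X @ w)
    coeff = -y * np.exp(-margins)
    return (X * coeff[:, None]).mean(axis=0)

# Toy separable data (hypothetical setup, not from the paper).
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 8))
w_star = rng.normal(size=8)
y = np.sign(X @ w_star)

w_l2 = np.zeros(8)
w_linf = np.zeros(8)
lr = 0.1
for _ in range(200):
    g2 = loss_grad(w_l2, X, y)
    w_l2 -= lr * g2 / (np.linalg.norm(g2) + 1e-12)  # l2 steepest descent
    gi = loss_grad(w_linf, X, y)
    w_linf -= lr * np.sign(gi)                      # l_infty steepest descent (sign descent)

# Margins normalized by the norm each method descends in:
# min_i y_i x_i . w / ||w||_2 for l2, and / ||w||_inf for l_infty.
m_l2 = (y * (X @ w_l2)).min() / np.linalg.norm(w_l2)
m_linf = (y * (X @ w_linf)).min() / np.linalg.norm(w_linf, np.inf)
print(m_l2, m_linf)
```

On separable data both normalized margins become positive; which margin ends up larger depends on the geometry the optimizer descends in, matching the abstract's point that the identity of the maximized margin depends on the optimizer.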
- Asia > Middle East > Jordan (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- Asia > Middle East > Israel (0.04)
- Asia > Middle East > Jordan (0.05)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- North America > United States > Georgia > Fulton County > Atlanta (0.05)
- North America > United States > District of Columbia > Washington (0.05)
- Asia > Middle East > Israel > Haifa District > Haifa (0.05)
- Asia > China (0.04)
- Europe > Slovakia > Bratislava > Bratislava (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.67)
Entropy testing and its application to testing Bayesian networks
This paper studies the problem of entropy identity testing: given sample access to a distribution p and a fully described distribution q (both discrete distributions over a domain of size k), and the promise that either p = q or |H(p) − H(q)| ≥ ε, where H(·) denotes the Shannon entropy, a tester needs to distinguish between the two cases with high probability.
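The promise problem above can be sketched with a simple plug-in tester: estimate H(p) from samples via the empirical distribution and compare it to the exactly computable H(q). This is an illustrative sketch only — the function names and the acceptance threshold of ε/2 are assumptions, not the paper's algorithm (whose point is to do this sample-efficiently).

```python
import numpy as np

def shannon_entropy(probs):
    """Shannon entropy in bits, ignoring zero-probability outcomes."""
    probs = np.asarray(probs, dtype=float)
    nz = probs[probs > 0]
    return float(-(nz * np.log2(nz)).sum())

def entropy_identity_test(samples, q, eps, k):
    """Accept 'p = q' iff the plug-in entropy estimate is within eps/2 of H(q).
    Under the promise |H(p) - H(q)| >= eps, a larger gap indicates p != q."""
    counts = np.bincount(samples, minlength=k)
    p_hat = counts / counts.sum()
    return abs(shannon_entropy(p_hat) - shannon_entropy(q)) < eps / 2

# Hypothetical demo: q uniform over k = 8 outcomes, so H(q) = 3 bits.
rng = np.random.default_rng(1)
k = 8
q = np.full(k, 1.0 / k)
same = rng.integers(0, k, size=20000)   # samples from p = q
point = np.zeros(20000, dtype=int)      # p is a point mass: H(p) = 0
print(entropy_identity_test(same, q, eps=1.0, k=k))   # expect True
print(entropy_identity_test(point, q, eps=1.0, k=k))  # expect False
```

The naive plug-in estimator needs many samples to be reliable for large k; the interesting question the paper addresses is the optimal sample complexity of this distinguishing task.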
- Asia > Middle East > Jordan (0.14)
- North America > United States > California > Alameda County > Berkeley (0.04)
- Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
- Information Technology > Security & Privacy (1.00)
- Information Technology > Communications > Social Media (0.68)
- Information Technology > Data Science > Data Mining (0.67)
- North America > United States > Massachusetts (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > Colorado > Denver County > Denver (0.04)
- North America > Canada > Ontario > Hamilton (0.04)
- Asia > Middle East > Jordan (0.04)