AITopics | overwhelming probability

overwhelming probability

High-dimensional Limit of SGD for Diagonal Linear Networks

Malaxechebarría, Begoña García, Paquette, Courtney, Fazel, Maryam, Drusvyatskiy, Dmitriy

arXiv.org Machine Learning, May-19-2026

Understanding the behavior of stochastic gradient methods is a central problem in modern machine learning. Recent work has highlighted diagonal linear networks as a simplified yet expressive setting for analyzing the optimization and generalization properties of neural models. In this work, we show that in the high-dimensional regime, stochastic gradient descent on diagonal linear networks is well-approximated by continuous dynamics governed by a stochastic differential equation (SDE), which explicitly decouples the drift from the gradient noise. We further derive a deterministic partial differential equation whose solution propagates the relevant state of the iterates and characterizes the time evolution of a broad class of observable statistics, including the risk, curvature, and other metrics for optimality. Finally, we show that, under a suitable parametrization, the stochastic dynamics are globally well posed and converge exponentially fast to zero risk with high probability, yielding a fully explicit non-asymptotic description of their long-time behavior. Numerical simulations corroborate our theoretical findings.
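
As a rough illustration of the setting in this abstract, the following minimal sketch runs one-pass SGD on a toy diagonal linear network. It is not the authors' construction: the parametrization `beta = u * v`, the sparse target `beta_star`, the isotropic Gaussian inputs, the noiseless labels, and all hyperparameters are assumptions chosen only to make the example small and runnable.

```python
import random


def sgd_diagonal_linear_network(dim=20, steps=4000, lr=0.01, seed=0):
    """One-pass SGD on a toy diagonal linear network (illustrative only)."""
    rng = random.Random(seed)
    # Hypothetical sparse ground truth: the first three coordinates are active.
    beta_star = [1.0 if i < 3 else 0.0 for i in range(dim)]
    # Two-layer diagonal parametrization: effective weights are beta = u * v.
    u = [0.5] * dim
    v = [0.5] * dim

    def population_risk():
        # For isotropic Gaussian inputs and noiseless labels,
        # E[0.5 * (pred - y)^2] = 0.5 * ||u * v - beta_star||^2.
        return 0.5 * sum((ui * vi - b) ** 2 for ui, vi, b in zip(u, v, beta_star))

    risks = [population_risk()]
    for _ in range(steps):
        # Draw a fresh sample at every step (one-pass / online SGD).
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        y = sum(b * xi for b, xi in zip(beta_star, x))
        pred = sum(ui * vi * xi for ui, vi, xi in zip(u, v, x))
        residual = pred - y
        # Gradients of 0.5 * residual^2 with respect to u and v.
        grad_u = [residual * vi * xi for vi, xi in zip(v, x)]
        grad_v = [residual * ui * xi for ui, xi in zip(u, x)]
        u = [ui - lr * g for ui, g in zip(u, grad_u)]
        v = [vi - lr * g for vi, g in zip(v, grad_v)]
        risks.append(population_risk())
    return risks


if __name__ == "__main__":
    risks = sgd_diagonal_linear_network()
    print(f"initial risk {risks[0]:.4f}, final risk {risks[-1]:.6f}")
```

The returned list is the population risk along the trajectory, which is the kind of observable statistic the abstract says the limiting dynamics describe.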

artificial intelligence, deep learning, machine learning, (15 more...)

arXiv:2605.17177

Country: North America > United States > New York (0.27)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.45)

e55653539c9a9aa096a1fc8ca77ff413-Paper-Conference.pdf

Neural Information Processing Systems, Feb-18-2026, 11:49:00 GMT

artificial intelligence, machine learning, natural language, (17 more...)

Country:

Europe > United Kingdom > England > Greater London > London (0.04)
Asia > Middle East > Lebanon (0.04)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.67)
Information Technology > Data Science (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)

Intensity Profile Projection: A Framework for Continuous-Time Representation Learning for Dynamic Networks

Neural Information Processing Systems, Feb-11-2026, 08:45:03 GMT

Moreover, we develop estimation theory providing tight control on the error of any estimated trajectory, indicating that the representations could even be used in quite noise-sensitive follow-on analyses.

artificial intelligence, machine learning, representation, (16 more...)

Country:

North America > United States (0.28)
South America > Paraguay > Asunción > Asunción (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > United Kingdom > England > Bristol (0.04)

Industry: Education (1.00)

Technology:

Information Technology > Communications (0.93)
Information Technology > Data Science (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

43f55776896a2e33239c2954519f605e-Supplemental-Conference.pdf

Neural Information Processing Systems, Feb-8-2026, 15:06:42 GMT

algorithm, log 2, poly, (17 more...)

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
(2 more...)

Exact Dynamics of Multi-class Stochastic Gradient Descent

Collins-Woodfin, Elizabeth, Seroussi, Inbar

arXiv.org Machine Learning, Oct-17-2025

We develop a framework for analyzing the training and learning rate dynamics on a variety of high-dimensional optimization problems trained using one-pass stochastic gradient descent (SGD) with data generated from multiple anisotropic classes. We give exact expressions for a large class of functions of the limiting dynamics, including the risk and the overlap with the true signal, in terms of a deterministic solution to a system of ODEs. We extend the existing theory of high-dimensional SGD dynamics to Gaussian-mixture data and a large (growing with the parameter size) number of classes. We then investigate in detail the effect of the anisotropic structure of the covariance of the data in the problems of binary logistic regression and least-squares loss. We study three cases: isotropic covariances, data covariance matrices with a large fraction of zero eigenvalues (denoted as the zero-one model), and covariance matrices with spectra following a power-law distribution. We show that there exists a structural phase transition. In particular, we demonstrate that, for the zero-one model and the power-law model with sufficiently large power, SGD tends to align more closely with values of the class mean that are projected onto the "clean directions" (i.e., directions of smaller variance). This is supported by both numerical simulations and analytical studies, which show the exact asymptotic behavior of the loss in the high-dimensional limit.
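
The binary logistic regression case with a zero-one covariance can be sketched in a few lines. This is only a toy under stated assumptions, not the paper's setup: two symmetric classes with means `+mu` and `-mu`, unit noise in the "noisy" coordinates and exactly zero noise in the first `clean` coordinates, and arbitrary values for `dim`, `lr`, and `steps`.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def one_pass_logistic_sgd(dim=20, clean=10, steps=3000, lr=0.05, seed=0):
    """One-pass SGD for binary logistic regression on a two-class mixture."""
    rng = random.Random(seed)
    mu = [1.0 / math.sqrt(dim)] * dim  # hypothetical class mean, unit norm
    w = [0.0] * dim
    for _ in range(steps):
        label = 1 if rng.random() < 0.5 else -1
        # Zero-one style covariance: no noise in the first `clean`
        # coordinates, unit-variance Gaussian noise in the rest.
        x = [
            label * mu[i] + (0.0 if i < clean else rng.gauss(0.0, 1.0))
            for i in range(dim)
        ]
        margin = label * sum(wi * xi for wi, xi in zip(w, x))
        # Gradient of log(1 + exp(-margin)) in w is -label * sigmoid(-margin) * x.
        coeff = -label * sigmoid(-margin)
        w = [wi - lr * coeff * xi for wi, xi in zip(w, x)]
    return w, mu


if __name__ == "__main__":
    w, mu = one_pass_logistic_sgd()
    overlap = sum(wi * mi for wi, mi in zip(w, mu))
    clean_mass = sum(wi * wi for wi in w[:10])
    noisy_mass = sum(wi * wi for wi in w[10:])
    print(f"overlap {overlap:.3f}, clean {clean_mass:.3f}, noisy {noisy_mass:.3f}")
```

Comparing the squared weight mass in the clean and noisy coordinates gives a crude view of the alignment with clean directions that the abstract describes.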

artificial intelligence, def, machine learning, (17 more...)

arXiv:2510.14074

Country:

North America > United States > Oregon > Lane County > Eugene (0.14)
Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.04)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.86)
Research Report > Experimental Study (0.54)

Industry: Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

e55653539c9a9aa096a1fc8ca77ff413-Paper-Conference.pdf

Neural Information Processing Systems, Oct-10-2025, 19:43:16 GMT

approximation, kernel, matrix, (14 more...)

Country:

Europe > United Kingdom > England > Greater London > London (0.04)
Asia > Middle East > Lebanon (0.04)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.67)
Information Technology > Data Science (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)

The High Line: Exact Risk and Learning Rate Curves of Stochastic Adaptive Learning Rate Algorithms

Neural Information Processing Systems, Oct-9-2025, 18:20:54 GMT

We give exact expressions for the risk and learning rate curves in terms of a deterministic solution to a system of ODEs.
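
To make "risk and learning rate curves" concrete, here is a minimal sketch of AdaGrad-Norm (one of the tags on this entry) applied to streaming least squares. It uses the standard textbook update, in which a single scalar accumulates squared gradient norms and the step size is `eta / b`. The dimension-dependent scaling studied in the paper is not reproduced, and `w_star`, `eta`, `b0`, and the data model are assumptions made for illustration.

```python
import math
import random


def adagrad_norm_least_squares(dim=20, steps=2000, eta=0.5, b0=1.0, seed=0):
    """AdaGrad-Norm on streaming least squares; returns risk and step-size curves."""
    rng = random.Random(seed)
    w_star = [1.0 / math.sqrt(dim)] * dim  # hypothetical ground truth
    w = [0.0] * dim
    b_sq = b0 ** 2  # running sum of squared gradient norms

    def risk():
        # Population risk for isotropic Gaussian inputs and noiseless labels.
        return 0.5 * sum((wi - ti) ** 2 for wi, ti in zip(w, w_star))

    risks, step_sizes = [risk()], []
    for _ in range(steps):
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        residual = sum((wi - ti) * xi for wi, ti, xi in zip(w, w_star, x))
        grad = [residual * xi for xi in x]
        # Accumulate the squared gradient norm, then take a normalized step.
        b_sq += sum(g * g for g in grad)
        step = eta / math.sqrt(b_sq)
        w = [wi - step * g for wi, g in zip(w, grad)]
        step_sizes.append(step)
        risks.append(risk())
    return risks, step_sizes


if __name__ == "__main__":
    risks, step_sizes = adagrad_norm_least_squares()
    print(f"final risk {risks[-1]:.3e}, final step size {step_sizes[-1]:.4f}")
```

Because the accumulator only grows, the step-size curve is non-increasing by construction; the two returned lists are the empirical counterparts of the curves the summary refers to.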

adagrad-norm, algorithm, equation, (14 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > Canada > Quebec > Montreal (0.04)
North America > United States (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.67)

Industry: Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.45)

487667c56596138d36bbaa3bd8aac6df-Paper-Conference.pdf

Neural Information Processing Systems, Oct-8-2025, 14:58:54 GMT

matrix, representation, trajectory, (14 more...)

Country:

North America > United States (0.28)
South America > Paraguay > Asunción > Asunción (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > United Kingdom > England > Bristol (0.04)

Industry: Education (1.00)

Technology:

Information Technology > Communications (0.93)
Information Technology > Data Science (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Near-Optimal Private and Scalable k-Clustering

Neural Information Processing Systems, Aug-14-2025, 12:07:20 GMT

This high demand has stimulated an important research effort to design algorithmic techniques enabling privacy-preserving algorithms.

algorithm, log 2, poly, (17 more...)

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Better Rates for Private Linear Regression in the Proportional Regime via Aggressive Clipping

Bombari, Simone, Seroussi, Inbar, Mondelli, Marco

arXiv.org Machine Learning, May-23-2025

Differentially private (DP) linear regression has received significant attention in the recent theoretical literature, with several works aimed at obtaining improved error rates. A common approach is to set the clipping constant much larger than the expected norm of the per-sample gradients. While this simplifies the analysis, it is in sharp contrast with the settings that empirical evidence suggests optimize performance. Our work bridges this gap between theory and practice: we provide sharper rates for DP stochastic gradient descent (DP-SGD) by crucially operating in a regime where clipping happens frequently. Specifically, we consider the setting where the data is multivariate Gaussian, the number of training samples $n$ is proportional to the input dimension $d$, and the algorithm guarantees constant-order zero concentrated DP. Our method relies on establishing a deterministic equivalent for the trajectory of DP-SGD in terms of a family of ordinary differential equations (ODEs). As a consequence, the risk of DP-SGD is bounded between two ODEs, with upper and lower bounds matching for isotropic data. By studying these ODEs when $n / d$ is large enough, we demonstrate the optimality of aggressive clipping, and we uncover the benefits of decaying learning rate and private noise scheduling.
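
The mechanics of clipping in DP-SGD can be sketched as follows. This is a generic single-sample variant with Gaussian noise added after clipping, not the paper's algorithm or analysis; `clip_norm`, `noise_multiplier`, the label-noise level, and the other parameters are hypothetical values, and the code performs no privacy accounting, so it makes no claim about the privacy level achieved.

```python
import math
import random


def clip(grad, clip_norm):
    """Rescale grad so its Euclidean norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= clip_norm:
        return list(grad), False
    scale = clip_norm / norm
    return [g * scale for g in grad], True


def dp_sgd_linear_regression(dim=10, n=200, lr=0.05, clip_norm=0.5,
                             noise_multiplier=1.0, seed=0):
    """One pass of clipped, noised SGD on Gaussian linear regression data."""
    rng = random.Random(seed)
    w_star = [1.0 / math.sqrt(dim)] * dim  # hypothetical ground truth
    w = [0.0] * dim
    clipped_steps = 0
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        y = sum(t * xi for t, xi in zip(w_star, x)) + 0.1 * rng.gauss(0.0, 1.0)
        residual = sum(wi * xi for wi, xi in zip(w, x)) - y
        grad = [residual * xi for xi in x]
        # Clip the per-sample gradient, then add Gaussian noise whose
        # standard deviation is proportional to the clipping constant.
        grad, was_clipped = clip(grad, clip_norm)
        if was_clipped:
            clipped_steps += 1
        noisy = [g + noise_multiplier * clip_norm * rng.gauss(0.0, 1.0) for g in grad]
        w = [wi - lr * g for wi, g in zip(w, noisy)]
    return w, clipped_steps / n


if __name__ == "__main__":
    w, clipped_fraction = dp_sgd_linear_regression()
    print(f"fraction of clipped steps: {clipped_fraction:.2f}")
```

The returned fraction of clipped steps is a simple way to check whether a given `clip_norm` puts the run in a regime where clipping happens frequently, as opposed to the large-constant regime the abstract argues against.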

artificial intelligence, machine learning, probability, (16 more...)

arXiv:2505.16329

Country: