AITopics | lemma 7




Uniform-in-Time Weak Propagation-of-Chaos in Shallow Neural Networks

Glasgow, Margalit, Bruna, Joan

arXiv.org Machine Learning, May-22-2026

We consider one-hidden-layer neural networks trained in the feature-learning regime using gradient descent, and relate the output of the finite-width network $f_{\hat{ρ}_t^m}$ to its infinite-width counterpart $f_{ρ_t^{MF}}$, which evolves under the mean-field dynamics. While constant-time horizon bounds for $\|f_{ρ_t^{MF}} - f_{\hat{ρ}_t^m}\|$ may be obtained via standard Grönwall estimates, the long-time behavior of the fluctuation is a more delicate matter. Uniform-in-time bounds often rely on (local) strong convexity in the landscape or logarithmic Sobolev inequalities present in noisy gradient dynamics. In this work, we establish non-asymptotic weak propagation-of-chaos that holds uniformly in time, obtained by exploiting instead the convergence rate of the mean-field deterministic Wasserstein-gradient-flow dynamics. Specifically, denoting by $L_t$ the mean-field excess MSE loss at time $t$ and by $m$ the number of neurons, under standard regularity assumptions and the condition $\int_0^\infty L_t^{1/2} \, dt = O(\log d)$, we obtain the uniform-in-time bound $\|f_{ρ_t^{MF}} - f_{\hat{ρ}_t^m}\|^2 \lesssim \text{poly}(d) m^{-\min(1,c/6)}$ whenever $L_t \lesssim t^{-c}$. Our result holds in a noiseless setting and does not make any assumptions on the geometry of the landscape near the optimum, and extends seamlessly to other forms of discretization, including a finite number of samples and time discretization. A key takeaway of our result is that whenever the convergence rate of the mean-field, population-loss dynamics is faster than $t^{-2}$, we can attain a loss of $ε$ with only $\text{poly}(d/ε)$ neurons, training samples, and GD steps.
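As a hedged aside, the $t^{-2}$ threshold and the polynomial neuron count can be read off from the quantities stated in the abstract. The sketch below assumes $L_t \le C t^{-c}$ for $t \ge 1$, where $C$ is a hypothetical constant not named in the source, and it ignores how constants depend on $d$. It is a back-of-the-envelope check, not the authors' proof.

```latex
% Integrability of L_t^{1/2} under a polynomial rate (C is an assumed constant):
\int_1^\infty L_t^{1/2}\,dt \;\le\; C^{1/2}\int_1^\infty t^{-c/2}\,dt,
\qquad
\int_1^\infty t^{-c/2}\,dt \;=\; \frac{2}{c-2} \ \text{ for } c > 2,
\quad \text{and diverges for } c \le 2 .

% Neurons needed for squared error at most \varepsilon from the stated bound:
\mathrm{poly}(d)\, m^{-\min(1,\,c/6)} \le \varepsilon
\quad\Longleftrightarrow\quad
m \;\ge\; \left(\frac{\mathrm{poly}(d)}{\varepsilon}\right)^{\max(1,\,6/c)},
\qquad \max(1,\,6/c) < 3 \ \text{ when } c > 2 .
```

Under these assumptions the exponent stays below 3 whenever $c > 2$, which is consistent with the $\text{poly}(d/ε)$ neuron count in the stated takeaway.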

artificial intelligence, machine learning, mft, (18 more...)


2605.2201

Genre: Research Report (0.90)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)


On efficient robust regression with subquadratic samples

Adil, Deeksha, Błasiok, Jarosław, Chen, Hongjie, Sridharan, Deepak Narayanan

arXiv.org Machine Learning, May-19-2026

We revisit the problem of robust linear regression under Gaussian covariates with an unknown covariance matrix of condition number $κ$. For this fundamental problem, significant gaps remain in our understanding of the trade-offs among sample complexity, condition number, runtime, and prediction error for efficient algorithms. Our first result is a near-linear-time algorithm that uses $\widetilde{O}(d/ε^4)$ samples, where $d$ is the dimension and $ε$ is the corruption rate, and achieves prediction error $O(\sqrt{εκ})$ under the condition $εκ \lesssim 1$, improving over all prior works. We complement this result with a Statistical Query (SQ) lower bound showing that efficient SQ algorithms achieving error $o(\sqrt{εκ})$ when $εκ \lesssim 1$ require queries that take $Ω(d^2)$ samples to simulate. Finally, we prove a low-degree polynomial lower bound that gives fine-grained evidence that, without assumptions such as $εκ \lesssim 1$, efficient algorithms may require $\widetilde{Ω}\left(\min\{dε^{2}κ^{2},\ ε^{2}d^{2}\}\right)$ samples to significantly outperform the trivial estimator that always guesses $0$.

artificial intelligence, machine learning, survey article, (20 more...)


2605.18042

Country: North America > United States (0.27)

Genre:

Research Report (0.50)
Overview (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)


Supplementary Material for: An Exponential Lower Bound for Linearly-Realizable MDPs with Constant Suboptimality Gap

Neural Information Processing Systems, Apr-25-2026, 20:32:50 GMT

We first verify the statement for the terminal state $f$. Observe that at the terminal state $f$, regardless of the action taken, the next state is always $f$ and the reward is always 0. Hence $Q_h(f,\cdot) = V_h(f) = 0$ for all $h \in [H]$. Thus $Q_h(f,\cdot) = \langle φ(f,\cdot), v(a) \rangle = 0$. We now verify realizability for other states via induction on $h = H, H-1, \ldots, 1$. Next, note that h, (2) follows from (1). In other words, (1) implies that a is always the optimal action.

artificial intelligence, machine learning, probability, (17 more...)


Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


33d6548e48d4318ceb0e3916a79afc84-Supplemental.pdf

Neural Information Processing Systems, Apr-25-2026, 10:29:14 GMT

artificial intelligence, machine learning, probability, (19 more...)


Technology: Information Technology > Artificial Intelligence > Machine Learning (0.93)


1c71cd4032da425409d8ada8727bad42-Supplemental-Conference.pdf

Neural Information Processing Systems, Apr-24-2026, 23:09:19 GMT

We can see that the error for the first term is mainly due to the sample approximation. We therefore refer to the first term as the Variance. We refer to the second term as the Bias. Our proof of convergence of the bias adapts the proof in [31, Theorem 6] and [11], and utilizes the fact that $C_{Y|X}$ is Hilbert-Schmidt to obtain a sharp rate.

A.1 Bounding the Bias

In this section, we establish the bound on the bias.

artificial intelligence, conditional distribution, machine learning, (18 more...)


Technology: Information Technology > Artificial Intelligence > Machine Learning (0.69)


where the last inequality follows from the fact that $U_{ij} \le 1$. Also, for any $i \in [n]$ and $j \in [k]$, we have $x_i$ $\hat{\mu}_j$

Neural Information Processing Systems, Apr-24-2026, 12:33:15 GMT

To prove Lemma 2 we start by proving a few inequalities. Since $A$ is an ( 1, 2, Q)-solver, using Definition 4 and Taylor's expansion, we get for any $i \in [n]$ and $j \in [k]$,

In this section we present and prove a few auxiliary results which will be used in the proofs of our main results. We start with the following standard concentration inequalities. R2, (32) if n clog(1/δ)2, where $c > 0$ is some absolute constant.

The following locality lemma states that the fuzzy k-means function is strictly increasing.

Lemma 5. Let $(X, P^*)$ be a clustering instance, where $P^*$ refers to the optimal solution for the fuzzy k-means problem (namely, minimizes the objective in (2)).

Output: $\hat{\mu}_j$
1: Initialize $S \leftarrow \emptyset$.
2: for $s = 1, 2, \ldots, m$ do
3: Sample $i$ uniformly at random from $[n]$ and update $S \leftarrow S \cup \{i\}$.

Next, we analyze the performance of Algorithm 6, which estimates the center of a given cluster using a set of randomly sampled elements. Note that this algorithm is used as a sub-routine in Algorithm 1.

Lemma 6 (Estimate of mean using uniform sampling). Let $(X, P)$ be a consistent center-based clustering instance, and let $δ \in (0,1)$.
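The sampling loop in the excerpt above can be illustrated with a short Python sketch. The excerpt cuts off before the step that turns the sample $S$ into the output $\hat{\mu}_j$, so the final averaging step here is an assumption: a plain unweighted mean over the sampled points, with $S$ treated as a multiset. The function name `estimate_center` and the `seed` parameter are hypothetical and do not come from the paper.

```python
import random
from typing import List, Optional, Sequence


def estimate_center(
    points: Sequence[Sequence[float]],
    m: int,
    seed: Optional[int] = None,
) -> List[float]:
    """Estimate a center from m uniformly sampled points.

    ASSUMPTION: the paper's algorithm is truncated in the excerpt; this
    sketch averages the sampled points without weights.
    """
    if m <= 0 or not points:
        raise ValueError("need m > 0 and a non-empty point set")
    rng = random.Random(seed)
    n = len(points)
    # Steps 1-3 of the excerpt: draw m indices uniformly at random from [n]
    # (with replacement, so S is treated as a multiset here).
    sample = [rng.randrange(n) for _ in range(m)]
    dim = len(points[0])
    # Assumed final step: coordinate-wise mean of the sampled points.
    return [sum(points[i][c] for i in sample) / m for c in range(dim)]


if __name__ == "__main__":
    data = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
    print(estimate_center(data, m=50, seed=0))
```

A fixed `seed` makes the draw reproducible, which helps when comparing the estimate against the exact mean of the point set.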

algorithm, artificial intelligence, machine learning, (18 more...)


Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)


Denoising distances beyond the volumetric barrier

Huang, Han, Jiradilok, Pakawut, Mossel, Elchanan

arXiv.org Machine Learning, Apr-2-2026

We study the problem of reconstructing the latent geometry of a $d$-dimensional Riemannian manifold from a random geometric graph. While recent works have made significant progress in manifold recovery from random geometric graphs, and more generally from noisy distances, the precision of pairwise distance estimation has been fundamentally constrained by the volumetric barrier, namely the natural sample-spacing scale $n^{-1/d}$ coming from the fact that a generic point of the manifold typically lies at a distance of order $n^{-1/d}$ from the nearest sampled point. In this paper, we introduce a novel approach, Orthogonal Ring Distance Estimation Routine (ORDER), which achieves a pointwise distance estimation precision of order $n^{-2/(d+5)}$ up to polylogarithmic factors in $n$ in polynomial time. This strictly beats the volumetric barrier for dimensions $d > 5$. As a consequence of obtaining pointwise precision better than $n^{-1/d}$, we prove that the Gromov-Wasserstein distance between the reconstructed metric measure space and the true latent manifold is of order $n^{-1/d}$. This matches the Wasserstein convergence rate of empirical measures, demonstrating that our reconstructed graph metric is asymptotically as good as having access to the full pairwise distance matrix of the sampled points. Our results are proven in a very general setting which includes general models of noisy pairwise distances, sparse random geometric graphs, and unknown connection probability functions.
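The dimension threshold $d > 5$ follows from comparing the two exponents quoted in the abstract. The short check below ignores the polylogarithmic factors and is only meant to make the arithmetic explicit.

```latex
% Compare the ORDER precision n^{-2/(d+5)} with the volumetric barrier n^{-1/d}, for n > 1:
n^{-2/(d+5)} < n^{-1/d}
\iff \frac{2}{d+5} > \frac{1}{d}
\iff 2d > d + 5
\iff d > 5 .

% Example: d = 10 gives precision n^{-2/15} against the barrier n^{-1/10},
% and 2/15 > 1/10.
```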

artificial intelligence, machine learning, pakawutjiradilok, (15 more...)


2604.00432

Country:

North America > United States > Rhode Island > Providence County > Providence (0.04)
North America > United States > Missouri > Boone County > Columbia (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)


ac8ec9b4d94c03f0af8c4fe3d5fad4fd-Paper-Conference.pdf

Neural Information Processing Systems, Feb-17-2026, 09:23:18 GMT

machine learning, natural language, optimization, (17 more...)


Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > China > Guangxi Province > Nanning (0.04)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Vision (0.93)
Information Technology > Artificial Intelligence > Natural Language (0.68)


On the Adversarial Robustness of Benjamini Hochberg

Neural Information Processing Systems, Feb-17-2026, 05:02:31 GMT

The Benjamini-Hochberg (BH) procedure is widely used to control the false discovery rate (FDR) in multiple testing.
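For readers unfamiliar with the procedure, here is a minimal Python sketch of the standard BH step-up rule: sort the p-values, find the largest rank $k$ with $p_{(k)} \le kq/n$, and reject the hypotheses with the $k$ smallest p-values. The function name `benjamini_hochberg` and the parameter `q` (the target FDR level) are illustrative choices, not taken from the paper, and the sketch does not model the adversarial setting the paper studies.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float], q: float = 0.05) -> List[bool]:
    """Return one rejection flag per hypothesis, in input order."""
    n = len(p_values)
    if n == 0:
        return []
    # Indices of the p-values sorted in ascending order.
    order = sorted(range(n), key=lambda i: p_values[i])
    # Step-up rule: find the largest 1-based rank k with p_(k) <= k * q / n.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / n:
            cutoff_rank = rank
    # Reject the hypotheses holding the cutoff_rank smallest p-values,
    # including any that individually missed their own threshold.
    rejected = [False] * n
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    pvals = [0.02, 0.03, 0.035, 0.9]
    print(benjamini_hochberg(pvals, q=0.05))
```

Because the rule is step-up, a small p-value that fails its own threshold can still be rejected when a later rank passes. This is why the loop keeps scanning instead of stopping at the first failure.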

artificial intelligence, data mining, machine learning, (17 more...)


Country:

Europe > Austria > Vienna (0.14)
North America > United States > California > Monterey County > Monterey (0.04)
North America > United States > Maryland > Baltimore (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > Experimental Study (0.73)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)


Parameterized Approximation Schemes for Fair-Range Clustering

Neural Information Processing Systems, Feb-15-2026, 17:19:42 GMT

Fair-range clustering imposes lower and upper bounds on the number of facilities opened for each label, ensuring fair representation of all demographic groups by the selected facilities.
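The per-label constraint described above can be checked with a few lines of Python. This is only a feasibility check under assumed data structures, not the paper's approximation scheme: `opened_labels` (one label per opened facility), `bounds` (label to `(lower, upper)` pair), and the function name `satisfies_fair_range` are all hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, Tuple


def satisfies_fair_range(
    opened_labels: Iterable[Hashable],
    bounds: Dict[Hashable, Tuple[int, int]],
) -> bool:
    """Check lower <= (#opened facilities with this label) <= upper per label."""
    counts = Counter(opened_labels)
    # ASSUMPTION: labels that do not appear in `bounds` are unconstrained.
    return all(
        lower <= counts.get(label, 0) <= upper
        for label, (lower, upper) in bounds.items()
    )


if __name__ == "__main__":
    limits = {"group_a": (1, 2), "group_b": (1, 1)}
    print(satisfies_fair_range(["group_a", "group_a", "group_b"], limits))
```

A label with a positive lower bound and no opened facility fails the check, which is the case the lower bounds are meant to rule out.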

artificial intelligence, data mining, machine learning, (18 more...)


Country: Asia > China > Hunan Province (0.04)

Genre: Research Report (0.93)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.50)
