AITopics | bayes-optimal error

Collaborating Authors

bayes-optimal error

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Sharp feature-learning transitions and Bayes-optimal neural scaling laws in extensive-width networks

Nguyen, Minh-Toan, Barbier, Jean

arXiv.org Machine LearningMay-13-2026

We study the information-theoretic limits of learning a one-hidden-layer teacher network with hierarchical features from noisy queries, in the context of knowledge transfer to a smaller student model. We work in the high-dimensional regime where the teacher width $k$ scales linearly with the input dimension $d$ -- a setting that captures large-but-finite-width networks and has only recently become analytically tractable. Using a heuristic leave-one-out decoupling argument, validated numerically throughout, we derive asymptotically sharp characterizations of the Bayes-optimal generalization error and individual feature overlaps via a system of closed fixed-point equations. These equations reveal that feature learnability is governed by a sequence of sharp phase transitions: as data grows, teacher features become recoverable sequentially, each through a discontinuous jump in overlap. This sequential acquisition underlies a precise notion of \textit{effective width} $k_c$ -- the number of learnable features at a given data budget $n$ -- which unifies two distinct scaling regimes: a feature-learning regime in which the Bayes-optimal generalization error $\varepsilon^{\rm BO}$ scales as $ n^{1/(2β)-1}$, and a refinement regime in which it scales as $n^{-1}$, where $β>1/2$ is the exponent of the power-law feature hierarchy. Both laws collapse to the single relation $\varepsilon^{\rm BO}=Θ(k_c d/n)$. We further show empirically that a student trained with \textsc{Adam} near the effective width $k_c$ achieves these optimal scaling laws (up to a small algorithmic gap), and provide an information-theoretic account of the associated scaling in model size.

artificial intelligence, machine learning, regime, (15 more...)

arXiv.org Machine Learning

2605.10395

Country: Europe (0.28)

Genre: Research Report (0.40)

Industry: Education (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Classifier-independent Lower-Bounds for Adversarial Robustness

Dohmatob, Elvis

arXiv.org Machine LearningNov-9-2020

We theoretically analyse the limits of robustness to test-time adversarial and noisy examples in classification. Our work focuses on deriving bounds which uniformly apply to all classifiers (i.e all measurable functions from features to labels) for a given problem. Our contributions are two-fold. (1) We use optimal transport theory to derive variational formulae for the Bayes-optimal error a classifier can make on a given classification problem, subject to adversarial attacks. The optimal adversarial attack is then an optimal transport plan for a certain binary cost-function induced by the specific attack model, and can be computed via a simple algorithm based on maximal matching on bipartite graphs. (2) We derive explicit lower-bounds on the Bayes-optimal error in the case of the popular distance-based attacks. These bounds are universal in the sense that they depend on the geometry of the class-conditional distributions of the data, but not on a particular classifier. Our results are in sharp contrast with the existing literature, wherein adversarial vulnerability of classifiers is derived as a consequence of nonzero ordinary test error.

adversarial attack, bayes-optimal error, classifier, (16 more...)

arXiv.org Machine Learning

2006.09989

Country:

North America > United States > California > Los Angeles County > Long Beach (0.04)
Asia > Middle East > Israel > Haifa District > Haifa (0.04)

Genre: Research Report (0.84)

Industry:

Government > Military (0.56)
Information Technology > Security & Privacy (0.56)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

The role of regularization in classification of high-dimensional noisy Gaussian mixture

Mignacco, Francesca, Krzakala, Florent, Lu, Yue M., Zdeborová, Lenka

arXiv.org Machine LearningFeb-26-2020

We consider a high-dimensional mixture of two Gaussians in the noisy regime where even an oracle knowing the centers of the clusters misclassifies a small but finite fraction of the points. We provide a rigorous analysis of the generalization error of regularized convex classifiers, including ridge, hinge and logistic regression, in the high-dimensional limit where the number $n$ of samples and their dimension $d$ go to infinity while their ratio is fixed to $\alpha= n/d$. We discuss surprising effects of the regularization that in some cases allows to reach the Bayes-optimal performances. We also illustrate the interpolation peak at low regularization, and analyze the role of the respective sizes of the two clusters.

equation, generalization error, regularization, (16 more...)

arXiv.org Machine Learning

2002.11544

Country:

Europe > France > Île-de-France > Paris > Paris (0.04)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.66)

Add feedback