AITopics | steinwart

Collaborating Authors

steinwart

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

The Adversarial Consistency of Surrogate Risks for Binary Classification

Neural Information Processing SystemsApr-28-2026, 20:00:16 GMT

We study the consistency of surrogate risks for robust binary classification. It is common to learn robust classifiers by adversarial training, which seeks to minimize the expected 0-1 loss when each example can be maliciously corrupted within a small ball. We give a simple and complete characterization of the set of surrogate loss functions that are consistent, i.e., that can replace the 0-1loss without affecting the minimizing sequences of the original adversarial risk, for any data distribution. We also prove a quantitative version of adversarial consistency for the ρ-margin loss. Our results reveal that the class of adversarially consistent surrogates is substantially smaller than in the standard setting, where many common surrogates are known to be consistent.

artificial intelligence, consistency, machine learning, (15 more...)

Neural Information Processing Systems

Country: North America > United States > New York (0.14)

Genre: Research Report > New Finding (0.34)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Calibration and Consistency of Adversarial Surrogate Losses

Neural Information Processing SystemsApr-25-2026, 21:41:53 GMT

Adversarial robustness is an increasingly critical property of classifiers in applications. The design of robust algorithms relies on surrogate losses since the optimization of the adversarial loss with most hypothesis sets is NP-hard. But, which surrogate losses should be used and when do they benefit from theoretical guarantees? We present an extensive study of this question, including a detailed analysis of the H-calibration and H-consistency of adversarial surrogate losses. We show that convex loss functions, or the supremum-based convex losses often used in applications, are not H-calibrated for common hypothesis sets used in machine learning.

artificial intelligence, hypothesis, machine learning, (17 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.66)

Add feedback

514a70448c235ccb8b6842ef5e02ad3b-Paper.pdf

Neural Information Processing SystemsApr-25-2026, 21:41:50 GMT

artificial intelligence, machine learning, surrogate loss, (14 more...)

Neural Information Processing Systems

Country: North America > United States (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Add feedback

Learning-to-Defer with Expert-Conditioned Advice

Montreuil, Yannis, Montreuil, Leïna, Carlier, Axel, Ng, Lai Xing, Ooi, Wei Tsang

arXiv.org Machine LearningMar-20-2026

Learning-to-Defer routes each input to the expert that minimizes expected cost, but it assumes that the information available to every expert is fixed at decision time. Many modern systems violate this assumption: after selecting an expert, one may also choose what additional information that expert should receive, such as retrieved documents, tool outputs, or escalation context. We study this problem and call it Learning-to-Defer with advice. We show that a broad family of natural separated surrogates, which learn routing and advice with distinct heads, is inconsistent even in the smallest non-trivial setting. We then introduce an augmented surrogate that operates on the composite expert--advice action space and prove an $\mathcal{H}$-consistency guarantee together with an excess-risk transfer bound, yielding recovery of the Bayes-optimal policy in the limit. Experiments on tabular, language, and multi-modal tasks show that the resulting method improves over standard Learning-to-Defer while adapting its advice-acquisition behavior to the cost regime; a synthetic benchmark confirms the failure mode predicted for separated surrogates.

justification, large language model, machine learning, (20 more...)

arXiv.org Machine Learning

2603.14324

Country:

Asia > Singapore (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
Europe > Monaco (0.04)
Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.46)

Add feedback

Self-Regularized Learning Methods

Schölpple, Max, Fanghui, Liu, Steinwart, Ingo

arXiv.org Machine LearningMar-19-2026

We introduce a general framework for analyzing learning algorithms based on the notion of self-regularization, which captures implicit complexity control without requiring explicit regularization. This is motivated by previous observations that many algorithms, such as gradient-descent based learning, exhibit implicit regularization. In a nutshell, for a self-regularized algorithm the complexity of the predictor is inherently controlled by that of the simplest comparator achieving the same empirical risk. This framework is sufficiently rich to cover both classical regularized empirical risk minimization and gradient descent. Building on self-regularization, we provide a thorough statistical analysis of such algorithms including minmax-optimal rates, where it suffices to show that the algorithm is self-regularized -- all further requirements stem from the learning problem itself. Finally, we discuss the problem of data-dependent hyperparameter selection, providing a general result which yields minmax-optimal rates up to a double logarithmic factor and covers data-driven early stopping for RKHS-based gradient descent.

artificial intelligence, assumption, machine learning, (17 more...)

arXiv.org Machine Learning

2603.1716

Country:

Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
Asia > Singapore (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.40)

Industry: Education (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.74)

Add feedback

Calibration and Consistency of Adversarial Surrogate Losses

Neural Information Processing SystemsFeb-8-2026, 16:06:21 GMT

artificial intelligence, hypothesis, machine learning, (17 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Ohio (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.45)

Add feedback

514a70448c235ccb8b6842ef5e02ad3b-Paper.pdf

Neural Information Processing SystemsFeb-8-2026, 16:06:18 GMT

arxiv preprint arxiv, hypothesis, surrogate loss, (13 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Ohio (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

Nonparametric Instrumental Variable Regression with Observed Covariates

Shen, Zikai, Chen, Zonghao, Meunier, Dimitri, Steinwart, Ingo, Gretton, Arthur, Li, Zhu

arXiv.org Machine LearningNov-25-2025

We study the problem of nonparametric instrumental variable regression with observed covariates, which we refer to as NPIV-O. Compared with standard nonparametric instrumental variable regression (NPIV), the additional observed covariates facilitate causal identification and enables heterogeneous causal effect estimation. However, the presence of observed covariates introduces two challenges for its theoretical analysis. First, it induces a partial identity structure, which renders previous NPIV analyses - based on measures of ill-posedness, stability conditions, or link conditions - inapplicable. Second, it imposes anisotropic smoothness on the structural function. To address the first challenge, we introduce a novel Fourier measure of partial smoothing; for the second challenge, we extend the existing kernel 2SLS instrumental variable algorithm with observed covariates, termed KIV-O, to incorporate Gaussian kernel lengthscales adaptive to the anisotropic smoothness. We prove upper $L^2$-learning rates for KIV-O and the first $L^2$-minimax lower learning rates for NPIV-O. Both rates interpolate between known optimal rates of NPIV and nonparametric regression (NPR). Interestingly, we identify a gap between our upper and lower bounds, which arises from the choice of kernel lengthscales tuned to minimize a projected risk. Our theoretical analysis also applies to proximal causal inference, an emerging framework for causal effect estimation that shares the same conditional moment restriction as NPIV-O.

assumption 2, assumption 4, regression, (13 more...)

arXiv.org Machine Learning

2511.19404

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)
Asia > Middle East > Jordan (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

81858558b55a8c63763cfe088090242a-Paper-Conference.pdf

Neural Information Processing SystemsNov-18-2025, 20:42:56 GMT

artificial intelligence, consistency, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > New York > New York County > New York City (0.04)
Asia > Middle East > Jordan (0.04)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

On the Asymptotic Learning Curves of Kernel Ridge Regression under Power-law Decay

Neural Information Processing SystemsOct-9-2025, 02:29:08 GMT

The widely observed'benign overfitting phenomenon' in the neural network literature raises the challenge to the'bias-variance trade-off' doctrine in the statistical learning theory. Since the generalization ability of the'lazy trained' over-parametrized neural network can be well approximated by that of the neural tangent

artificial intelligence, machine learning, nullt 1 2, (18 more...)

Neural Information Processing Systems

Country:

Asia > China > Beijing > Beijing (0.04)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)
North America > United States > Wisconsin (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.50)

Add feedback