AITopics | uniform error

Collaborating Authors

uniform error

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Beyond Least Squares: Uniform Approximation and the Hidden Cost of Misspecification

Neural Information Processing SystemsJun-23-2026, 03:11:44 GMT

We study the problem of controlling worst-case errors in misspecified linear regression under the random design setting, where the regression function is estimated via (penalized) least-squares. This setting arises naturally in value function approximation for bandit algorithms and reinforcement learning (RL). Our first main contribution is the observation that the amplification of the misspecification error when using least-squares is governed by the Lebesgue constant, a classical quantity from approximation theory that depends on the choice of the feature subspace and the covariate distribution. We also show that this dependence on the misspecification error is tight for least-squares regression: in general, no method minimizing the empirical squared loss, including regularized least-squares, can improve it substantially. We argue this explains the empirical observation that some feature-maps (e.g., those derived from the Fourier bases) "work better in RL" than others (e.g., polynomials): given some covariate distribution, the Lebesgue constant is known to be highly sensitive to choice of the feature-map. As a second contribution, we propose a method that augments the original feature set with auxiliary features designed to reduce the error amplification. We then prove that the method successfully competes with an "oracle" that knows the best way of using the auxiliary features to reduce this amplification. For example, when the domain is a real interval and the features are monomials, our method reduces the amplification factor to O(1)as d, while without our method, least-squares with the monomials (and in fact polynomials) will suffer a worst-case error amplification of order Ω(d). It follows that there are functions and feature maps for which our method is consistent, while least-squares is inconsistent.

artificial intelligence, lebesgue constant, machine learning, (17 more...)

Neural Information Processing Systems

Country:

Europe (0.28)
North America (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry: Banking & Finance (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)

Add feedback

Uniform Error Bounds for Gaussian Process Regression with Application to Safe Control

Armin Lederer, Jonas Umlauft, Sandra Hirche

Neural Information Processing SystemsFeb-15-2026, 09:44:41 GMT

Key to the application of such models in safety-critical domains is the quantification of their model error. Gaussian processes provide such a measure anduniform error bounds havebeen derived,which allowsafe control based on thesemodels.

artificial intelligence, machine learning, process regression, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Estimating the Error of Large Language Models at Pairwise Text Comparison

Li, Tianyi

arXiv.org Artificial IntelligenceOct-28-2025

We measure LLMs' output error at pairwise text comparison, noting the probability of error in their preferences. Our method does not rely on the ground truth and supports two scenarios: (i) uniform error rate regardless of the order of comparison, estimated with two comparisons for each text pair with either text placed first; (ii) binary positional bias assuming distinct error rates for the two orders of comparison, estimated with repeated comparisons between the texts. The Copeland counting constructs a ranking over the compared texts from pairwise preferences; the ranking reveals the poor scalability of LLM-based pairwise comparison and helps yield the estimates for LLMs' error rates. We apply the method to six LLMs (ChatGPT, Claude, DeepSeek, Gemini, Grok, Qwen) with five types of text input and obtain consistent estimates of LLMs' error. In general, the measured two positional bias terms are similar, close to the uniform error. Considering both the error rates and the robustness to the variation of prompts, Claude obtained the most desirable performance in this experiment. Our model outperforms the biased Bradley-Terry model and the commutativity score in indicating LLMs' error at this task.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2510.22219

Country: North America > United States (1.00)

Genre: Research Report (0.81)

Industry:

Transportation (0.68)
Government > Regional Government > North America Government > United States Government (0.68)
Health & Medicine (0.67)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Uniform Error Bounds for Gaussian Process Regression with Application to Safe Control

Armin Lederer, Jonas Umlauft, Sandra Hirche

Neural Information Processing SystemsAug-20-2025, 11:22:36 GMT

Neural Information Processing Systems http://nips.cc/

gaussian process, theorem 3, uniform error, (12 more...)

Neural Information Processing Systems

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Add feedback

Thank you for raising the interesting question on the conditions for asymptotic

Neural Information Processing SystemsAug-20-2025, 11:22:21 GMT

This is achieved e.g. if a constant fraction of all samples lies on the point Theorem 3.3 by reformulating lines 190-191 as follows: "Furthermore, consider an infinite data stream of observations ". Making Theorem 3.3 quantitative as suggested by Reviewer #2 Although unbounded, they grow slow enough to allow the proof of Theorem 3.3 such that the main We will add a brief discussion on this in the updated paper. Reviewer #1 pointed out, that Assumption 3.1. Therefore, Assumption 3.1 is valid for our experimental setup. We will include the given reasoning in the updated paper.

interesting question, reviewer, theorem 3, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.49)

Add feedback

An Analysis of Safety Guarantees in Multi-Task Bayesian Optimization

Luebsen, Jannis O., Eichler, Annika

arXiv.org Artificial IntelligenceMar-17-2025

--This paper addresses the integration of additional information sources into a Bayesian optimization framework while ensuring that safety constraints are satisfied. The interdependencies between these information sources are modeled using an unknown correlation matrix. We explore how uniform error bounds must be adjusted to maintain constraint satisfaction throughout the optimization process, considering both Bayesian and frequentist statistical perspectives. This is achieved by appropriately scaling the error bounds based on a confidence interval that can be estimated from the data. Furthermore, the efficacy of the proposed approach is demonstrated through experiments on two benchmark functions and a controller parameter optimization problem. Our results highlight a significant improvement in sample efficiency, demonstrating the method's suitability for optimizing expensive-to-evaluate functions. Many practical optimization problems can be formulated as the optimization of a black-box function, e. g., because of their complex underlying physics or the requirement of impractical identification processes. Black-box optimization algorithms bypass the need of models for optimizations. In essence, these algorithms sequentially evaluate the black-box function for some input while reducing the cost. In the last decade, Bayesian optimization (BO) has emerged as a promising method for solving exactly this set of problems. This method involves constructing a probabilistic surrogate model of an arbitrary objective function with minimal assumptions. The utilization of Gaussian processes (GPs) enables the incorporation of prior knowledge about the objective function, making BO particularly well-suited for scenarios where function evaluations are costly and observations may be noisy. As a simple example of BO, consider the optimization of a PID controller for unit step reference tracking, where the plant dynamics are unknown. A potential cost function that measures tracking accuracy could be the mean-squared error of the plant output and the step reference for a designated time window. The black-box function is now the function that maps the PID parameters to the image of the cost function. An evaluation corresponds to running the step response of the system with the specified PID parameters.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2503.08555

Country:

Europe > Germany > Hamburg (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
(2 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Add feedback

Learning Best-in-Class Policies for the Predict-then-Optimize Framework

Huang, Michael, Gupta, Vishal

arXiv.org Artificial IntelligenceFeb-8-2024

We propose a novel family of decision-aware surrogate losses, called Perturbation Gradient (PG) losses, for the predict-then-optimize framework. These losses directly approximate the downstream decision loss and can be optimized using off-the-shelf gradient-based methods. Importantly, unlike existing surrogate losses, the approximation error of our PG losses vanishes as the number of samples grows. This implies that optimizing our surrogate loss yields a best-in-class policy asymptotically, even in misspecified settings. This is the first such result in misspecified settings and we provide numerical evidence confirming our PG losses substantively outperform existing proposals when the underlying model is misspecified and the noise is not centrally symmetric. Insofar as misspecification is commonplace in practice -- especially when we might prefer a simpler, more interpretable model -- PG losses offer a novel, theoretically justified, method for computationally tractable decision-aware learning.

learning best-in-class policy, pg loss, theorem 3, (16 more...)

arXiv.org Artificial Intelligence

2402.03256

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > New York > New York County > New York City (0.04)
Europe > Italy > Apulia > Bari (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

Add feedback

Regret Optimality of GP-UCB

Wang, Wenjia, Zhang, Xiaowei, Zou, Lu

arXiv.org Machine LearningDec-3-2023

Gaussian Process Upper Confidence Bound (GP-UCB) is one of the most popular methods for optimizing black-box functions with noisy observations, due to its simple structure and superior performance. Its empirical successes lead to a natural, yet unresolved question: Is GP-UCB regret optimal? In this paper, we offer the first generally affirmative answer to this important open question in the Bayesian optimization literature. We establish new upper bounds on both the simple and cumulative regret of GP-UCB when the objective function to optimize admits certain smoothness property. These upper bounds match the known minimax lower bounds (up to logarithmic factors independent of the feasible region's dimensionality) for optimizing functions with the same smoothness. Intriguingly, our findings indicate that, with the same level of exploration, GP-UCB can simultaneously achieve optimality in both simple and cumulative regret. The crux of our analysis hinges on a refined uniform error bound for online estimation of functions in reproducing kernel Hilbert spaces. This error bound, which we derive from empirical process theory, is of independent interest, and its potential applications may reach beyond the scope of this study.

artificial intelligence, machine learning, optimization problem, (16 more...)

arXiv.org Machine Learning

2312.01386

Country:

Asia > China > Hong Kong (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)

Genre: Research Report > New Finding (0.87)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Add feedback

Uniform Error and Posterior Variance Bounds for Gaussian Process Regression with Application to Safe Control

Lederer, Armin, Umlauft, Jonas, Hirche, Sandra

arXiv.org Machine LearningJan-13-2021

In application areas where data generation is expensive, Gaussian processes are a preferred supervised learning model due to their high data-efficiency. Particularly in model-based control, Gaussian processes allow the derivation of performance guarantees using probabilistic model error bounds. To make these approaches applicable in practice, two open challenges must be solved i) Existing error bounds rely on prior knowledge, which might not be available for many real-world tasks. (ii) The relationship between training data and the posterior variance, which mainly drives the error bound, is not well understood and prevents the asymptotic analysis. This article addresses these issues by presenting a novel uniform error bound using Lipschitz continuity and an analysis of the posterior variance function for a large class of kernels. Additionally, we show how these results can be used to guarantee safe control of an unknown dynamical system and provide numerical illustration examples.

gaussian process, kernel, posterior variance, (12 more...)

arXiv.org Machine Learning

2101.05328

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry: Energy (0.46)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.34)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.34)

Add feedback

Multiclass Classification via Class-Weighted Nearest Neighbors

Khim, Justin, Xu, Ziyu, Singh, Shashank

arXiv.org Machine LearningApr-9-2020

Classification is a fundamental problem in statistics and machine learning that arises in many scientific and engineering problems. Scientific applications include identifying plant and animal species from body measurements, determining cancer types based on gene expression, and satellite image processing (Fisher, 1936, 1938; Khan et al., 2001; Lee et al., 2004); in modern engineering contexts, credit card fraud detection, handwritten digit recognition, word sense disambiguation, and object detection in images are all examples of classification tasks. These applications have brought two new challenges: multiclass classification with a potentially large number of classes and imbalanced data. For example, in online retailing, websites have hundreds of thousands or millions of products, and they may like to categorize these products within a preexisting taxonomy based on product descriptions (Lin et al., 2018). While the number of classes alone makes the problem difficult, an added difficulty with text data is that it is usually highly imbalanced, meaning that a few classes may constitute a large fraction of the data while many classes have only a few examples. In fact, Feldman (2019) notes that if the data follows the classical Zipf distribution for text data (Zipf, 1936), i.e., the class probabilities satisfy a power-law distribution, then up to 35% of seen examples may appear only once in the training data. Additionally, natural image data also seems to have the problems of many classes and imbalanced data (Salakhutdinov et al., 2011; Zhu et al., 2014). Focusing on the problem of imbalanced data, researchers have found that a few heuristics help "do better," and the most principled and studied of these is weighting. There are a number of forms of weighting; we consider the most basic in which we incur a loss of weight for misclassifying an example of class and refer to this method as class-weighting.

classification, prec, probability, (15 more...)

arXiv.org Machine Learning

2004.04715

Country: