AITopics | error probability

Collaborating Authors

error probability

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

27e9661e033a73a6ad8cefcde965c54d-Supplemental.pdf

Neural Information Processing SystemsApr-25-2026, 04:57:07 GMT

artificial intelligence, decision tree, hypothesis, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

Add feedback

Sequential Audit Sampling with Statistical Guarantees

Kato, Masahiro, Nakagawa, Kei

arXiv.org Machine LearningApr-8-2026

Financial statement auditing is conducted under a risk-based evidence approach to obtain reasonable assurance. In practice, auditors often perform additional sampling or related procedures when an initial sample does not provide a sufficient basis for a conclusion. Across jurisdictions, current standards and practice manuals acknowledge such extensions, while the statistical design of sequential audit procedures has not been fully explored. This study formulates audit sampling with additional, sequentially collected items as a sequential testing problem for a finite population under sampling without replacement. We define null and alternative hypotheses in terms of a tolerable deviation rate, specify stopping and decision rules, and formulate exact sequential boundary conditions in terms of finite-population error probabilities. For practical implementation, we calibrate those boundaries by Monte Carlo simulation at least-favorable deviation rates. The exact design yields ex ante control of decision error probabilities, and the simulation-based implementation approximates that design while allowing the computation of expected stopping times. The framework is most naturally suited to attribute auditing and deviation-rate auditing, especially tests of controls, and it can be extended to one-sided, two-stage, and truncated designs.

artificial intelligence, boundary, machine learning, (16 more...)

arXiv.org Machine Learning

2604.06116

Country:

North America > United States > Oklahoma > Payne County > Cushing (0.04)
Asia > Malaysia (0.04)
Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry: Government > Regional Government > North America Government > United States Government (0.95)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.93)

Add feedback

Improved Algorithms for Collaborative PAC Learning

Huy Nguyen, Lydia Zakynthinou

Neural Information Processing SystemsFeb-12-2026, 14:52:38 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, classifier, ln null 1, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > Canada > Quebec > Montreal (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.83)

Add feedback

Don't Always Pick the Highest-Performing Model: An Information Theoretic View of LLM Ensemble Selection

Turkmen, Yigit, Buyukates, Baturalp, Bastopcu, Melih

arXiv.org Machine LearningFeb-10-2026

Large language models (LLMs) are often ensembled together to improve overall reliability and robustness, but in practice models are strongly correlated. This raises a fundamental question: which models should be selected when forming an LLM ensemble? We formulate budgeted ensemble selection as maximizing the mutual information between the true label and predictions of the selected models. Furthermore, to explain why performance can saturate even with many models, we model the correlated errors of the models using Gaussian-copula and show an information-theoretic error floor for the performance of the ensemble. Motivated by these, we propose a simple greedy mutual-information selection algorithm that estimates the required information terms directly from data and iteratively builds an ensemble under a query budget. We test our approach in two question answering datasets and one binary sentiment classification dataset: MEDMCQA, MMLU, and IMDB movie reviews. Across all datasets, we observe that our method consistently outperforms strong baselines under the same query budget.

large language model, machine learning, submission and formatting instruction, (19 more...)

arXiv.org Machine Learning

2602.08003

Country:

North America > United States > Illinois (0.04)
North America > United States > Oregon > Multnomah County > Portland (0.04)
North America > United States > Florida > Miami-Dade County > Miami (0.04)
(6 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

1e2dd2f1efbc6e65b68f17ce6e158b34-Paper-Conference.pdf

Neural Information Processing SystemsFeb-9-2026, 03:14:41 GMT

estimator, query, response time, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Florida > Broward County > Fort Lauderdale (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology (0.92)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Robots (0.93)
(3 more...)

Add feedback

MinimaxOptimalFixed-BudgetBestArm IdentificationinLinearBandits

Neural Information Processing SystemsFeb-8-2026, 21:58:04 GMT

Complementarily, we present a minimax lower bound for this problem.

artificial intelligence, bandit, data mining, (18 more...)

Neural Information Processing Systems

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.48)

Add feedback

672cf3025399742b1a047c8dc6b1e992-AuthorFeedback.pdf

Neural Information Processing SystemsFeb-8-2026, 17:25:40 GMT

We would like to express our sincere gratitude to the reviewers for providing their valuable feedback. This generalization will be added to the revision. We will clarify this point together with further experiments on purely real datasets in a revision. This can readily be obtained by [39, 40] which do not exploit the hierarchical structure. We will provide this discussion in a revision.

artificial intelligence, rating vector, revision, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.31)

Add feedback

A Distribution Testing Approach to Clustering Distributions

Kumar, Gunjan, Pote, Yash, Scarlett, Jonathan

arXiv.org Machine LearningDec-10-2025

We study the following distribution clustering problem: Given a hidden partition of $k$ distributions into two groups, such that the distributions within each group are the same, and the two distributions associated with the two clusters are $\varepsilon$-far in total variation, the goal is to recover the partition. We establish upper and lower bounds on the sample complexity for two fundamental cases: (1) when one of the cluster's distributions is known, and (2) when both are unknown. Our upper and lower bounds characterize the sample complexity's dependence on the domain size $n$, number of distributions $k$, size $r$ of one of the clusters, and distance $\varepsilon$. In particular, we achieve tightness with respect to $(n,k,r,\varepsilon)$ (up to an $O(\log k)$ factor) for all regimes.

complexity, probability, sample complexity, (17 more...)

arXiv.org Machine Learning

2512.08376

Country: Asia > Singapore (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.34)

Add feedback

A Game-Theoretic Approach for Adversarial Information Fusion in Distributed Sensor Networks

Kallas, Kassem

arXiv.org Artificial IntelligenceDec-1-2025

Every day we share our personal information through digital systems which are constantly exposed to threats. For this reason, security-oriented disciplines of signal processing have received increasing attention in the last decades: multimedia forensics, digital watermarking, biometrics, network monitoring, steganography and steganalysis are just a few examples. Even though each of these fields has its own peculiarities, they all have to deal with a common problem: the presence of one or more adversaries aiming at making the system fail. Adversarial Signal Processing lays the basis of a general theory that takes into account the impact that the presence of an adversary has on the design of effective signal processing tools. By focusing on the application side of Adversarial Signal Processing, namely adversarial information fusion in distributed sensor networks, and adopting a game-theoretic approach, this thesis contributes to the above mission by addressing four issues. First, we address decision fusion in distributed sensor networks by developing a novel soft isolation defense scheme that protect the network from adversaries, specifically, Byzantines. Second, we develop an optimum decision fusion strategy in the presence of Byzantines. In the next step, we propose a technique to reduce the complexity of the optimum fusion by relying on a novel near-optimum message passing algorithm based on factor graphs. Finally, we introduce a defense mechanism to protect decentralized networks running consensus algorithm against data falsification attacks.

artificial intelligence, detection and information fusion, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2511.23026

Country: