AITopics | dirichlet distribution

Collaborating Authors

dirichlet distribution

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Courtroom Analogy: New Perspective on Uncertainty-Aware Classification

Yoon, Taeseong, Kim, Heeyoung

arXiv.org Machine LearningMay-26-2026

Single-pass uncertainty quantification (UQ) methods for classification represent uncertainty by predicting a tractable distribution over the class probability vector. While existing approaches primarily focus on enhancing the expressiveness of this distribution, they often provide limited insight into how predictive uncertainty is structured and aggregated, resulting in weak interpretability. We introduce the courtroom analogy, which conceptualizes uncertainty-aware classification as a structured debate among class-specific advocates. Each advocate forms a probabilistic opinion, and a final verdict is reached by aggregating these opinions using input-dependent plausibility weights. In this framework, each advocate's opinion is modeled as a Dirichlet distribution whose concentration parameter is decomposed into shared evidence and class-specific advocacy. This yields a structured mixture of Dirichlet distributions with semantically interpretable parameters. To instantiate this formulation, we propose Mixture of Dirichlet EXperts (MoDEX), a single-pass neural architecture that predicts the courtroom parameters, enabling efficient and expressive UQ while explicitly modeling uncertainty aggregation. We demonstrate that MoDEX enjoys strong theoretical properties and achieves state-of-the-art UQ performance across diverse benchmarks, yielding interpretable uncertainty estimates with meaningful semantics.

artificial intelligence, machine learning, modex, (14 more...)

arXiv.org Machine Learning

2605.25616

Country: Asia > South Korea (0.28)

Genre: Research Report > New Finding (0.93)

Industry: Law > Litigation (0.96)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Dirichlet-Based Monte Carlo Dropout for Uncertainty Estimation in Neural Networks

Hoblos, Rouaa, Dridi, Noura, Zerhouni, Noureddine, Masry, Zeina Al

arXiv.org Machine LearningMay-25-2026

Traditional neural networks provide deterministic predictions without inherent uncertainty estimates. While Bayesian Neural Networks (BNNs) offer a principled approach to uncertainty quantification, their computational complexity limits scalability. Monte Carlo (MC) Dropout, initially introduced as a regularization technique, has been shown to approximate Bayesian inference by enabling probabilistic modeling through multiple stochastic forward passes. In this work, we enhance uncertainty estimation in deep learning by integrating a Dirichlet-based framework within MC Dropout. Specifically, we leverage the formulation proposed by Sensoy et al. (2018), where class probabilities are modeled using a Dirichlet distribution, allowing for a more informative uncertainty representation. The proposed approach maintains the computational efficiency of MC Dropout while improving the quality of uncertainty estimates. We discuss the theoretical foundations of our method and compare it with existing uncertainty quantification techniques. The results highlight the effectiveness of the proposed method in producing well-calibrated uncertainty estimates, offering a practical solution for uncertainty-aware deep learning models.

artificial intelligence, bayesian inference, machine learning, (15 more...)

arXiv.org Machine Learning

2605.23635

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.76)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)

Add feedback

Optimistic Posterior Sampling for Reinforcement Learning with Few Samples and Tight Guarantees

Neural Information Processing SystemsApr-25-2026, 20:00:13 GMT

We consider reinforcement learning in an environment modeled by an episodic, finite, stage-dependent Markov decision process of horizon H with S states, and A actions. The performance of an agent is measured by the regret after interacting with the environment for T episodes. We propose an optimistic posterior sampling algorithm for reinforcement learning (OPSRL), a simple variant of posterior sampling that only needs a number of posterior samples logarithmic in H, S, A, and T per state-action pair.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

Neural Information Processing Systems

Country: North America > United States > New York (0.28)

Genre:

Research Report (0.47)
Workflow (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Add feedback

08f90c1a417155361a5c4b8d297e0d78-Supplemental.pdf

Neural Information Processing SystemsApr-24-2026, 14:13:50 GMT

Now consider a perturbation of the prior distribution over transition functions δ: T R 0 such that R Tp δ(Tp)P(Tp|h0)dTp = 1. Proof: Proposition 2 directly extends Proposition 1 in [8] to BAMDPs. Therefore, the perturbed distribution over histories is also a valid probability distribution. Provided that cbo is chosen appropriately (details in the appendix), as the number of perturbations expanded approaches, a perturbation within any > 0 of the optimal perturbation will be expanded by the Bayesian optimisation procedure with probability 1 δ. Proof: Consider an adversary decision node, v, associated with augmented state (s,ha,y) in the BACVaR-SG. We begin by proving that Q((s,ha,y),ξ) is continuous with respect to ξ. Define a function d: S R, such that ξ + d produces a valid adversary perturbation.

artificial intelligence, machine learning, perturbation, (16 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)

Add feedback

Details

Neural Information Processing SystemsApr-24-2026, 12:47:31 GMT

To keep experiments uniform, for all datasets (STL-10, CIFAR-10, and CIFAR-100) we used a train/val/test partitioning. In our experiments we compared FED with four baselines. For all baselines we tried different learning rates [0.1, 0.01, 0.001] and batch sizes [32, 64, 100]. For EnDD and EnDD + AUX, we used the same temperature, temperature annealing, and optimizer that was used in the original paper. For AMT, we tried different alphas [1e1, 1e3, 1e5] and kept the rest as the original paper.

agreementece, artificial intelligence, machine learning, (16 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Functional Ensemble Distillation

Neural Information Processing SystemsApr-24-2026, 12:47:27 GMT

Bayesian models have many desirable properties, most notable is their ability to generalize from limited data and to properly estimate the uncertainty in their predictions. However, these benefits come at a steep computational cost as Bayesian inference, in most cases, is computationally intractable. One popular approach to alleviate this problem is using a Monte-Carlo estimation with an ensemble of models sampled from the posterior. However, this approach still comes at a significant computational cost, as one needs to store and run multiple models at test time. In this work, we investigate how to best distill an ensemble's predictions using an efficient model.

Add feedback

0415740eaa4d9decbc8da001d3fd805f-Supplemental.pdf

Neural Information Processing SystemsApr-24-2026, 11:08:55 GMT

artificial intelligence, machine learning, test error, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Learning Stochastic Majority Votes by Minimizing a PAC-Bayes Generalization Bound

Neural Information Processing SystemsApr-24-2026, 11:08:51 GMT

We investigate a stochastic counterpart of majority votes over finite ensembles of classifiers, and study its generalization properties. While our approach holds for arbitrary distributions, we instantiate it with Dirichlet distributions: this allows for a closed-form and differentiable expression for the expected risk, which then turns the generalization bound into a tractable training objective. The resulting stochastic majority vote learning algorithm achieves state-of-the-art accuracy and benefits from (non-vacuous) tight generalization bounds, in a series of numerical experiments when compared to competing algorithms which also minimize PACBayes objectives - both with uninformed (data-independent) and informed (datadependent) priors.

artificial intelligence, classifier, machine learning, (15 more...)

Neural Information Processing Systems

Country: North America (0.28)

Genre:

Research Report (0.68)
Instructional Material (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)

Add feedback

Ensemble-Based Dirichlet Modeling for Predictive Uncertainty and Selective Classification

Franzen, Courtney, Pourkamali-Anaraki, Farhad

arXiv.org Machine LearningApr-8-2026

Neural network classifiers trained with cross-entropy loss achieve strong predictive accuracy but lack the capability to provide inherent predictive uncertainty estimates, thus requiring external techniques to obtain these estimates. In addition, softmax scores for the true class can vary substantially across independent training runs, which limits the reliability of uncertainty-based decisions in downstream tasks. Evidential Deep Learning aims to address these limitations by producing uncertainty estimates in a single pass, but evidential training is highly sensitive to design choices including loss formulation, prior regularization, and activation functions. Therefore, this work introduces an alternative Dirichlet parameter estimation strategy by applying a method of moments estimator to ensembles of softmax outputs, with an optional maximum-likelihood refinement step. This ensemble-based construction decouples uncertainty estimation from the fragile evidential loss design while also mitigating the variability of single-run cross-entropy training, producing explicit Dirichlet predictive distributions. Across multiple datasets, we show that the improved stability and predictive uncertainty behavior of these ensemble-derived Dirichlet estimates translate into stronger performance in downstream uncertainty-guided applications such as prediction confidence scoring and selective classification.

artificial intelligence, machine learning, prediction, (18 more...)

arXiv.org Machine Learning

2604.06032

Country: