AITopics | ensemble member

Knowledge Distillation of Uncertainty using Deep Latent Factor Model

Neural Information Processing Systems, Jun-18-2026, 20:38:01 GMT

Deep ensembles deliver state-of-the-art, reliable uncertainty quantification, but their heavy computational and memory requirements hinder practical deployment in real applications such as on-device AI. Knowledge distillation compresses an ensemble into small student models, but existing techniques struggle to preserve uncertainty, partly because reducing the size of DNNs typically results in variation reduction. To resolve this limitation, we introduce a new method of distribution distillation (i.e.

distillation, machine learning, natural language, (20 more...)

Genre: Research Report > Experimental Study (0.93)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.67)
(2 more...)

466473650870501e3600d9a1b4ee5d44-Paper.pdf

Neural Information Processing Systems, Apr-25-2026, 16:36:20 GMT

artificial intelligence, machine learning, perturbation, (17 more...)

Country: Asia > South Korea (0.28)

Industry: Education (0.94)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Functional Ensemble Distillation

Neural Information Processing Systems, Apr-24-2026, 12:47:27 GMT

Bayesian models have many desirable properties, most notably their ability to generalize from limited data and to properly estimate the uncertainty in their predictions. However, these benefits come at a steep computational cost, as Bayesian inference is, in most cases, computationally intractable. One popular approach to alleviate this problem is Monte Carlo estimation with an ensemble of models sampled from the posterior. However, this approach still comes at a significant computational cost, as one needs to store and run multiple models at test time. In this work, we investigate how to best distill an ensemble's predictions using an efficient model.
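To make the Monte Carlo step concrete, here is a minimal sketch of how an ensemble's predictions are commonly pooled and how its uncertainty is often split into a data-driven part and a disagreement part. The entropy-based split is a standard construction and is an assumption here, not something the abstract specifies; the function names `mc_predictive` and `uncertainty_split` are hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of one categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def mc_predictive(member_probs):
    """Monte Carlo estimate of the predictive distribution for one input:
    the plain average of the class probabilities of all ensemble members."""
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    return [sum(p[c] for p in member_probs) / n_members for c in range(n_classes)]

def uncertainty_split(member_probs):
    """Return (total, expected, disagreement) for one input.

    total        = entropy of the averaged prediction
    expected     = average entropy of the individual members
    disagreement = total - expected, which is zero when all members agree
    """
    total = entropy(mc_predictive(member_probs))
    expected = sum(entropy(p) for p in member_probs) / len(member_probs)
    return total, expected, total - expected

if __name__ == "__main__":
    # Two members that each hedge equally: all uncertainty is in "expected".
    print(uncertainty_split([[0.5, 0.5], [0.5, 0.5]]))
    # Two confident members that contradict each other: all of it is disagreement.
    print(uncertainty_split([[1.0, 0.0], [0.0, 1.0]]))
```

Both toy inputs give the same averaged prediction, yet only the second shows member disagreement. A single model trained to match the average alone would lose that distinction, which is one reason every member has to be stored and run at test time.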

Stochastic Multiple Choice Learning for Training Diverse Deep Ensembles

Stefan Lee, Senthil Purushwalkam Shiva Prakash, Michael Cogswell, Viresh Ranjan, David Crandall, Dhruv Batra

Neural Information Processing Systems, Mar-23-2026, 03:37:22 GMT

Many practical perception systems exist within larger processes that include interactions with users or additional components capable of evaluating the quality of predicted solutions. In these contexts, it is beneficial to provide these oracle mechanisms with multiple highly likely hypotheses rather than a single prediction. In this work, we pose the task of producing multiple outputs as a learning problem over an ensemble of deep networks, introducing a novel approach based on stochastic gradient descent to minimize the loss with respect to an oracle. Our method is simple to implement, agnostic to both architecture and loss function, and parameter-free. Our approach achieves lower oracle error compared to existing methods on a wide range of tasks and deep architectures. We also show qualitatively that the diverse solutions produced often provide interpretable representations of task ambiguity.
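The oracle loss mentioned in the abstract can be illustrated with a short sketch. It assumes the usual reading of "loss with respect to an oracle": each example is scored by its best ensemble member, and only that member would receive the example's gradient in a training step. The helper names below are hypothetical, and the per-member losses are assumed to have been computed already by whatever loss function is in use.

```python
def oracle_loss(member_losses):
    """Return (winner index, loss) of the best ensemble member on one example."""
    winner = min(range(len(member_losses)), key=lambda m: member_losses[m])
    return winner, member_losses[winner]

def winner_take_all_masks(batch_losses):
    """For each example, mark with 1.0 the single member that would receive
    that example's gradient; every other member gets 0.0."""
    masks = []
    for member_losses in batch_losses:
        winner, _ = oracle_loss(member_losses)
        masks.append([1.0 if m == winner else 0.0
                      for m in range(len(member_losses))])
    return masks

def batch_oracle_loss(batch_losses):
    """Average, over the batch, of the loss of each example's best member."""
    return sum(oracle_loss(losses)[1] for losses in batch_losses) / len(batch_losses)

if __name__ == "__main__":
    # Rows are examples, columns are ensemble members (illustrative values).
    batch = [[0.9, 0.2, 0.5],
             [0.1, 0.4, 0.3]]
    print(winner_take_all_masks(batch))
    print(batch_oracle_loss(batch))
```

In an actual training loop, the masks would multiply the per-member losses before backpropagation, so each member specializes on the examples it already handles best. Because the selection depends only on loss values, the sketch is independent of architecture and of the loss function, in line with the abstract's claim.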

artificial intelligence, ensemble, machine learning, (17 more...)

Country: North America > United States (0.69)

Industry:

Education (0.86)
Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

JUCAL: Jointly Calibrating Aleatoric and Epistemic Uncertainty in Classification Tasks

Jakob Heiss, Sören Lambrecht, Jakob Weissteiner, Hanna Wutte, Žan Žurič, Josef Teichmann, Bin Yu

arXiv.org Machine Learning, Feb-24-2026

We study post-calibration uncertainty for trained ensembles of classifiers. Specifically, we consider both aleatoric (label noise) and epistemic (model) uncertainty. Among the most popular and widely used calibration methods in classification are temperature scaling (i.e., pool-then-calibrate) and conformal methods. However, the main shortcoming of these calibration methods is that they do not balance the proportion of aleatoric and epistemic uncertainty. Not balancing these uncertainties can severely misrepresent predictive uncertainty, leading to overconfident predictions in some input regions and underconfident predictions in others. To address this shortcoming, we present a simple but powerful calibration algorithm, Joint Uncertainty Calibration (JUCAL), that jointly calibrates aleatoric and epistemic uncertainty. JUCAL jointly calibrates two constants to weight and scale epistemic and aleatoric uncertainties by optimizing the negative log-likelihood (NLL) on the validation/calibration dataset. JUCAL can be applied to any trained ensemble of classifiers (e.g., transformers, CNNs, or tree-based methods), with minimal computational overhead, without requiring access to the models' internal parameters. We experimentally evaluate JUCAL on various text classification tasks, for ensembles of varying sizes and with different ensembling strategies. Our experiments show that JUCAL significantly outperforms SOTA calibration methods across all considered classification tasks, reducing NLL and predictive set size by up to 15% and 20%, respectively. Interestingly, even applying JUCAL to an ensemble of size 5 can outperform temperature-scaled ensembles of size up to 50 in terms of NLL and predictive set size, resulting in up to 10 times smaller inference costs. Thus, we propose JUCAL as a new go-to method for calibrating ensembles in classification.
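For orientation, the temperature-scaling baseline named in the abstract (pool-then-calibrate) can be sketched as follows. This is not JUCAL itself, whose two-constant formulation the abstract does not spell out. It shows only the single-constant baseline: the already-pooled ensemble probabilities are rescaled by one temperature chosen to minimize NLL on held-out data. The grid search and all function names are assumptions made for illustration.

```python
import math

def temper(probs, temperature):
    """Divide log-probabilities by the temperature and renormalize.
    temperature > 1 softens the distribution, temperature < 1 sharpens it."""
    logits = [math.log(max(p, 1e-12)) / temperature for p in probs]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mean_nll(pooled, labels, temperature):
    """Average negative log-likelihood of the true labels after tempering."""
    return -sum(math.log(temper(p, temperature)[y])
                for p, y in zip(pooled, labels)) / len(labels)

def fit_temperature(pooled, labels, grid):
    """Pick the temperature from `grid` with the lowest validation NLL."""
    return min(grid, key=lambda t: mean_nll(pooled, labels, t))

if __name__ == "__main__":
    # Pooled ensemble predictions that are 99% confident but right only 3 times in 4.
    pooled = [[0.99, 0.01]] * 4
    labels = [0, 0, 0, 1]
    best = fit_temperature(pooled, labels, [0.5, 1.0, 2.0, 4.0, 8.0])
    print(best, mean_nll(pooled, labels, 1.0), mean_nll(pooled, labels, best))
```

A single temperature rescales the pooled prediction as a whole, so it cannot change how much of the reported uncertainty comes from member disagreement versus per-member uncertainty. That is the limitation the abstract attributes to this family of methods.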

large language model, machine learning, natural language, (22 more...)

2602.20153

Country:

North America > United States (0.46)
Europe > Switzerland > Zürich > Zürich (0.04)

Genre: Research Report > New Finding (0.93)

Industry:

Government (0.45)
Health & Medicine (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
(2 more...)

65cc2c8205a05d7379fa3a6386f710e1-Supplemental.pdf

Neural Information Processing Systems, Feb-19-2026, 03:52:48 GMT

architecture, experiment, information, (16 more...)

Country: North America > United States > Colorado > Boulder County > Boulder (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Sharpness-diversity tradeoff: improving flat ensembles with SharpBalance

Neural Information Processing Systems, Feb-18-2026, 18:02:44 GMT

Building on this, our study investigates the interplay between sharpness and diversity within deep ensembles, illustrating their crucial role in robust generalization to both in-distribution (ID) and out-of-distribution (OOD) data.

artificial intelligence, ensemble, machine learning, (19 more...)

Country: