AITopics | Bayesian Inference

Collaborating Authors

Bayesian Inference

Bayes' Theorem allows a program to infer the probabilities of likely causes from the probabilities of their effects, when what it is given are the probabilities of effects, given the causes.

News Overviews Instructional Materials AI-Alerts Classics

Instance Based Approximations to Profile Maximum Likelihood

Neural Information Processing SystemsJan-13-2025, 20:17:09 GMT

In this paper we provide a new efficient algorithm for approximately computing the profile maximum likelihood (PML) distribution, a prominent quantity in symmetric property estimation. We provide an algorithm which matches the previous best known efficient algorithms for computing approximate PML distributions and improves when the number of distinct observed frequencies in the given instance is small. We achieve this result by exploiting new sparsity structure in approximate PML distributions and providing a new matrix rounding algorithm, of independent interest. Leveraging this result, we obtain the first provable computationally efficient implementation of PseudoPML, a general framework for estimating a broad class of symmetric properties. Additionally, we obtain efficient PML-based estimators for distributions with small profile entropy, a natural instance-based complexity measure.

algorithm, approximation, profile maximum likelihood, (3 more...)

Neural Information Processing Systems

Country: Asia > Afghanistan > Parwan Province > Charikar (0.10)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Add feedback

Fast sampling and model selection for Bayesian mixture models

Newman, M. E. J.

arXiv.org Machine LearningJan-13-2025

We describe two Monte Carlo algorithms for sampling from the integrated posterior distributions of a range of Bayesian mixture models. Both algorithms allow us to directly sample not only the assignment of observations to components but also the number of components, thereby fitting the model and performing model selection over the number of components in a single computation. The first algorithm is a traditional collapsed Gibbs sampler, albeit with an unusual move-set; the second builds on the first, adding rejection-free sampling from the prior over component assignments, to create an algorithm that has excellent mixing time in typical applications and outperforms current state-of-the-art methods, in some cases by a wide margin. We demonstrate our methods with a selection of applications to latent class analysis.

artificial intelligence, bayesian inference, machine learning, (19 more...)

arXiv.org Machine Learning

2501.07668

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
North America > United States > New York (0.04)
South America > Paraguay > Asunción > Asunción (0.04)
(8 more...)

Genre: Research Report (0.83)

Industry:

Law (1.00)
Health & Medicine > Therapeutic Area (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

Add feedback

Impatient Bandits: Optimizing for the Long-Term Without Delay

Zhang, Kelly W., Baldwin-McDonald, Thomas, Ciosek, Kamil, Maystre, Lucas, Russo, Daniel

arXiv.org Machine LearningJan-13-2025

Increasingly, recommender systems are tasked with improving users' long-term satisfaction. In this context, we study a content exploration task, which we formalize as a bandit problem with delayed rewards. There is an apparent trade-off in choosing the learning signal: waiting for the full reward to become available might take several weeks, slowing the rate of learning, whereas using short-term proxy rewards reflects the actual long-term goal only imperfectly. First, we develop a predictive model of delayed rewards that incorporates all information obtained to date. Rewards as well as shorter-term surrogate outcomes are combined through a Bayesian filter to obtain a probabilistic belief. Second, we devise a bandit algorithm that quickly learns to identify content aligned with long-term success using this new predictive model. We prove a regret bound for our algorithm that depends on the \textit{Value of Progressive Feedback}, an information theoretic metric that captures the quality of short-term leading indicators that are observed prior to the long-term reward. We apply our approach to a podcast recommendation problem, where we seek to recommend shows that users engage with repeatedly over two months. We empirically validate that our approach significantly outperforms methods that optimize for short-term proxies or rely solely on delayed rewards, as demonstrated by an A/B test in a recommendation system that serves hundreds of millions of users.

algorithm, progressive feedback, recommendation, (13 more...)

arXiv.org Machine Learning

2501.07761

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Asia > Singapore (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry:

Education (0.92)
Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.87)
(2 more...)

Add feedback

BayesAdapter: enhanced uncertainty estimation in CLIP few-shot adaptation

Morales-Álvarez, Pablo, Christodoulidis, Stergios, Vakalopoulou, Maria, Piantanida, Pablo, Dolz, Jose

arXiv.org Artificial IntelligenceJan-13-2025

The emergence of large pre-trained vision-language models (VLMs) represents a paradigm shift in machine learning, with unprecedented results in a broad span of visual recognition tasks. CLIP, one of the most popular VLMs, has exhibited remarkable zero-shot and transfer learning capabilities in classification. To transfer CLIP to downstream tasks, adapters constitute a parameter-efficient approach that avoids backpropagation through the large model (unlike related prompt learning methods). However, CLIP adapters have been developed to target discriminative performance, and the quality of their uncertainty estimates has been overlooked. In this work we show that the discriminative performance of state-of-the-art CLIP adapters does not always correlate with their uncertainty estimation capabilities, which are essential for a safe deployment in real-world scenarios. We also demonstrate that one of such adapters is obtained through MAP inference from a more general probabilistic framework. Based on this observation we introduce BayesAdapter, which leverages Bayesian inference to estimate a full probability distribution instead of a single point, better capturing the variability inherent in the parameter space. In a comprehensive empirical evaluation we show that our approach obtains high quality uncertainty estimates in the predictions, standing out in calibration and selective classification. Our code will be publicly available upon acceptance of the paper.

artificial intelligence, cvpr, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2412.09718

Country:

North America > Canada (0.27)
Europe (0.27)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

Add feedback

Compact Bayesian Neural Networks via pruned MCMC sampling

Deo, Ratneel, Sisson, Scott, Webster, Jody M., Chandra, Rohitash

arXiv.org Artificial IntelligenceJan-12-2025

Bayesian Neural Networks (BNNs) offer robust uncertainty quantification in model predictions, but training them presents a significant computational challenge. This is mainly due to the problem of sampling multimodal posterior distributions using Markov Chain Monte Carlo (MCMC) sampling and variational inference algorithms. Moreover, the number of model parameters scales exponentially with additional hidden layers, neurons, and features in the dataset. Typically, a significant portion of these densely connected parameters are redundant and pruning a neural network not only improves portability but also has the potential for better generalisation capabilities. In this study, we address some of the challenges by leveraging MCMC sampling with network pruning to obtain compact probabilistic models having removed redundant parameters. We sample the posterior distribution of model parameters (weights and biases) and prune weights with low importance, resulting in a compact model. We ensure that the compact BNN retains its ability to estimate uncertainty via the posterior distribution while retaining the model training and generalisation performance accuracy by adapting post-pruning resampling. We evaluate the effectiveness of our MCMC pruning strategy on selected benchmark datasets for regression and classification problems through empirical result analysis. We also consider two coral reef drill-core lithology classification datasets to test the robustness of the pruning model in complex real-world datasets. We further investigate if refining compact BNN can retain any loss of performance. Our results demonstrate the feasibility of training and pruning BNNs using MCMC whilst retaining generalisation performance with over 75% reduction in network size. This paves the way for developing compact BNN models that provide uncertainty estimates for real-world applications.

artificial intelligence, machine learning, pruning, (17 more...)

arXiv.org Artificial Intelligence

2501.06962

Country:

North America > United States > New York (0.28)
North America > United States > California (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Energy > Oil & Gas > Upstream (0.93)
Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(2 more...)

Add feedback

Neural Probabilistic Circuits: Enabling Compositional and Interpretable Predictions through Logical Reasoning

Chen, Weixin, Yu, Simon, Shao, Huajie, Sha, Lui, Zhao, Han

arXiv.org Artificial IntelligenceJan-12-2025

End-to-end deep neural networks have achieved remarkable success across various domains but are often criticized for their lack of interpretability. While post hoc explanation methods attempt to address this issue, they often fail to accurately represent these black-box models, resulting in misleading or incomplete explanations. To overcome these challenges, we propose an inherently transparent model architecture called Neural Probabilistic Circuits (NPCs), which enable compositional and interpretable predictions through logical reasoning. In particular, an NPC consists of two modules: an attribute recognition model, which predicts probabilities for various attributes, and a task predictor built on a probabilistic circuit, which enables logical reasoning over recognized attributes to make class predictions. To train NPCs, we introduce a three-stage training algorithm comprising attribute recognition, circuit construction, and joint optimization. Moreover, we theoretically demonstrate that an NPC's error is upper-bounded by a linear combination of the errors from its modules. To further demonstrate the interpretability of NPC, we provide both the most probable explanations and the counterfactual explanations. Empirical results on four benchmark datasets show that NPCs strike a balance between interpretability and performance, achieving results competitive even with those of end-to-end black-box models while providing enhanced interpretability.

machine learning, natural language, prediction, (20 more...)

arXiv.org Artificial Intelligence

2501.07021

Genre: Research Report > New Finding (0.67)

Industry:

Transportation > Infrastructure & Services (0.67)
Transportation > Ground > Road (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Add feedback

Hardware implementation of timely reliable Bayesian decision-making using memristors

Song, Lekai, Liu, Pengyu, Liu, Yang, Pei, Jingfang, Cui, Wenyu, Liu, Songwei, Wen, Yingyi, Ma, Teng, Pun, Kong-Pang, Ng, Leonard W. T., Hu, Guohua

arXiv.org Artificial IntelligenceJan-12-2025

Brains perform decision-making by Bayes theorem. The theorem quantifies events as probabilities and, based on probability rules, renders the decisions. Learning from this, Bayes theorem can be applied to enable efficient user-scene interactions. However, given the probabilistic nature, implementing Bayes theorem in hardware using conventional deterministic computing can incur excessive computational cost and decision latency. Though challenging, here we present a probabilistic computing approach based on memristors to implement the Bayes theorem. We integrate memristors with Boolean logics and, by exploiting the volatile stochastic switching of the memristors, realise probabilistic logic operations, key for hardware Bayes theorem implementation. To empirically validate the efficacy of the hardware Bayes theorem in user-scene interactions, we develop lightweight Bayesian inference and fusion hardware operators using the probabilistic logics and apply the operators in road scene parsing for self-driving, including route planning and obstacle detection. The results show our operators can achieve reliable decisions in less than 0.4 ms (or equivalently 2,500 fps), outperforming human decision-making and the existing driving assistance systems.

artificial intelligence, machine learning, memristor, (17 more...)

arXiv.org Artificial Intelligence

2412.06838

Country: Asia > China > Hong Kong (0.29)

Genre: Research Report > New Finding (0.66)

Industry: Semiconductors & Electronics (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

All AI Models are Wrong, but Some are Optimal

Anand, Akhil S, Sawant, Shambhuraj, Reinhardt, Dirk, Gros, Sebastien

arXiv.org Artificial IntelligenceJan-10-2025

AI models that predict the future behavior of a system (a.k.a. predictive AI models) are central to intelligent decision-making. However, decision-making using predictive AI models often results in suboptimal performance. This is primarily because AI models are typically constructed to best fit the data, and hence to predict the most likely future rather than to enable high-performance decision-making. The hope that such prediction enables high-performance decisions is neither guaranteed in theory nor established in practice. In fact, there is increasing empirical evidence that predictive models must be tailored to decision-making objectives for performance. In this paper, we establish formal (necessary and sufficient) conditions that a predictive model (AI-based or not) must satisfy for a decision-making policy established using that model to be optimal. We then discuss their implications for building predictive AI models for sequential decision-making.

data mining, machine learning, reinforcement learning, (23 more...)

arXiv.org Artificial Intelligence

2501.06086

Country:

Europe > Norway (0.28)
North America > Canada (0.28)

Genre: Research Report (0.40)

Industry:

Energy > Energy Storage (0.46)
Energy > Oil & Gas (0.30)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
(4 more...)

Add feedback

Explainable Federated Bayesian Causal Inference and Its Application in Advanced Manufacturing

Xiao, Xiaofeng, Alharbi, Khawlah, Zhang, Pengyu, Qin, Hantang, Yue, Xubo

arXiv.org Artificial IntelligenceJan-10-2025

Causal inference has recently gained notable attention across various fields like biology, healthcare, and environmental science, especially within explainable artificial intelligence (xAI) systems, for uncovering the causal relationships among multiple variables and outcomes. Yet, it has not been fully recognized and deployed in the manufacturing systems. In this paper, we introduce an explainable, scalable, and flexible federated Bayesian learning framework, \texttt{xFBCI}, designed to explore causality through treatment effect estimation in distributed manufacturing systems. By leveraging federated Bayesian learning, we efficiently estimate posterior of local parameters to derive the propensity score for each client without accessing local private data. These scores are then used to estimate the treatment effect using propensity score matching (PSM). Through simulations on various datasets and a real-world Electrohydrodynamic (EHD) printing data, we demonstrate that our approach outperforms standard Bayesian causal inference methods and several state-of-the-art federated learning benchmarks.

artificial intelligence, bayesian inference, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2501.06077

Genre: Research Report > Experimental Study (0.68)

Industry:

Health & Medicine (1.00)
Information Technology > Security & Privacy (0.89)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Distilling Calibration via Conformalized Credal Inference

Huang, Jiayi, Park, Sangwoo, Paoletti, Nicola, Simeone, Osvaldo

arXiv.org Artificial IntelligenceJan-10-2025

Deploying artificial intelligence (AI) models on edge devices involves a delicate balance between meeting stringent complexity constraints, such as limited memory and energy resources, and ensuring reliable performance in sensitive decision-making tasks. One way to enhance reliability is through uncertainty quantification via Bayesian inference. This approach, however, typically necessitates maintaining and running multiple models in an ensemble, which may exceed the computational limits of edge devices. This paper introduces a low-complexity methodology to address this challenge by distilling calibration information from a more complex model. In an offline phase, predictive probabilities generated by a high-complexity cloud-based model are leveraged to determine a threshold based on the typical divergence between the cloud and edge models. At run time, this threshold is used to construct credal sets -- ranges of predictive probabilities that are guaranteed, with a user-selected confidence level, to include the predictions of the cloud model. The credal sets are obtained through thresholding of a divergence measure in the simplex of predictive probabilities. Experiments on visual and language tasks demonstrate that the proposed approach, termed Conformalized Distillation for Credal Inference (CD-CI), significantly improves calibration performance compared to low-complexity Bayesian methods, such as Laplace approximation, making it a practical and efficient solution for edge AI deployments.

artificial intelligence, machine learning, small-scale model, (14 more...)

arXiv.org Artificial Intelligence

2501.06066

Country: Europe > United Kingdom > England (0.14)

Genre: Research Report (0.82)

Industry:

Education (0.46)
Information Technology > Services (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback