AITopics | Bayesian Inference

Collaborating Authors

Bayesian Inference

Bayes' Theorem allows a program to infer the probabilities of likely causes from the probabilities of their effects, when what it is given are the probabilities of effects, given the causes.

News Overviews Instructional Materials AI-Alerts Classics

A Provable Approach for End-to-End Safe Reinforcement Learning

Wachi, Akifumi, Miyaguchi, Kohei, Tanabe, Takumi, Sato, Rei, Akimoto, Youhei

arXiv.org Artificial IntelligenceMay-29-2025

A longstanding goal in safe reinforcement learning (RL) is a method to ensure the safety of a policy throughout the entire process, from learning to operation. However, existing safe RL paradigms inherently struggle to achieve this objective. We propose a method, called Provably Lifetime Safe RL (PLS), that integrates offline safe RL with safe policy deployment to address this challenge. Our proposed method learns a policy offline using return-conditioned supervised learning and then deploys the resulting policy while cautiously optimizing a limited set of parameters, known as target returns, using Gaussian processes (GPs). Theoretically, we justify the use of GPs by analyzing the mathematical relationship between target and actual returns. We then prove that PLS finds near-optimal target returns while guaranteeing safety with high probability. Empirically, we demonstrate that PLS outperforms baselines both in safety and reward performance, thereby achieving the longstanding goal to obtain high rewards while ensuring the safety of a policy throughout the lifetime from learning to operation.

artificial intelligence, machine learning, reinforcement learning, (12 more...)

arXiv.org Artificial Intelligence

2505.21852

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

ItDPDM: Information-Theoretic Discrete Poisson Diffusion Model

Bhattacharya, Sagnik, Gorle, Abhiram, Bilal, Ahsan, Ding, Connor, Yadav, Amit Kumar Singh, Weissman, Tsachy

arXiv.org Artificial IntelligenceMay-29-2025

Generative modeling of non-negative, discrete data, such as symbolic music, remains challenging due to two persistent limitations in existing methods. Firstly, many approaches rely on modeling continuous embeddings, which is suboptimal for inherently discrete data distributions. Secondly, most models optimize variational bounds rather than exact data likelihood, resulting in inaccurate likelihood estimates and degraded sampling quality. While recent diffusion-based models have addressed these issues separately, we tackle them jointly. In this work, we introduce the Information-Theoretic Discrete Poisson Diffusion Model (ItDPDM), inspired by photon arrival process, which combines exact likelihood estimation with fully discrete-state modeling. Central to our approach is an information-theoretic Poisson Reconstruction Loss (PRL) that has a provable exact relationship with the true data likelihood. ItDPDM achieves improved likelihood and sampling performance over prior discrete and continuous diffusion models on a variety of synthetic discrete datasets. Furthermore, on real-world datasets such as symbolic music and images, ItDPDM attains superior likelihood estimates and competitive generation quality-demonstrating a proof of concept for distribution-robust discrete generative modeling.

artificial intelligence, diffusion model, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2505.05082

Country: North America > United States (1.00)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment (0.67)
Media > Music (0.46)
Transportation > Ground > Road (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Federated Instrumental Variable Analysis via Federated Generalized Method of Moments

Geetika, null, Tyagi, Somya, Chatterjee, Bapi

arXiv.org Machine LearningMay-28-2025

Instrumental variables (IV) analysis is an important applied tool for areas such as healthcare and consumer economics. For IV analysis in high-dimensional settings, the Generalized Method of Moments (GMM) using deep neural networks offers an efficient approach. With non-i.i.d. data sourced from scattered decentralized clients, federated learning is a popular paradigm for training the models while promising data privacy. However, to our knowledge, no federated algorithm for either GMM or IV analysis exists to date. In this work, we introduce federated instrumental variables analysis (FedIV) via federated generalized method of moments (FedGMM). We formulate FedGMM as a federated zero-sum game defined by a federated non-convex non-concave minimax optimization problem, which is solved using federated gradient descent ascent (FedGDA) algorithm. One key challenge arises in theoretically characterizing the federated local optimality. To address this, we present properties and existence results of clients' local equilibria via FedGDA limit points. Thereby, we show that the federated solution consistently estimates the local moment conditions of every participating client. The proposed algorithm is backed by extensive experiments to demonstrate the efficacy of our approach.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Machine Learning

2505.21012

Country:

Asia > Middle East > Jordan (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre:

Research Report > Experimental Study (0.67)
Personal > Honors (0.46)
Research Report > Strength High (0.46)

Industry:

Health & Medicine > Therapeutic Area (0.93)
Information Technology > Security & Privacy (0.86)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.45)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

Add feedback

Causal Posterior Estimation

Dirmeier, Simon, Mira, Antonietta

arXiv.org Machine LearningMay-28-2025

We present Causal Posterior Estimation (CPE), a novel method for Bayesian inference in simulator models, i.e., models where the evaluation of the likelihood function is intractable or too computationally expensive, but where one can simulate model outputs given parameter values. CPE utilizes a normalizing flow-based (NF) approximation to the posterior distribution which carefully incorporates the conditional dependence structure induced by the graphical representation of the model into the neural network. Thereby it is possible to improve the accuracy of the approximation. We introduce both discrete and continuous NF architectures for CPE and propose a constant-time sampling procedure for the continuous case which reduces the computational complexity of drawing samples to O(1) as for discrete NFs. We show, through an extensive experimental evaluation, that by incorporating the conditional dependencies induced by the graphical model directly into the neural network, rather than learning them from data, CPE is able to conduct highly accurate posterior inference either outperforming or matching the state of the art in the field.

artificial intelligence, international conference, machine learning, (15 more...)

arXiv.org Machine Learning

2505.21468

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Italy (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Integral Imprecise Probability Metrics

Chau, Siu Lun, Caprio, Michele, Muandet, Krikamol

arXiv.org Machine LearningMay-28-2025

Quantifying differences between probability distributions is fundamental to statistics and machine learning, primarily for comparing statistical uncertainty. In contrast, epistemic uncertainty (EU) -- due to incomplete knowledge -- requires richer representations than those offered by classical probability. Imprecise probability (IP) theory offers such models, capturing ambiguity and partial belief. This has driven growing interest in imprecise probabilistic machine learning (IPML), where inference and decision-making rely on broader uncertainty models -- highlighting the need for metrics beyond classical probability. This work introduces the Integral Imprecise Probability Metric (IIPM) framework, a Choquet integral-based generalisation of classical Integral Probability Metric (IPM) to the setting of capacities -- a broad class of IP models encompassing many existing ones, including lower probabilities, probability intervals, belief functions, and more. Theoretically, we establish conditions under which IIPM serves as a valid metric and metrises a form of weak convergence of capacities. Practically, IIPM not only enables comparison across different IP models but also supports the quantification of epistemic uncertainty within a single IP model. In particular, by comparing an IP model with its conjugate, IIPM gives rise to a new class of EU measures -- Maximum Mean Imprecision -- which satisfy key axiomatic properties proposed in the Uncertainty Quantification literature. We validate MMI through selective classification experiments, demonstrating strong empirical performance against established EU measures, and outperforming them when classical methods struggle to scale to a large number of classes. Our work advances both theory and practice in IPML, offering a principled framework for comparing and quantifying epistemic uncertainty under imprecision.

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Machine Learning

2505.16156

Country:

Europe > United Kingdom > England > Greater Manchester > Manchester (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(8 more...)

Genre:

Research Report (0.82)
Overview (0.67)

Industry: Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.92)

Add feedback

Semi-supervised Clustering Through Representation Learning of Large-scale EHR Data

Wang, Linshanshan, Li, Mengyan, Xia, Zongqi, Liu, Molei, Cai, Tianxi

arXiv.org Artificial IntelligenceMay-28-2025

Electronic Health Records (EHR) offer rich real-world data for personalized medicine, providing insights into disease progression, treatment responses, and patient outcomes. However, their sparsity, heterogeneity, and high dimensionality make them difficult to model, while the lack of standardized ground truth further complicates predictive modeling. To address these challenges, we propose SCORE, a semi-supervised representation learning framework that captures multi-domain disease profiles through patient embeddings. SCORE employs a Poisson-Adapted Latent factor Mixture (PALM) Model with pre-trained code embeddings to characterize codified features and extract meaningful patient phenotypes and embeddings. To handle the computational challenges of large-scale data, it introduces a hybrid Expectation-Maximization (EM) and Gaussian Variational Approximation (GVA) algorithm, leveraging limited labeled data to refine estimates on a vast pool of unlabeled samples. We theoretically establish the convergence of this hybrid approach, quantify GVA errors, and derive SCORE's error rate under diverging embedding dimensions. Our analysis shows that incorporating unlabeled data enhances accuracy and reduces sensitivity to label scarcity. Extensive simulations confirm SCORE's superior finite-sample performance over existing methods. Finally, we apply SCORE to predict disability status for patients with multiple sclerosis (MS) using partially labeled EHR data, demonstrating that it produces more informative and predictive patient embeddings for multiple MS-related conditions compared to existing approaches.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2505.20731

Genre:

Research Report > Experimental Study (0.46)
Research Report > New Finding (0.46)

Industry:

Health & Medicine > Health Care Technology > Medical Record (0.69)
Health & Medicine > Therapeutic Area > Neurology > Multiple Sclerosis (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Add feedback

Joint Magnetometer-IMU Calibration via Maximum A Posteriori Estimation

Huang, Chuan, Hendeby, Gustaf, Skog, Isaac

arXiv.org Artificial IntelligenceMay-28-2025

This paper presents a new approach for jointly calibrating magnetometers and inertial measurement units, focusing on improving calibration accuracy and computational efficiency. The proposed method formulates the calibration problem as a maximum a posteriori estimation problem, treating both the calibration parameters and orientation trajectory of the sensors as unknowns. This formulation enables efficient optimization with closed-form derivatives. The method is compared against two state-of-the-art approaches in terms of computational complexity and estimation accuracy. Simulation results demonstrate that the proposed method achieves lower root mean square error in calibration parameters while maintaining competitive computational efficiency. Further validation through real-world experiments confirms the practical benefits of our approach: it effectively reduces position drift in a magnetic field-aided inertial navigation system by more than a factor of two on most datasets. Moreover, the proposed method calibrated 30 magnetometers in less than 2 minutes. The contributions include a new calibration method, an analysis of existing methods, and a comprehensive empirical evaluation. Datasets and algorithms are made publicly available to promote reproducible research.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2505.16662

Country: Europe (0.68)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Finite-Sample Maximum Likelihood Estimation of Location

Neural Information Processing SystemsMay-27-2025, 21:35:37 GMT

We consider 1-dimensional location estimation, where we estimate a parameter \lambda from n samples \lambda \eta_i, with each \eta_i drawn i.i.d. For fixed f the maximum-likelihood estimate (MLE) is well-known to be optimal in the limit as n \to \infty: it is asymptotically normal with variance matching the Cramer-Rao lower bound of \frac{1}{n\mathcal{I}}, where \mathcal{I} is the Fisher information of f . However, this bound does not hold for finite n, or when f varies with n . We show for arbitrary f and n that one can recover a similar theory based on the Fisher information of a smoothed version of f, where the smoothing radius decays with n .

artificial intelligence, bayesian inference, machine learning, (3 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Unrolled denoising networks provably learn to perform optimal Bayesian inference

Neural Information Processing SystemsMay-27-2025, 21:17:50 GMT

Much of Bayesian inference centers around the design of estimators for inverse problems which are optimal assuming the data comes from a known prior. But what do these optimality guarantees mean if the prior is unknown? In recent years, algorithm unrolling has emerged as deep learning's answer to this age-old question: design a neural network whose layers can in principle simulate iterations of inference algorithms and train on data generated by the unknown prior. Despite its empirical success, however, it has remained unclear whether this method can provably recover the performance of its optimal, prior-aware counterparts.In this work, we prove the first rigorous learning guarantees for neural networks based on unrolling approximate message passing (AMP). For compressed sensing, we prove that when trained on data drawn from a product prior, the layers of the network approximately converge to the same denoisers used in Bayes AMP. We also provide extensive numerical experiments for compressed sensing and rank-one matrix estimation demonstrating the advantages of our unrolled architecture --- in addition to being able to obliviously adapt to general priors, it exhibits improvements over Bayes AMP in more general settings of low dimensions, non-Gaussian designs, and non-product priors.

artificial intelligence, machine learning, perform optimal bayesian inference, (4 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.64)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.40)

Add feedback

Entropy testing and its application to testing Bayesian networks

Neural Information Processing SystemsMay-27-2025, 20:57:03 GMT

This paper studies the problem of \emph{entropy identity testing}: given sample access to a distribution p and a fully described distribution q (both are discrete distributions over the support of size k), and the promise that either p q or H(p) - H(q) \geqslant \varepsilon, where H(\cdot) denotes the Shannon entropy, a tester needs to distinguish between the two cases with high probability. This improves on the sample complexity bound of \tilde{O}(2 {d/2}n 2/\varepsilon 4) from Canonne, Diakonikolas, Kane, and Stewart (2020), which required an additional assumption on the structure of the (unknown) Bayesian network.

bayesian network, testing bayesian network, varepsilon, (6 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.71)

Add feedback