AITopics | Bayesian Inference

Collaborating Authors

Bayesian Inference

Bayes' Theorem allows a program to infer the probabilities of likely causes from the probabilities of their effects, when what it is given are the probabilities of effects, given the causes.

News Overviews Instructional Materials AI-Alerts Classics

Bayesian Nonparametrics Meets Data-Driven Distributionally Robust Optimization

Neural Information Processing SystemsOct-10-2025, 00:22:04 GMT

Training machine learning and statistical models often involves optimizing a data-driven risk criterion.

criterion, experiment, procedure, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Travis County > Austin (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Arizona (0.04)
(2 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Add feedback

Stick-Breaking Mixture Normalizing Flows with Component-Wise Tail Adaptation for Variational Inference

Han, Seungsu, Hwang, Juyoung, Chang, Won

arXiv.org Machine LearningOct-10-2025

Normalizing flows with a Gaussian base provide a computationally efficient way to approximate posterior distributions in Bayesian inference, but they often struggle to capture complex posteriors with multimodality and heavy tails. We propose a stick-breaking mixture base with component-wise tail adaptation (StiCTAF) for posterior approximation. The method first learns a flexible mixture base to mitigate the mode-seeking bias of reverse KL divergence through a weighted average of component-wise ELBOs. It then estimates local tail indices of unnormalized densities and finally refines each mixture component using a shared backbone combined with component-specific tail transforms calibrated by the estimated indices. This design enables accurate mode coverage and anisotropic tail modeling while retaining exact density evaluation and stable optimization. Experiments on synthetic posteriors demonstrate improved tail recovery and better coverage of multiple modes compared to benchmark models. We also present a real-data analysis illustrating the practical benefits of our approach for posterior inference.

base distribution, inference, posterior, (15 more...)

arXiv.org Machine Learning

2510.07965

Country:

Asia > South Korea > Seoul > Seoul (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > Tennessee (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)

Add feedback

Rotated Mean-Field Variational Inference and Iterative Gaussianization

Chen, Yifan, Liu, Sifan

arXiv.org Machine LearningOct-10-2025

We propose to perform mean-field variational inference (MFVI) in a rotated coordinate system that reduces correlations between variables. The rotation is determined by principal component analysis (PCA) of a cross-covariance matrix involving the target's score function. Compared with standard MFVI along the original axes, MFVI in this rotated system often yields substantially more accurate approximations with negligible additional cost. MFVI in a rotated coordinate system defines a rotation and a coordinatewise map that together move the target closer to Gaussian. Iterating this procedure yields a sequence of transformations that progressively transforms the target toward Gaussian. The resulting algorithm provides a computationally efficient way to construct flow-like transport maps: it requires only MFVI subproblems, avoids large-scale optimization, and yields transformations that are easy to invert and evaluate. In Bayesian inference tasks, we demonstrate that the proposed method achieves higher accuracy than standard MFVI, while maintaining much lower computational cost than conventional normalizing flows.

approximation, mfvi, rotation, (15 more...)

arXiv.org Machine Learning

2510.07732

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)

Add feedback

From Data to Rewards: a Bilevel Optimization Perspective on Maximum Likelihood Estimation

Benechehab, Abdelhakim, Singer, Gabriel, Léger, Corentin, Hili, Youssef Attia El, Paolo, Giuseppe, Thomas, Albert, Filippone, Maurizio, Kégl, Balázs

arXiv.org Machine LearningOct-10-2025

Generative models form the backbone of modern machine learning, underpinning state-of-the-art systems in text, vision, and multimodal applications. While Maximum Likelihood Estimation has traditionally served as the dominant training paradigm, recent work have highlighted its limitations, particularly in generalization and susceptibility to catastrophic forgetting compared to Reinforcement Learning techniques, such as Policy Gradient methods. However, these approaches depend on explicit reward signals, which are often unavailable in practice, leaving open the fundamental problem of how to align generative models when only high-quality datasets are accessible. In this work, we address this challenge via a Bilevel Optimization framework, where the reward function is treated as the optimization variable of an outer-level problem, while a policy gradient objective defines the inner-level. We then conduct a theoretical analysis of this optimization problem in a tractable setting and extract insights that, as we demonstrate, generalize to applications such as tabular classification and model-based reinforcement learning. We release the code at https://github.com/abenechehab/nll_to_po .

arxiv, international conference, reward function, (11 more...)

arXiv.org Machine Learning

2510.07624

Country:

North America > United States (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Spain > Canary Islands (0.04)
(2 more...)

Genre: Research Report (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)

Add feedback

metabeta - A fast neural model for Bayesian mixed-effects regression

Kipnis, Alex, Binz, Marcel, Schulz, Eric

arXiv.org Machine LearningOct-10-2025

Hierarchical data with multiple observations per group is ubiquitous in empirical sciences and is often analyzed using mixed-effects regression. In such models, Bayesian inference gives an estimate of uncertainty but is analytically intractable and requires costly approximation using Markov Chain Monte Carlo (MCMC) methods. Neural posterior estimation shifts the bulk of computation from inference time to pre-training time, amortizing over simulated datasets with known ground truth targets. We propose metabeta, a transformer-based neural network model for Bayesian mixed-effects regression. Using simulated and real data, we show that it reaches stable and comparable performance to MCMC-based parameter estimation at a fraction of the usually required time.

dataset, fast neural model, posterior, (14 more...)

arXiv.org Machine Learning

2510.07473

Country:

North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)

Add feedback

If Probable, Then Acceptable? Understanding Conditional Acceptability Judgments in Large Language Models

Orth, Jasmin, Mondorf, Philipp, Plank, Barbara

arXiv.org Artificial IntelligenceOct-10-2025

Conditional acceptability refers to how plausible a conditional statement is perceived to be. It plays an important role in communication and reasoning, as it influences how individuals interpret implications, assess arguments, and make decisions based on hypothetical scenarios. When humans evaluate how acceptable a conditional "If A, then B" is, their judgments are influenced by two main factors: the $\textit{conditional probability}$ of $B$ given $A$, and the $\textit{semantic relevance}$ of the antecedent $A$ given the consequent $B$ (i.e., whether $A$ meaningfully supports $B$). While prior work has examined how large language models (LLMs) draw inferences about conditional statements, it remains unclear how these models judge the $\textit{acceptability}$ of such statements. To address this gap, we present a comprehensive study of LLMs' conditional acceptability judgments across different model families, sizes, and prompting strategies. Using linear mixed-effects models and ANOVA tests, we find that models are sensitive to both conditional probability and semantic relevance-though to varying degrees depending on architecture and prompting style. A comparison with human data reveals that while LLMs incorporate probabilistic and semantic cues, they do so less consistently than humans. Notably, larger models do not necessarily align more closely with human judgments.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2510.08388

Country: Europe > United Kingdom (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.96)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.74)

Add feedback

Bayesian Decision Making around Experts

Ornia, Daniel Jarne, Dyer, Joel, Bishop, Nicholas, Calinescu, Anisoara, Wooldridge, Michael

arXiv.org Artificial IntelligenceOct-10-2025

Complex learning agents are increasingly deployed alongside existing experts, such as human operators or previously trained agents. However, it remains unclear how should learners optimally incorporate certain forms of expert data, which may differ in structure from the learner's own action-outcome experiences. We study this problem in the context of Bayesian multi-armed bandits, considering: (i) offline settings, where the learner receives a dataset of outcomes from the expert's optimal policy before interaction, and (ii) simultaneous settings, where the learner must choose at each step whether to update its beliefs based on its own experience, or based on the outcome simultaneously achieved by an expert. We formalize how expert data influences the learner's posterior, and prove that pretraining on expert outcomes tightens information-theoretic regret bounds by the mutual information between the expert data and the optimal action. For the simultaneous setting, we propose an information-directed rule where the learner processes the data source that maximizes their one-step information gain about the optimal action. Finally, we propose strategies for how the learner can infer when to trust the expert and when not to, safeguarding the learner for the cases where the expert is ineffective or compromised. By quantifying the value of expert data, our framework provides practical, information-theoretic algorithms for agents to intelligently decide when to learn from others.

artificial intelligence, data mining, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2510.08113

Country: North America > United States (0.14)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Data Science > Data Mining > Big Data (0.66)

Add feedback

Unsupervised Radio Map Construction in Mixed LoS/NLoS Indoor Environments

Xing, Zheng, Chen, Junting

arXiv.org Artificial IntelligenceOct-10-2025

Radio maps are essential for enhancing wireless communications and localization. However, existing methods for constructing radio maps typically require costly calibration processes to collect location-labeled channel state information (CSI) datasets. This paper aims to recover the data collection trajectory directly from the channel propagation sequence, eliminating the need for location calibration. The key idea is to employ a hidden Markov model (HMM)-based framework to conditionally model the channel propagation matrix, while simultaneously modeling the location correlation in the trajectory. The primary challenges involve modeling the complex relationship between channel propagation in multiple-input multiple-output (MIMO) networks and geographical locations, and addressing both line-of-sight (LOS) and non-line-of-sight (NLOS) indoor conditions. In this paper, we propose an HMM-based framework that jointly characterizes the conditional propagation model and the evolution of the user trajectory. Specifically, the channel propagation in MIMO networks is modeled separately in terms of power, delay, and angle, with distinct models for LOS and NLOS conditions. The user trajectory is modeled using a Gaussian-Markov model. The parameters for channel propagation, the mobility model, and LOS/NLOS classification are optimized simultaneously. Experimental validation using simulated MIMO-Orthogonal Frequency-Division Multiplexing (OFDM) networks with a multi-antenna uniform linear arrays (ULA) configuration demonstrates that the proposed method achieves an average localization accuracy of 0.65 meters in an indoor environment, covering both LOS and NLOS regions. Moreover, the constructed radio map enables localization with a reduced error compared to conventional supervised methods, such as k-nearest neighbors (KNN), support vector machine (SVM), and deep neural network (DNN).

artificial intelligence, machine learning, trajectory, (17 more...)

arXiv.org Artificial Intelligence

2510.08015

Country: Asia > China > Guangdong Province (0.15)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Add feedback

PRESCRIBE: Predicting Single-Cell Responses with Bayesian Estimation

Cheng, Jiabei, Chi, Changxi, Zhou, Jingbo, Xin, Hongyi, Xia, Jun

arXiv.org Artificial IntelligenceOct-10-2025

In single-cell perturbation prediction, a central task is to forecast the effects of perturbing a gene unseen in the training data. The efficacy of such predictions depends on two factors: (1) the similarity of the target gene to those covered in the training data, which informs model (epistemic) uncertainty, and (2) the quality of the corresponding training data, which reflects data (aleatoric) uncertainty. Both factors are critical for determining the reliability of a prediction, particularly as gene perturbation is an inherently stochastic biochemical process. In this paper, we propose PRESCRIBE (PREdicting Single-Cell Response wIth Bayesian Estimation), a multivariate deep evidential regression framework designed to measure both sources of uncertainty jointly. Our analysis demonstrates that PRESCRIBE effectively estimates a confidence score for each prediction, which strongly correlates with its empirical accuracy. This capability enables the filtering of untrustworthy results, and in our experiments, it achieves steady accuracy improvements of over 3% compared to comparable baselines.

artificial intelligence, machine learning, prediction, (21 more...)

arXiv.org Artificial Intelligence

2510.07964

Country: Asia > China (0.46)

Genre: