AITopics | Bayesian Learning

Collaborating Authors

Bayesian Learning

A Bayesian network, Bayes network, belief network, Bayes(ian) model or probabilistic directed acyclic graphical model is a probabilistic graphical model (a type of statistical model) that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

Inference-Time Scaling of Diffusion Language Models with Particle Gibbs Sampling

Dang, Meihua, Han, Jiaqi, Xu, Minkai, Xu, Kai, Srivastava, Akash, Ermon, Stefano

arXiv.org Artificial IntelligenceOct-7-2025

Discrete diffusion models have recently emerged as strong alternatives to autoregressive language models, matching their performance through large-scale training. However, inference-time control remains relatively underexplored. In this work, we study how to steer generation toward desired rewards without retraining the models. Prior methods typically resample or filter within a single denoising trajectory, optimizing rewards step-by-step without trajectory-level refinement. We introduce particle Gibbs sampling for diffusion language models (PG-DLM), a novel inference-time algorithm enabling trajectory-level refinement while preserving generation perplexity under reward optimization. PG-DLM constructs a Markov chain over full denoising trajectories and applies a conditional sequential Monte Carlo kernel to resample them. We derive theoretical guarantees for convergence, including asymptotic consistency and variance bounds. Within this framework, we further analyze trade-offs across four key axes for inference-time scaling under fixed budgets: iterations, samples, denoising steps, and reward estimation. Our analysis shows scaling iterations achieves the best reward-perplexity trade-off. Empirically, PG-DLM consistently outperforms prior methods using MDLM and LLaDA-8B as base models across a wide range of compute budgets for reward-guided generation tasks including toxicity and sentiment control as well as linguistic acceptability.

diffusion model, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2507.0839

Country: North America > United States (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.60)

Add feedback

A fast non-reversible sampler for Bayesian finite mixture models

Ascolani, Filippo, Zanella, Giacomo

arXiv.org Machine LearningOct-6-2025

Finite mixtures are a cornerstone of Bayesian modelling, and it is well-known that sampling from the resulting posterior distribution can be a hard task. In particular, popular reversible Markov chain Monte Carlo schemes are often slow to converge when the number of observations $n$ is large. In this paper we introduce a novel and simple non-reversible sampling scheme for Bayesian finite mixture models, which is shown to drastically outperform classical samplers in many scenarios of interest, especially during convergence phase and when components in the mixture have non-negligible overlap. At the theoretical level, we show that the performance of the proposed non-reversible scheme cannot be worse than the standard one, in terms of asymptotic variance, by more than a factor of four; and we provide a scaling limit analysis suggesting that the non-reversible sampler can reduce the convergence time from O$(n^2)$ to O$(n)$. We also discuss why the statistical features of mixture models make them an ideal case for the use of non-reversible discrete samplers.

iteration, mixture model, sampler, (14 more...)

arXiv.org Machine Learning

2510.03226

Country:

North America > United States > North Carolina > Durham County > Durham (0.04)
Europe > Italy > Lombardy > Milan (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Total Robustness in Bayesian Nonlinear Regression for Measurement Error Problems under Model Misspecification

Chen, Mengqi, Dellaporta, Charita, Berrett, Thomas B., Damoulas, Theodoros

arXiv.org Machine LearningOct-6-2025

Modern regression analyses are often undermined by covariate measurement error, misspecification of the regression model, and misspecification of the measurement error distribution. We present, to the best of our knowledge, the first Bayesian nonparametric framework targeting total robustness that tackles all three challenges in general nonlinear regression. The framework assigns a Dirichlet process prior to the latent co-variate-response distribution and updates it with posterior pseudo-samples of the latent covariates, thereby providing the Dirichlet process posterior with observation-informed latent inputs and yielding estimators that minimise the discrepancy between Dirichlet process realisations and the model-induced joint law. This design allows practitioners to (i) encode prior beliefs, (ii) choose between pseudo-sampling latent covariates or working directly with error-prone observations, and (iii) tune the influence of prior and data. We establish generalisation bounds that tighten whenever the prior or pseudo-sample generator aligns with the underlying data generating process, ensuring robustness without sacrificing consistency. A gradient-based algorithm enables efficient computations; simulations and two real-world studies show lower estimation error and reduced estimation sensitivity to misspecification compared to Bayesian and frequentist competitors. The framework, therefore, offers a practical and interpretable paradigm for trustworthy regression when data and models are jointly imperfect.

base xy, misspecification, pp base xy, (15 more...)

arXiv.org Machine Learning

2510.03131

Country:

North America > United States > Oregon > Multnomah County > Portland (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report > Experimental Study (0.87)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

Add feedback

Oracle-based Uniform Sampling from Convex Bodies

Dang, Thanh, Liang, Jiaming

arXiv.org Machine LearningOct-6-2025

We propose new Markov chain Monte Carlo algorithms to sample a uniform distribution on a convex body $K$. Our algorithms are based on the Alternating Sampling Framework/proximal sampler, which uses Gibbs sampling on an augmented distribution and assumes access to the so-called restricted Gaussian oracle (RGO). The key contribution of this work is the efficient implementation of RGO for uniform sampling on $K$ via rejection sampling and access to either a projection oracle or a separation oracle on $K$. In both oracle cases, we establish non-asymptotic complexities to obtain unbiased samples where the accuracy is measured in Rényi divergence or $χ^2$-divergence.

algorithm, oracle, rejection, (16 more...)

arXiv.org Machine Learning

2510.02983

Country: North America > United States > New York > Monroe County > Rochester (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Add feedback

Injecting Measurement Information Yields a Fast and Noise-Robust Diffusion-Based Inverse Problem Solver

Patsenker, Jonathan, Li, Henry, Ko, Myeongseob, Jia, Ruoxi, Kluger, Yuval

arXiv.org Artificial IntelligenceOct-6-2025

Diffusion models have been firmly established as principled zero-shot solvers for linear and nonlinear inverse problems, owing to their powerful image prior and iterative sampling algorithm. These approaches often rely on Tweedie's formula, which relates the diffusion variate $\mathbf{x}_t$ to the posterior mean $\mathbb{E} [\mathbf{x}_0 | \mathbf{x}_t]$, in order to guide the diffusion trajectory with an estimate of the final denoised sample $\mathbf{x}_0$. However, this does not consider information from the measurement $\mathbf{y}$, which must then be integrated downstream. In this work, we propose to estimate the conditional posterior mean $\mathbb{E} [\mathbf{x}_0 | \mathbf{x}_t, \mathbf{y}]$, which can be formulated as the solution to a lightweight, single-parameter maximum likelihood estimation problem. The resulting prediction can be integrated into any standard sampler, resulting in a fast and memory-efficient inverse solver. Our optimizer is amenable to a noise-aware likelihood-based stopping criteria that is robust to measurement noise in $\mathbf{y}$. We demonstrate comparable or improved performance against a wide selection of contemporary inverse solvers across multiple datasets and tasks.

artificial intelligence, inverse problem, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2508.02964

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Add feedback

Bayesian E(3)-Equivariant Interatomic Potential with Iterative Restratification of Many-body Message Passing

Willow, Soohaeng Yoo, Park, Tae Hyeon, Sim, Gi Beom, Moon, Sung Wook, Min, Seung Kyu, Yang, D. ChangMo, Kim, Hyun Woo, Lee, Juho, Myung, Chang Woo

arXiv.org Artificial IntelligenceOct-6-2025

Machine learning potentials (MLPs) have become essential for large-scale atomistic simulations, enabling ab initio-level accuracy with computational efficiency. However, current MLPs struggle with uncertainty quantification, limiting their reliability for active learning, calibration, and out-of-distribution (OOD) detection. We address these challenges by developing Bayesian E(3) equivariant MLPs with iterative restratification of many-body message passing. Our approach introduces the joint energy-force negative log-likelihood (NLL$_\text{JEF}$) loss function, which explicitly models uncertainty in both energies and interatomic forces, yielding superior accuracy compared to conventional NLL losses. We systematically benchmark multiple Bayesian approaches, including deep ensembles with mean-variance estimation, stochastic weight averaging Gaussian, improved variational online Newton, and laplace approximation by evaluating their performance on uncertainty prediction, OOD detection, calibration, and active learning tasks. We further demonstrate that NLL$_\text{JEF}$ facilitates efficient active learning by quantifying energy and force uncertainties. Using Bayesian active learning by disagreement (BALD), our framework outperforms random sampling and energy-uncertainty-based sampling. Our results demonstrate that Bayesian MLPs achieve competitive accuracy with state-of-the-art models while enabling uncertainty-guided active learning, OOD detection, and energy/forces calibration. This work establishes Bayesian equivariant neural networks as a powerful framework for developing uncertainty-aware MLPs for atomistic simulations at scale.

artificial intelligence, bayesian inference, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2510.03046

Country:

Asia > South Korea (0.46)
North America > United States (0.46)

Genre: Research Report > New Finding (0.68)

Industry: Energy (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Scalable Deep Generative Relational Model with High-Order Node Dependence

Xuhui Fan, Bin Li, Caoyuan Li, Scott SIsson, Ling Chen

Neural Information Processing SystemsOct-3-2025, 09:37:21 GMT

We propose a probabilistic framework for modelling and exploring the latent structure of relational data. Given feature information for the nodes in a network, the scalable deep generative relational model (SDREM) builds a deep network architecture that can approximate potential nonlinear mappings between nodes' feature information and the nodes' latent representations. Our contribution is two-fold: (1) We incorporate high-order neighbourhood structure information to generate the latent representations at each node, which vary smoothly over the network.

deep network architecture, relational data, sdrem, (12 more...)

Neural Information Processing Systems

Country:

Asia > China > Shanghai > Shanghai (0.04)
Oceania > Australia > New South Wales (0.04)
North America > Canada (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Databases (0.96)
Information Technology > Communications > Networks (0.91)
(3 more...)

Add feedback

a33f5792b2a9a51ddd0111b3ac6e0e76-AuthorFeedback.pdf

Neural Information Processing SystemsOct-3-2025, 09:37:08 GMT

artificial intelligence, machine learning, reviewer 1, (11 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

Add feedback

3948ead63a9f2944218de038d8934305-Reviews.html

Neural Information Processing SystemsOct-3-2025, 09:17:46 GMT

First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. The bottom line of this paper is an efficient algorithm for finding maximum likelihood estimators for elliptically contoured distributions, a class of densities that includes the Gaussian and various generalizations of it. For the Gaussian itself, that optimization is straightforward, it's the generalizations where the new algorithm provides real advantages. One could argue that this focus on a relatively arcane family of distributions (Kotz-type) limits the utility of this paper. But I think it's actually the other way round: The paper may spark new interest at NIPS in these models.

algorithm, elliptically contoured distribution, matrix, (12 more...)

Neural Information Processing Systems

Country: North America > United States > Nevada (0.04)

Genre: Research Report (0.47)

Technology: