AITopics | Bayesian Inference

Bayesian Inference

Bayes' theorem allows a program to infer the probabilities of likely causes from observed effects when what it is given is the reverse: the probabilities of the effects given the causes.
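As a minimal sketch of that inversion for a discrete set of causes: the function name `posterior` and the numbers below are hypothetical and chosen only for illustration. The code multiplies each prior P(cause) by the likelihood P(effect | cause) and normalizes by the total probability of the effect.

```python
def posterior(priors, likelihoods):
    """Return P(cause | effect) for each cause via Bayes' theorem.

    priors:      dict mapping cause -> P(cause)
    likelihoods: dict mapping cause -> P(effect | cause)
    """
    # Joint probability of each cause together with the observed effect.
    joint = {c: priors[c] * likelihoods[c] for c in priors}
    # Marginal probability of the effect (law of total probability).
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("the observed effect has zero probability under every cause")
    return {c: j / evidence for c, j in joint.items()}


# Hypothetical example: two candidate causes of an observed fever.
priors = {"flu": 0.1, "cold": 0.9}          # P(cause), assumed
likelihoods = {"flu": 0.9, "cold": 0.2}     # P(fever | cause), assumed
post = posterior(priors, likelihoods)
print(post)
```

With these assumed numbers the rarer cause ends up with a posterior of one third, because its higher likelihood only partly offsets its lower prior.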

Variational Gaussian Process State-Space Models

Roger Frigola, Yutian Chen, Carl Edward Rasmussen

Neural Information Processing Systems, Feb-8-2025, 18:08:27 GMT

State-space models have been successfully used for more than fifty years in different areas of science and engineering. We present a procedure for efficient variational Bayesian learning of nonlinear state-space models based on sparse Gaussian processes. The result of learning is a tractable posterior over nonlinear dynamical systems. In comparison to conventional parametric models, we offer the possibility to straightforwardly trade off model capacity and computational cost whilst avoiding overfitting. Our main algorithm uses a hybrid inference approach combining variational Bayes and sequential Monte Carlo.

artificial intelligence, bayesian inference, machine learning, (19 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Italy > Sardinia (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Diverse Sequential Subset Selection for Supervised Video Summarization

Boqing Gong, Wei-Lun Chao, Kristen Grauman, Fei Sha

Neural Information Processing Systems, Feb-8-2025, 17:13:16 GMT

Video summarization is a challenging problem with great application potential. Whereas prior approaches, largely unsupervised in nature, focus on sampling useful frames and assembling them as summaries, we consider video summarization as a supervised subset selection problem. Our idea is to teach the system to learn from human-created summaries how to select informative and diverse subsets, so as to best meet evaluation metrics derived from human-perceived quality. To this end, we propose the sequential determinantal point process (seqDPP), a probabilistic model for diverse sequential subset selection. Our novel seqDPP heeds the inherent sequential structures in video data, thus overcoming the deficiency of the standard DPP, which treats video frames as randomly permutable items. Meanwhile, seqDPP retains the power of modeling diverse subsets, essential for summarization. Our extensive results of summarizing videos from 3 datasets demonstrate the superior performance of our method, compared to not only existing unsupervised methods but also naive applications of the standard DPP model.
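To illustrate why a determinantal point process favors diverse subsets, here is a small sketch of the standard DPP score only, not the seqDPP proposed in the abstract. A standard DPP assigns a subset a probability proportional to the determinant of the corresponding kernel submatrix. The kernel values, the helper names `det` and `dpp_score`, and the three-frame setup are assumptions made for illustration.

```python
def det(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]  # work on a copy
    n = len(a)
    result = 1.0
    for col in range(n):
        # Pick the row with the largest absolute entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0  # numerically singular
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for k in range(col, n):
                a[r][k] -= factor * a[col][k]
    return result


def dpp_score(kernel, subset):
    """Unnormalized standard-DPP score: det of the kernel restricted to subset."""
    sub = [[kernel[i][j] for j in subset] for i in subset]
    return det(sub)


# Hypothetical similarity kernel over three frames:
# frames 0 and 1 are near-duplicates, frame 2 is different from both.
L = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
similar_pair = dpp_score(L, [0, 1])
diverse_pair = dpp_score(L, [0, 2])
print(similar_pair, diverse_pair)
```

Under this assumed kernel the near-duplicate pair scores far lower than the dissimilar pair, which is the diversity preference the abstract refers to. The score ignores frame order, which is the limitation of the standard DPP that the abstract says seqDPP addresses.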

artificial intelligence, bayesian inference, machine learning, (17 more...)

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.29)
North America > United States > Texas > Travis County > Austin (0.14)

Technology:

Information Technology > Artificial Intelligence > Vision (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Compressive Sensing of Signals from a GMM with Sparse Precision Matrices

Jianbo Yang, Xuejun Liao, Minhua Chen, Lawrence Carin

Neural Information Processing Systems, Feb-8-2025, 15:56:42 GMT

This paper is concerned with compressive sensing of signals drawn from a Gaussian mixture model (GMM) with sparse precision matrices. Previous work has shown: (i) a signal drawn from a given GMM can be perfectly reconstructed from r noise-free measurements if the (dominant) rank of each covariance matrix is less than r; (ii) a sparse Gaussian graphical model can be efficiently estimated from fully-observed training signals using graphical lasso. This paper addresses a problem more challenging than both (i) and (ii), by assuming that the GMM is unknown and each signal is only observed through incomplete linear measurements. Under these challenging assumptions, we develop a hierarchical Bayesian method to simultaneously estimate the GMM and recover the signals using solely the incomplete measurements and a Bayesian shrinkage prior that promotes sparsity of the Gaussian precision matrices. In addition, we provide theoretical performance bounds to relate the reconstruction error to the number of signals for which measurements are available, the sparsity level of precision matrices, and the "incompleteness" of measurements. The proposed method is demonstrated extensively on compressive sensing of imagery and video, and the results with simulated and hardware-acquired real measurements show significant performance improvement over state-of-the-art methods.

matrix, precision matrix, sparse-gmm, (13 more...)

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.66)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Expectation Backpropagation: Parameter-Free Training of Multilayer Neural Networks with Continuous or Discrete Weights

Daniel Soudry, Itay Hubara, Ron Meir

Neural Information Processing Systems, Feb-8-2025, 15:55:06 GMT

Multilayer Neural Networks (MNNs) are commonly trained using gradient descent-based methods, such as BackPropagation (BP). Inference in probabilistic graphical models is often done using variational Bayes methods, such as Expectation Propagation (EP). We show how an EP based approach can also be used to train deterministic MNNs. Specifically, we approximate the posterior of the weights given the data using a "mean-field" factorized distribution, in an online setting. Using online EP and the central limit theorem we find an analytical approximation to the Bayes update of this posterior, as well as the resulting Bayes estimates of the weights and outputs. Despite a different origin, the resulting algorithm, Expectation BackPropagation (EBP), is very similar to BP in form and efficiency. However, it has several additional advantages: (1) Training is parameter-free, given initial conditions (prior) and the MNN architecture. This is useful for large-scale problems, where parameter tuning is a major challenge.

artificial intelligence, machine learning, mnn, (18 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > New York (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.81)

Predictive Entropy Search for Efficient Global Optimization of Black-box Functions

Neural Information Processing Systems, Feb-8-2025, 15:33:06 GMT

We propose a novel information-theoretic approach for Bayesian optimization called Predictive Entropy Search (PES). At each iteration, PES selects the next evaluation point that maximizes the expected information gained with respect to the global maximum. PES codifies this intractable acquisition function in terms of the expected reduction in the differential entropy of the predictive distribution. This reformulation allows PES to obtain approximations that are both more accurate and more efficient than alternatives such as Entropy Search (ES). Furthermore, PES can easily perform a fully Bayesian treatment of the model hyperparameters while ES cannot. We evaluate PES in both synthetic and real-world applications, including optimization problems in machine learning, finance, biotechnology, and robotics. We show that the increased accuracy of PES leads to significant gains in optimization performance.

approximation, artificial intelligence, machine learning, (18 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > Canada > Alberta (0.14)
North America > United States > Massachusetts (0.04)
North America > Canada > British Columbia (0.04)

Genre: Research Report (0.46)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.48)
Transportation > Air (0.41)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Review for NeurIPS paper: Log-Likelihood Ratio Minimizing Flows: Towards Robust and Quantifiable Neural Distribution Alignment

Neural Information Processing Systems, Feb-8-2025, 02:33:19 GMT

Weaknesses: Central parts of the paper are unclear, e.g., in line 80 \log P_M (X; \theta) should be the negative cross entropy. The only quantitative results are on adaptation from USPS to MNIST in line 268. However, prior work [1] achieves 96.5% accuracy in comparison to the 55% accuracy achieved by the proposed method. It would be desirable to evaluate the proposed approach on the more complex Facades/Maps/Cityscapes using the MSE metric to facilitate comparison with AlignFlow and [1]. It is unclear how the inductive bias from each of the datasets influences the shared space.

log-likelihood ratio minimizing flow, neurips paper, quantifiable neural distribution alignment, (2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.40)

Review for NeurIPS paper: Log-Likelihood Ratio Minimizing Flows: Towards Robust and Quantifiable Neural Distribution Alignment

Neural Information Processing Systems, Feb-8-2025, 02:33:12 GMT

After discussion, all reviewers, and the meta-reviewer, agree that the paper should be accepted. As the authors show, the method in its current form may not scale well to higher dimensions. While a method without this limitation would obviously be preferable, the reviewers agree that this limitation can be addressed in future work, where the connection with GANs that the authors establish may be helpful.

log-likelihood ratio minimizing flow, neurips paper, quantifiable neural distribution alignment, (2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.40)

PIPA: Preference Alignment as Prior-Informed Statistical Estimation

Li, Junbo, Wang, Zhangyang, Liu, Qiang

arXiv.org Machine Learning, Feb-8-2025

Offline preference alignment for language models such as Direct Preference Optimization (DPO) is favored for its effectiveness and simplicity, eliminating the need for costly reinforcement learning. Various offline algorithms have been developed for different data settings, yet they lack a unified understanding. In this study, we introduce Prior-Informed Preference Alignment (PIPA), a unified, RL-free probabilistic framework that formulates language model preference alignment as a Maximum Likelihood Estimation (MLE) problem with prior constraints. This method effectively accommodates both paired and unpaired data, as well as answer and step-level annotations. We illustrate that DPO and KTO are special cases with different prior constraints within our framework. By integrating different types of prior information, we developed two variations of PIPA: PIPA-M and PIPA-N. Both algorithms demonstrate a $3\sim10\%$ performance enhancement on the GSM8K and MATH benchmarks across all configurations, achieving these gains without additional training or computational costs compared to existing algorithms.

machine learning, natural language, preprint arxiv, (18 more...)

2502.05773

Country: North America > United States > Texas > Travis County > Austin (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.86)

Enhancing Hallucination Detection through Noise Injection

Liu, Litian, Pourreza, Reza, Panchal, Sunny, Bhattacharyya, Apratim, Qin, Yao, Memisevic, Roland

arXiv.org Artificial Intelligence, Feb-8-2025

Large Language Models (LLMs) are prone to generating plausible yet incorrect responses, known as hallucinations. Effectively detecting hallucinations is therefore crucial for the safe deployment of LLMs. Recent research has linked hallucinations to model uncertainty, suggesting that hallucinations can be detected by measuring dispersion over answer distributions obtained from a set of samples drawn from a model. While drawing from the distribution over tokens defined by the model is a natural way to obtain samples, in this work, we argue that it is sub-optimal for the purpose of detecting hallucinations. We show that detection can be improved significantly by taking into account model uncertainty in the Bayesian sense. To this end, we propose a very simple and efficient approach that perturbs an appropriate subset of model parameters, or equivalently hidden unit activations, during sampling. We demonstrate its effectiveness across a wide range of datasets and model architectures.
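The dispersion idea mentioned in the abstract can be sketched as the entropy of the empirical distribution over sampled answers. This is only one possible dispersion measure and does not implement the paper's noise-injection step. The function name `answer_entropy` and the sample answers are hypothetical, and answers are assumed to be already normalized to comparable strings.

```python
import math
from collections import Counter


def answer_entropy(answers):
    """Shannon entropy (in nats) of the empirical distribution over answers.

    Higher entropy means the sampled answers disagree more, which the
    uncertainty-based view treats as a sign of a possible hallucination.
    """
    if not answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(answers)
    n = len(answers)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


# Hypothetical samples drawn for two different questions.
consistent = ["Paris", "Paris", "Paris", "Paris"]
scattered = ["1912", "1915", "1921", "1912"]
print(answer_entropy(consistent), answer_entropy(scattered))
```

A detector built on this would flag a response when the entropy exceeds some threshold; choosing that threshold, and how samples are drawn, is left open here.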

hallucination detection, large language model, machine learning, (15 more...)

2502.03799

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Mexico > Mexico City > Mexico City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Media (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

dynoGP: Deep Gaussian Processes for dynamic system identification

Benavoli, Alessio, Piga, Dario, Forgione, Marco, Zaffalon, Marco

arXiv.org Machine Learning, Feb-8-2025

In this work, we present a novel approach to system identification for dynamical systems, based on a specific class of Deep Gaussian Processes (Deep GPs). These models are constructed by interconnecting linear dynamic GPs (equivalent to stochastic linear time-invariant dynamical systems) and static GPs (to model static nonlinearities). Our approach combines the strengths of data-driven methods, such as those based on neural network architectures, with the ability to output a probability distribution. This offers a more comprehensive framework for system identification that includes uncertainty quantification. Using both simulated and real-world data, we demonstrate the effectiveness of the proposed approach.

artificial intelligence, identification, machine learning, (21 more...)

2502.0562

Country:

Europe > Ireland > Leinster > County Dublin > Dublin (0.14)
Europe > Sweden > Uppsala County > Uppsala (0.04)
Oceania > Australia > Victoria (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry: Energy (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
