AITopics | Bayesian Inference

Collaborating Authors

Bayesian Inference

Bayes' Theorem allows a program to infer the probabilities of likely causes from the probabilities of their effects, when what it is given are the probabilities of effects, given the causes.

News Overviews Instructional Materials AI-Alerts Classics

Variational Bayes Portfolio Construction

Nguyen, Nicolas, Ridgway, James, Vernade, Claire

arXiv.org Machine LearningNov-9-2024

Portfolio construction is the science of balancing reward and risk; it is at the core of modern finance. In this paper, we tackle the question of optimal decision-making within a Bayesian paradigm, starting from a decision-theoretic formulation. Despite the inherent intractability of the optimal decision in any interesting scenarios, we manage to rewrite it as a saddle-point problem. Leveraging the literature on variational Bayes (VB), we propose a relaxation of the original problem. This novel methodology results in an efficient algorithm that not only performs well but is also provably convergent. Furthermore, we provide theoretical results on the statistical consistency of the resulting decision with the optimal Bayesian decision. Using real data, our proposal significantly enhances the speed and scalability of portfolio selection problems. We benchmark our results against state-of-the-art algorithms, as well as a Monte Carlo algorithm targeting the optimal decision.

artificial intelligence, decision support system, machine learning, (18 more...)

arXiv.org Machine Learning

2411.06192

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Banking & Finance > Trading (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Decision Support Systems (0.88)

Add feedback

Variational Low-Rank Adaptation Using IVON

Cong, Bai, Daheim, Nico, Shen, Yuesong, Cremers, Daniel, Yokota, Rio, Khan, Mohammad Emtiyaz, Möllenhoff, Thomas

arXiv.org Machine LearningNov-9-2024

We show that variational learning can significantly improve the accuracy and calibration of Low-Rank Adaptation (LoRA) without a substantial increase in the cost. We replace AdamW by the Improved Variational Online Newton (IVON) algorithm to finetune large language models. For Llama-2 with 7 billion parameters, IVON improves the accuracy over AdamW by 2.8% and expected calibration error by 4.6%. The accuracy is also better than the other Bayesian alternatives, yet the cost is lower and the implementation is easier. Our work provides additional evidence for the effectiveness of IVON for large language models.

ivon, large language model, machine learning, (16 more...)

arXiv.org Machine Learning

2411.04421

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)
Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)

Genre: Research Report (1.00)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.91)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Streaming Bayes GFlowNets

da Silva, Tiago, de Souza, Daniel Augusto, Mesquita, Diego

arXiv.org Artificial IntelligenceNov-8-2024

Bayes' rule naturally allows for inference refinement in a streaming fashion, without the need to recompute posteriors from scratch whenever new data arrives. In principle, Bayesian streaming is straightforward: we update our prior with the available data and use the resulting posterior as a prior when processing the next data chunk. In practice, however, this recipe entails i) approximating an intractable posterior at each time step; and ii) encapsulating results appropriately to allow for posterior propagation. For continuous state spaces, variational inference (VI) is particularly convenient due to its scalability and the tractability of variational posteriors. For discrete state spaces, however, state-of-the-art VI results in analytically intractable approximations that are ill-suited for streaming settings. To enable streaming Bayesian inference over discrete parameter spaces, we propose streaming Bayes GFlowNets (abbreviated as SB-GFlowNets) by leveraging the recently proposed GFlowNets -- a powerful class of amortized samplers for discrete compositional objects. Notably, SB-GFlowNet approximates the initial posterior using a standard GFlowNet and subsequently updates it using a tailored procedure that requires only the newly observed data. Our case studies in linear preference learning and phylogenetic inference showcase the effectiveness of SB-GFlowNets in sampling from an unnormalized posterior in a streaming setting. As expected, we also observe that SB-GFlowNets is significantly faster than repeatedly training a GFlowNet from scratch to sample from the full posterior.

artificial intelligence, bayesian inference, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2411.05899

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
(3 more...)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine (0.67)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Add feedback

Learning Mixtures of Experts with EM

Fruytier, Quentin, Mokhtari, Aryan, Sanghavi, Sujay

arXiv.org Machine LearningNov-8-2024

Mixtures of Experts (MoE) are Machine Learning models that involve partitioning the input space, with a separate "expert" model trained on each partition. Recently, MoE have become popular as components in today's large language models as a means to reduce training and inference costs. There, the partitioning function and the experts are both learnt jointly via gradient descent on the log-likelihood. In this paper we focus on studying the efficiency of the Expectation Maximization (EM) algorithm for the training of MoE models. We first rigorously analyze EM for the cases of linear or logistic experts, where we show that EM is equivalent to Mirror Descent with unit step size and a Kullback-Leibler Divergence regularizer. This perspective allows us to derive new convergence results and identify conditions for local linear convergence based on the signal-to-noise ratio (SNR). Experiments on synthetic and (small-scale) real-world data show that EM outperforms the gradient descent algorithm both in terms of convergence rate and the achieved accuracy.

artificial intelligence, exp, machine learning, (20 more...)

arXiv.org Machine Learning

2411.06056

Country:

North America > United States > Texas > Travis County > Austin (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > California (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Filling in Missing FX Implied Volatilities with Uncertainties: Improving VAE-Based Volatility Imputation

Gopal, Achintya

arXiv.org Machine LearningNov-8-2024

Missing data is a common problem in finance and often requires methods to fill in the gaps, or in other words, imputation. In this work, we focused on the imputation of missing implied volatilities for FX options. Prior work has used variational autoencoders (VAEs), a neural network-based approach, to solve this problem; however, using stronger classical baselines such as Heston with jumps can significantly outperform their results. We show that simple modifications to the architecture of the VAE lead to significant imputation performance improvements (e.g., in low missingness regimes, nearly cutting the error by half), removing the necessity of using $\beta$-VAEs. Further, we modify the VAE imputation algorithm in order to better handle the uncertainty in data, as well as to obtain accurate uncertainty estimates around imputed values.

artificial intelligence, machine learning, vae, (19 more...)

arXiv.org Machine Learning

2411.05998

Country:

Europe > France > Hauts-de-France > Nord > Lille (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(2 more...)

Genre: Research Report (0.63)

Industry: Banking & Finance > Trading (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Compactly-supported nonstationary kernels for computing exact Gaussian processes on big data

Risser, Mark D., Noack, Marcus M., Luo, Hengrui, Pandolfi, Ronald

arXiv.org Machine LearningNov-7-2024

The Gaussian process (GP) is a widely used probabilistic machine learning method for stochastic function approximation, stochastic modeling, and analyzing real-world measurements of nonlinear processes. Unlike many other machine learning methods, GPs include an implicit characterization of uncertainty, making them extremely useful across many areas of science, technology, and engineering. Traditional implementations of GPs involve stationary kernels (also termed covariance functions) that limit their flexibility and exact methods for inference that prevent application to data sets with more than about ten thousand points. Modern approaches to address stationarity assumptions generally fail to accommodate large data sets, while all attempts to address scalability focus on approximating the Gaussian likelihood, which can involve subjectivity and lead to inaccuracies. In this work, we explicitly derive an alternative kernel that can discover and encode both sparsity and nonstationarity. We embed the kernel within a fully Bayesian GP model and leverage high-performance computing resources to enable the analysis of massive data sets. We demonstrate the favorable performance of our novel kernel relative to existing exact and approximate GP methods across a variety of synthetic data examples. Furthermore, we conduct space-time prediction based on more than one million measurements of daily maximum temperature and verify that our results outperform state-of-the-art methods in the Earth sciences. More broadly, having access to exact GPs that use ultra-scalable, sparsity-discovering, nonstationary kernels allows GP methods to truly compete with a wide variety of machine learning methods.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

2411.05869

Country:

North America > United States > Texas > Harris County > Houston (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
(11 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Energy (1.00)
Government > Regional Government > North America Government > United States Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Add feedback

Inverse Transition Learning: Learning Dynamics from Demonstrations

Benac, Leo, Sharma, Abhishek, Parbhoo, Sonali, Doshi-Velez, Finale

arXiv.org Machine LearningNov-7-2024

We consider the problem of estimating the transition dynamics $T^*$ from near-optimal expert trajectories in the context of offline model-based reinforcement learning. We develop a novel constraint-based method, Inverse Transition Learning, that treats the limited coverage of the expert trajectories as a \emph{feature}: we use the fact that the expert is near-optimal to inform our estimate of $T^*$. We integrate our constraints into a Bayesian approach. Across both synthetic environments and real healthcare scenarios like Intensive Care Unit (ICU) patient management in hypotension, we demonstrate not only significant improvements in decision-making, but that our posterior can inform when transfer will be successful.

artificial intelligence, constraint, machine learning, (15 more...)

arXiv.org Machine Learning

2411.05174

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Asia > Middle East > Israel (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine > Health Care Providers & Services (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Differentiable Calibration of Inexact Stochastic Simulation Models via Kernel Score Minimization

Su, Ziwei, Klabjan, Diego

arXiv.org Artificial IntelligenceNov-7-2024

Stochastic simulation models are generative models that mimic complex systems to help with decision-making. The reliability of these models heavily depends on well-calibrated input model parameters. However, in many practical scenarios, only output-level data are available to learn the input model parameters, which is challenging due to the often intractable likelihood of the stochastic simulation model. Moreover, stochastic simulation models are frequently inexact, with discrepancies between the model and the target system. No existing methods can effectively learn and quantify the uncertainties of input parameters using only output-level data. In this paper, we propose to learn differentiable input parameters of stochastic simulation models using output-level data via kernel score minimization with stochastic gradient descent. We quantify the uncertainties of the learned input parameters using a frequentist confidence set procedure based on a new asymptotic normality result that accounts for model inexactness. The proposed method is evaluated on exact and inexact G/G/1 queueing models.

artificial intelligence, machine learning, modeling & simulation, (15 more...)

arXiv.org Artificial Intelligence

2411.05315

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)
(2 more...)

Add feedback

Uncertainty Prediction Neural Network (UpNet): Embedding Artificial Neural Network in Bayesian Inversion Framework to Quantify the Uncertainty of Remote Sensing Retrieval

Fan, Dasheng, Mu, Xihan, Lai, Yongkang, Xie, Donghui, Yan, Guangjian

arXiv.org Artificial IntelligenceNov-7-2024

For the retrieval of large-scale vegetation biophysical parameters, the inversion of radiative transfer models (RTMs) is the most commonly used approach. In recent years, Artificial Neural Network (ANN)-based methods have become the mainstream for inverting RTMs due to their high accuracy and computational efficiency. It has been widely used in the retrieval of biophysical variables (BV). However, due to the lack of the Bayesian inversion theory interpretation, it faces challenges in quantifying the retrieval uncertainty, a crucial metric for product quality validation and downstream applications such as data assimilation or ecosystem carbon cycling modeling. This study proved that the ANN trained with squared loss outputs the posterior mean, providing a rigorous foundation for its uncertainty quantification, regularization, and incorporation of prior information. A Bayesian theoretical framework was subsequently proposed for ANN-based methods. Using this framework, we derived a new algorithm called Uncertainty Prediction Neural Network (UpNet), which enables the simultaneous training of two ANNs to retrieve BV and provide retrieval uncertainty. To validate our method, we compared UpNet with the standard Bayesian inference method, i.e., Markov Chain Monte Carlo (MCMC), in the inversion of a widely used RTM called ProSAIL for retrieving BVs and estimating uncertainty. The results demonstrated that the BVs retrieved and the uncertainties estimated by UpNet were highly consistent with those from MCMC, achieving over a million-fold acceleration. These results indicated that UpNet has significant potential for fast retrieval and uncertainty quantification of BVs or other parameters with medium and high-resolution remote sensing data. Our Python implementation is available at: https://github.com/Dash-RSer/UpNet.

mcmc, reflectance, retrieval uncertainty, (16 more...)

arXiv.org Artificial Intelligence

2411.04556

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > New York (0.04)
Europe (0.04)

Genre: Research Report > New Finding (0.66)

Industry:

Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.65)
Food & Agriculture > Agriculture (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)

Add feedback

Conjugate gradient methods for high-dimensional GLMMs

Pandolfi, Andrea, Papaspiliopoulos, Omiros, Zanella, Giacomo

arXiv.org Machine LearningNov-7-2024

Generalized linear mixed models (GLMMs) are a widely used tool in statistical analysis. The main bottleneck of many computational approaches lies in the inversion of the high dimensional precision matrices associated with the random effects. Such matrices are typically sparse; however, the sparsity pattern resembles a multi partite random graph, which does not lend itself well to default sparse linear algebra techniques. Notably, we show that, for typical GLMMs, the Cholesky factor is dense even when the original precision is sparse. We thus turn to approximate iterative techniques, in particular to the conjugate gradient (CG) method. We combine a detailed analysis of the spectrum of said precision matrices with results from random graph theory to show that CG-based methods applied to high-dimensional GLMMs typically achieve a fixed approximation error with a total cost that scales linearly with the number of parameters and observations. Numerical illustrations with both real and simulated data confirm the theoretical findings, while at the same time illustrating situations, such as nested structures, where CG-based methods struggle.

eigenvalue, iteration, matrix, (16 more...)

arXiv.org Machine Learning

2411.04729

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.81)

Industry: Government > Voting & Elections (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback