AITopics | Bayesian Inference

Collaborating Authors

Bayesian Inference

Bayes' Theorem allows a program to infer the probabilities of likely causes from the probabilities of their effects, when what it is given are the probabilities of effects, given the causes.

News Overviews Instructional Materials AI-Alerts Classics

Posterior Meta-Replay for Continual Learning

Neural Information Processing SystemsOct-11-2024, 06:20:34 GMT

In principle, Bayesian learning directly applies to this setting, since recursive and one-off Bayesian updates yield the same result. In practice, however, recursive updating often leads to poor trade-off solutions across tasks because approximate inference is necessary for most models of interest. Here, we describe an alternative Bayesian approach where task-conditioned parameter distributions are continually inferred from data. We offer a practical deep learning implementation of our framework based on probabilistic task-conditioned hypernetworks, an approach we term posterior meta-replay. Experiments on standard benchmarks show that our probabilistic hypernetworks compress sequences of posterior parameter distributions with virtually no forgetting.

continual learning, parameter distribution, posterior meta-replay, (2 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Projected Stein Variational Newton: A Fast and Scalable Bayesian Inference Method in High Dimensions

Neural Information Processing SystemsOct-11-2024, 05:25:16 GMT

We propose a projected Stein variational Newton (pSVN) method for high-dimensional Bayesian inference. To address the curse of dimensionality, we exploit the intrinsic low-dimensional geometric structure of the posterior distribution in the high-dimensional parameter space via its Hessian (of the log posterior) operator and perform a parallel update of the parameter samples projected into a low-dimensional subspace by an SVN method. The subspace is adaptively constructed using the eigenvectors of the averaged Hessian at the current samples. We demonstrate fast convergence of the proposed method, complexity independent of the parameter and sample dimensions, and parallel scalability.

high dimension, scalable bayesian inference method, stein variational newton, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)

Add feedback

Decentralized Langevin Dynamics for Bayesian Learning

Neural Information Processing SystemsOct-11-2024, 04:39:34 GMT

Motivated by decentralized approaches to machine learning, we propose a collaborative Bayesian learning algorithm taking the form of decentralized Langevin dynamics in a non-convex setting. Our analysis show that the initial KL-divergence between the Markov Chain and the target posterior distribution is exponentially decreasing while the error contributions to the overall KL-divergence from the additive noise is decreasing in polynomial time. We further show that the polynomial-term experiences speed-up with number of agents and provide sufficient conditions on the time-varying step-sizes to guarantee convergence to the desired distribution. The performance of the proposed algorithm is evaluated on a wide variety of machine learning tasks. The empirical results show that the performance of individual agents with locally available data is on par with the centralized setting with considerable improvement in the convergence rate.

bayesian learning, decentralized langevin dynamic, kl-divergence, (1 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.66)

Add feedback

D-VAE: A Variational Autoencoder for Directed Acyclic Graphs

Neural Information Processing SystemsOct-11-2024, 02:58:41 GMT

Graph structured data are abundant in the real world. Among different graph types, directed acyclic graphs (DAGs) are of particular interest to machine learning researchers, as many machine learning models are realized as computations on DAGs, including neural networks and Bayesian networks. In this paper, we study deep generative models for DAGs, and propose a novel DAG variational autoencoder (D-VAE). We propose an asynchronous message passing scheme that allows encoding the computations on DAGs, rather than using existing simultaneous message passing schemes to encode local graph structures. We demonstrate the effectiveness of our proposed DVAE through two tasks: neural architecture search and Bayesian network structure learning.

dag, directed acyclic graph, variational autoencoder, (4 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.91)

Add feedback

Adaptive Experimental Design with Temporal Interference: A Maximum Likelihood Approach

Neural Information Processing SystemsOct-11-2024, 02:11:10 GMT

Suppose an online platform wants to compare a treatment and control policy (e.g., two different matching algorithms in a ridesharing system, or two different inventory management algorithms in an online retail site). Standard experimental approaches to this problem are biased (due to temporal interference between the policies), and not sample efficient. We study optimal experimental design for this setting. We view testing the two policies as the problem of estimating the steady state difference in reward between two unknown Markov chains (i.e., policies). We assume estimation of the steady state reward for each chain proceeds via nonparametric maximum likelihood, and search for consistent (i.e., asymptotically unbiased) experimental designs that are efficient (i.e., asymptotically minimum variance).

adaptive experimental design, maximum likelihood approach, temporal interference, (2 more...)

Neural Information Processing Systems

Genre: Research Report (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.65)

Add feedback

Data Augmentation MCMC for Bayesian Inference from Privatized Data

Neural Information Processing SystemsOct-11-2024, 01:45:17 GMT

Differentially private mechanisms protect privacy by introducing additional randomness into the data. Restricting access to only the privatized data makes it challenging to perform valid statistical inference on parameters underlying the confidential data. Specifically, the likelihood function of the privatized data requires integrating over the large space of confidential databases and is typically intractable. For Bayesian analysis, this results in a posterior distribution that is doubly intractable, rendering traditional MCMC techniques inapplicable. We propose an MCMC framework to perform Bayesian inference from the privatized data, which is applicable to a wide range of statistical models and privacy mechanisms.

bayesian inference, confidential data, data augmentation mcmc, (2 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.09)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.65)

Add feedback

Scalable Structure Learning of Continuous-Time Bayesian Networks from Incomplete Data

Neural Information Processing SystemsOct-11-2024, 01:22:07 GMT

Continuous-time Bayesian Networks (CTBNs) represent a compact yet powerful framework for understanding multivariate time-series data. Given complete data, parameters and structure can be estimated efficiently in closed-form. However, if data is incomplete, the latent states of the CTBN have to be estimated by laboriously simulating the intractable dynamics of the assumed CTBN. This is a problem, especially for structure learning tasks, where this has to be done for each element of a super-exponentially growing set of possible structures. In order to circumvent this notorious bottleneck, we develop a novel gradient-based approach to structure learning.

continuous-time bayesian network, incomplete data, scalable structure learning, (2 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.65)

Add feedback

Spike and slab variational Bayes for high dimensional logistic regression

Neural Information Processing SystemsOct-11-2024, 00:29:47 GMT

Variational Bayes (VB) is a popular scalable alternative to Markov chain Monte Carlo for Bayesian inference. We study a mean-field spike and slab VB approximation of widely used Bayesian model selection priors in sparse high-dimensional logistic regression. We provide non-asymptotic theoretical guarantees for the VB posterior in both \ell_2 and prediction loss for a sparse truth, giving optimal (minimax) convergence rates. Since the VB algorithm does not depend on the unknown truth to achieve optimality, our results shed light on effective prior choices. We confirm the improved performance of our VB algorithm over common sparse VB approaches in a numerical study.

high dimensional logistic regression, spike and slab variational bayes, vb algorithm

Neural Information Processing Systems

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)

Add feedback

Scalable Signature-Based Distribution Regression via Reference Sets

Alden, Andrew, Ventre, Carmine, Horvath, Blanka

arXiv.org Machine LearningOct-11-2024

Distribution Regression (DR) on stochastic processes describes the learning task of regression on collections of time series. Path signatures, a technique prevalent in stochastic analysis, have been used to solve the DR problem. Recent works have demonstrated the ability of such solutions to leverage the information encoded in paths via signature-based features. However, current state of the art DR solutions are memory intensive and incur a high computation cost. This leads to a trade-off between path length and the number of paths considered. This computational bottleneck limits the application to small sample sizes which consequently introduces estimation uncertainty. In this paper, we present a methodology for addressing the above issues; resolving estimation uncertainties whilst also proposing a pipeline that enables us to use DR for a wide variety of learning tasks. Integral to our approach is our novel distance approximator. This allows us to seamlessly apply our methodology across different application domains, sampling rates, and stochastic process dimensions. We show that our model performs well in applications related to estimation theory, quantitative finance, and physical sciences. We demonstrate that our model generalises well, not only to unseen data within a given distribution, but also under unseen regimes (unseen classes of stochastic models).

artificial intelligence, machine learning, stochastic process, (16 more...)

arXiv.org Machine Learning

2410.09196

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(6 more...)

Genre: Research Report (1.00)

Industry: Banking & Finance > Trading (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.96)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Add feedback

Deterministic Fokker-Planck Transport -- With Applications to Sampling, Variational Inference, Kernel Mean Embeddings & Sequential Monte Carlo

Klebanov, Ilja

arXiv.org Machine LearningOct-11-2024

The Fokker-Planck equation can be reformulated as a continuity equation, which naturally suggests using the associated velocity field in particle flow methods. While the resulting probability flow ODE offers appealing properties - such as defining a gradient flow of the Kullback-Leibler divergence between the current and target densities with respect to the 2-Wasserstein distance - it relies on evaluating the current probability density, which is intractable in most practical applications. By closely examining the drawbacks of approximating this density via kernel density estimation, we uncover opportunities to turn these limitations into advantages in contexts such as variational inference, kernel mean embeddings, and sequential Monte Carlo.

artificial intelligence, bayesian inference, tar, (13 more...)

arXiv.org Machine Learning

2410.18993

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(15 more...)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Add feedback