Bayesian Inference
Improving the evaluation of samplers on multi-modal targets
Grenioux, Louis, Noble, Maxence, Gabriรฉ, Marylou
Addressing multi-modality constitutes one of the major challenges of sampling. In this reflection paper, we advocate for a more systematic evaluation of samplers towards two sources of difficulty that are mode separation and dimension. For this, we propose a synthetic experimental setting that we illustrate on a selection of samplers, focusing on the challenging criterion of recovery of the mode relative importance. These evaluations are crucial to diagnose the potential of samplers to handle multi-modality and therefore to drive progress in the field.
Accelerated Stein Variational Gradient Flow
Stein variational gradient descent (SVGD) is a kernel-based particle method for sampling from a target distribution, e.g., in generative modeling and Bayesian inference. SVGD does not require estimating the gradient of the log-density, which is called score estimation. In practice, SVGD can be slow compared to score-estimation based sampling algorithms. To design fast and efficient high-dimensional sampling algorithms, we introduce ASVGD, an accelerated SVGD, based on an accelerated gradient flow in a metric space of probability densities following Nesterov's method. We then derive a momentum-based discrete-time sampling algorithm, which evolves a set of particles deterministically. To stabilize the particles' momentum update, we also study a Wasserstein metric regularization. For the generalized bilinear kernel and the Gaussian kernel, toy numerical examples with varied target distributions demonstrate the effectiveness of ASVGD compared to SVGD and other popular sampling methods.
Stacking Variational Bayesian Monte Carlo
Silvestrin, Francesco, Li, Chengkun, Acerbi, Luigi
Variational Bayesian Monte Carlo (VBMC) is a sample-efficient method for approximate Bayesian inference with computationally expensive likelihoods. While VBMC's local surrogate approach provides stable approximations, its conservative exploration strategy and limited evaluation budget can cause it to miss regions of complex posteriors. In this work, we introduce Stacking Variational Bayesian Monte Carlo (S-VBMC), a method that constructs global posterior approximations by merging independent VBMC runs through a principled and inexpensive post-processing step. Our approach leverages VBMC's mixture posterior representation and per-component evidence estimates, requiring no additional likelihood evaluations while being naturally parallelizable. We demonstrate S-VBMC's effectiveness on two synthetic problems designed to challenge VBMC's exploration capabilities and two real-world applications from computational neuroscience, showing substantial improvements in posterior approximation quality across all cases.
Cramer-Rao Bounds for Laplacian Matrix Estimation
Halihal, Morad, Routtenberg, Tirza, Poor, H. Vincent
Abstract--In this paper, we analyze the performance of the estimation of Laplacian matrices under general observatio n models. Laplacian matrix estimation involves structural c on-straints, including symmetry and null-space properties, a long with matrix sparsity. By exploiting a linear reparametriza tion that enforces the structural constraints, we derive closed -form matrix expressions for the Cram er-Rao Bound (CRB) specifically tailored to Laplacian matrix estimation. We further extend the derivation to the sparsity-constrained case, introduc ing two oracle CRBs that incorporate prior information of the suppo rt set, i.e. the locations of the nonzero entries in the Laplaci an matrix. We examine the properties and order relations betwe en the bounds, and provide the associated Slepian-Bangs formu la for the Gaussian case. We demonstrate the use of the new CRBs in three representative applications: (i) topology identi fication in power systems, (ii) graph filter identification in diffuse d models, and (iii) precision matrix estimation in Gaussian M arkov random fields under Laplacian constraints. The CRBs are eval - uated and compared with the mean-squared-errors (MSEs) of the constrained maximum likelihood estimator (CMLE), whic h integrates both equality and inequality constraints along with sparsity constraints, and of the oracle CMLE, which knows the locations of the nonzero entries of the Laplacian matrix . We perform this analysis for the applications of power syste m topology identification and graphical LASSO, and demonstra te that the MSEs of the estimators converge to the CRB and oracle CRB, given a sufficient number of measurements. Graph-structured data and signals arise in numerous applications, including power systems, communications, finance, social networks, and biological networks, for analysis and inference of networks [ 2 ], [ 3 ]. In this context, the Laplacian matrix, which captures node connectivity and edge weights, serves as a fundamental tool for clustering [ 4 ], modeling graph diffusion processes [ 5 ], [ 6 ], topology inference [ 6 ]-[ 12 ], anomaly detection [ 13 ], graph-based filtering [ 14 ]-[ 18 ], and analyzing smoothness on graphs [ 19 ]. M. Halihal and T. Routtenberg are with the School of Electric al and Computer Engineering, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel, e-mail: moradha@post.bgu.ac.il, tirzar@b gu.ac.il.
Adaptive sparse variational approximations for Gaussian process regression
Nieman, Dennis, Szabรณ, Botond
Department of Decision Sciences, Bocconi Institute for Data Science and Analytics, Bocconi University, Milan Abstract Accurate tuning of hyperparameters is crucial to ensure that models can generalise effectively across different settings. We construct a variational approximation to a hierarchical Bayes procedure, and derive upper bounds for the contraction rate of the variational posterior in an abstract setting. The theory is applied to various Gaussian process priors and variational classes, resulting in minimax optimal rates. Our theoretical results are accompanied with numerical analysis both on synthetic and real world data sets. Keywords: variational inference, Bayesian model selection, Gaussian processes, nonparametric regression, adaptation, posterior contraction rates 1 Introduction A core challenge in Bayesian statistics is scalability, i.e. the computation of the posterior for large sample sizes. Variational Bayes approximation is a standard approach to speed up inference. Variational posteriors are random probability measures that minimise the Kullback-Leibler divergence between a suitable class of distributions and the otherwise hard to compute posterior. Typically, the variational class of distributions over which the optimisation takes place does not contain the original posterior, hence the variational procedure can be viewed as a projection onto this class. The projected variational distribution then approximates the posterior. During the approximation procedure one inevitably loses information and hence it is important to characterize the accuracy of the approach. Despite the wide use of variational approximations, their theoretical underpinning started to emerge only recently, see for instance Alquier and Ridgway (2020); Yang et al. (2020); Zhang and Gao (2020a); Ray and Szab o (2022). In a Bayesian procedure, the choice of prior reflects the presumed properties of the unknown parameter. In comparison to regular parametric models, where in view of the Bernstein-von Mises theorem the posterior is asymptotically normal, the prior plays a crucial role in the asymptotic behaviour of the posterior. In fact, the large-sample behaviour of the posterior typically depends intricately on the choice of prior hyperparam-eters, so it is vital that these are tuned correctly. The two classical approaches are hierarchical and empirical Bayes methods.
Multi-resolution Score-Based Variational Graphical Diffusion for Causal Disaster System Modeling and Inference
Li, Xuechun, Gao, Shan, Xu, Susu
Complex systems with intricate causal dependencies challenge accurate prediction. Effective modeling requires precise physical process representation, integration of interdependent factors, and incorporation of multi-resolution observational data. These systems manifest in both static scenarios with instantaneous causal chains and temporal scenarios with evolving dynamics, complicating modeling efforts. Current methods struggle to simultaneously handle varying resolutions, capture physical relationships, model causal dependencies, and incorporate temporal dynamics, especially with inconsistently sampled data from diverse sources. We introduce Temporal-SVGDM: Score-based Variational Graphical Diffusion Model for Multi-resolution observations. Our framework constructs individual SDEs for each variable at its native resolution, then couples these SDEs through a causal score mechanism where parent nodes inform child nodes' evolution. This enables unified modeling of both immediate causal effects in static scenarios and evolving dependencies in temporal scenarios. In temporal models, state representations are processed through a sequence prediction model to predict future states based on historical patterns and causal relationships. Experiments on real-world datasets demonstrate improved prediction accuracy and causal understanding compared to existing methods, with robust performance under varying levels of background knowledge. Our model exhibits graceful degradation across different disaster types, successfully handling both static earthquake scenarios and temporal hurricane and wildfire scenarios, while maintaining superior performance even with limited data.
Incorporating the ChEES Criterion into Sequential Monte Carlo Samplers
Millard, Andrew, Murphy, Joshua, Frisch, Daniel, Maskell, Simon
Markov chain Monte Carlo (MCMC) methods are a powerful but computationally expensive way of performing non-parametric Bayesian inference. MCMC proposals which utilise gradients, such as Hamiltonian Monte Carlo (HMC), can better explore the parameter space of interest if the additional hyper-parameters are chosen well. The No-U-Turn Sampler (NUTS) is a variant of HMC which is extremely effective at selecting these hyper-parameters but is slow to run and is not suited to GPU architectures. An alternative to NUTS, Change in the Estimator of the Expected Square HMC (ChEES-HMC) was shown not only to run faster than NUTS on GPU but also sample from posteriors more efficiently. Sequential Monte Carlo (SMC) samplers are another sampling method which instead output weighted samples from the posterior. They are very amenable to parallelisation and therefore being run on GPUs while having additional flexibility in their choice of proposal over MCMC. We incorporate (ChEEs-HMC) as a proposal into SMC samplers and demonstrate competitive but faster performance than NUTS on a number of tasks.
Barrier Certificates for Unknown Systems with Latent States and Polynomial Dynamics using Bayesian Inference
Lefringhausen, Robert, Hanna, Sami Leon Noel Aziz, August, Elias, Hirche, Sandra
-- Certifying safety in dynamical systems is crucial, but barrier certificates -- widely used to verify that system trajectories remain within a safe region -- typically require explicit system models. When dynamics are unknown, data-driven methods can be used instead, yet obtaining a valid certificate requires rigorous uncertainty quantification. For this purpose, existing methods usually rely on full-state measurements, limiting their applicability. This paper proposes a novel approach for synthesizing barrier certificates for unknown systems with latent states and polynomial dynamics. A Bayesian framework is employed, where a prior in state-space representation is updated using input-output data via a targeted marginal Metropolis-Hastings sampler . The resulting samples are used to construct a candidate barrier certificate through a sum-of-squares program. It is shown that if the candidate satisfies the required conditions on a test set of additional samples, it is also valid for the true, unknown system with high probability. The approach and its probabilistic guarantees are illustrated through a numerical simulation. Ensuring the safety of dynamical systems is a critical concern in applications such as human-robot interaction, autonomous driving, and medical devices, where failures can lead to severe consequences. In such scenarios, safety constraints typically mandate that the system state remains within a predefined allowable region. Barrier certificates [1] provide a systematic framework for verifying safety by establishing mathematical conditions that guarantee that system trajectories remain within these regions.
Informed Greedy Algorithm for Scalable Bayesian Network Fusion via Minimum Cut Analysis
Torrijos, Pablo, Puerta, Josรฉ M., Gรกmez, Josรฉ A., Aledo, Juan A.
This paper presents the Greedy Min-Cut Bayesian Consensus (GMCBC) algorithm for the structural fusion of Bayesian Networks (BNs). The method is designed to preserve essential dependencies while controlling network complexity. It addresses the limitations of traditional fusion approaches, which often lead to excessively complex models that are impractical for inference, reasoning, or real-world applications. As the number and size of input networks increase, this issue becomes even more pronounced. GMCBC integrates principles from flow network theory into BN fusion, adapting the Backward Equivalence Search (BES) phase of the Greedy Equivalence Search (GES) algorithm and applying the Ford-Fulkerson algorithm for minimum cut analysis. This approach removes non-essential edges, ensuring that the fused network retains key dependencies while minimizing unnecessary complexity. Experimental results on synthetic Bayesian Networks demonstrate that GMCBC achieves near-optimal network structures. In federated learning simulations, GMCBC produces a consensus network that improves structural accuracy and dependency preservation compared to the average of the input networks, resulting in a structure that better captures the real underlying (in)dependence relationships. This consensus network also maintains a similar size to the original networks, unlike unrestricted fusion methods, where network size grows exponentially.
Bayesian Predictive Coding
Tschantz, Alexander, Koudahl, Magnus, Linander, Hampus, Da Costa, Lancelot, Heins, Conor, Beck, Jeff, Buckley, Christopher
Predictive coding (PC) is an influential theory of information processing in the brain, providing a biologically plausible alternative to backpropagation. It is motivated in terms of Bayesian inference, as hidden states and parameters are optimised via gradient descent on variational free energy. However, implementations of PC rely on maximum \textit{a posteriori} (MAP) estimates of hidden states and maximum likelihood (ML) estimates of parameters, limiting their ability to quantify epistemic uncertainty. In this work, we investigate a Bayesian extension to PC that estimates a posterior distribution over network parameters. This approach, termed Bayesian Predictive coding (BPC), preserves the locality of PC and results in closed-form Hebbian weight updates. Compared to PC, our BPC algorithm converges in fewer epochs in the full-batch setting and remains competitive in the mini-batch setting. Additionally, we demonstrate that BPC offers uncertainty quantification comparable to existing methods in Bayesian deep learning, while also improving convergence properties. Together, these results suggest that BPC provides a biologically plausible method for Bayesian learning in the brain, as well as an attractive approach to uncertainty quantification in deep learning.