mcmc
Variational predictive resampling
Battaglia, Laura, Cortinovis, Stefano, Holmes, Chris, Frazier, David T., Jewson, Jack
Bayesian inference provides principled uncertainty quantification, but accurate posterior sampling with MCMC can be computationally prohibitive for modern applications. Variational inference (VI) offers a scalable alternative and often yields accurate predictive distributions, but cheap variational families such as mean-field (MF) can produce over-concentrated approximations that miss posterior dependence. We propose variational predictive resampling (VPR), a scalable posterior sampling method that exploits VI's predictive strength within a predictive-resampling framework to better approximate the Bayesian posterior. Given a prior-likelihood pair, VPR repeatedly imputes future observations from the current variational predictive, updates the variational approximation after each imputation, and records the parameter value implied by the completed sample. We establish conditions under which the law of the parameter returned by VPR is well defined and show that its finite-horizon approximation converges to this limit. In a tractable Gaussian location model, we show that VPR with MF variational predictives converges to the exact Bayesian posterior, whereas the optimal MF-VI approximation retains a non-vanishing asymptotic gap. Experiments on linear regression, logistic regression, and hierarchical linear mixed-effects models demonstrate that VPR substantially improves posterior uncertainty quantification and recovers posterior dependence missed by MF-VI, while remaining computationally competitive with, and often more efficient than, MCMC.
High-dimensional Adaptive MCMC with Reduced Computational Complexity
Hird, Max, Livingstone, Samuel
We propose an adaptive MCMC method that learns a linear preconditioner which is dense in its off-diagonal elements but sparse in its parametrisation. Due to this sparsity, we achieve a per-iteration computational complexity of $O(m^2d)$ for a user-determined parameter $m$, compared with the $O(d^2)$ complexity of existing adaptive strategies that can capture correlation information from the target. Diagonal preconditioning has an $O(d)$ per-iteration complexity, but is known to fail in the case that the target distribution is highly correlated, see \citet[Section 3.5]{hird2025a}. Our preconditioner is constructed using eigeninformation from the target covariance which we infer using online principal components analysis on the MCMC chain. It is composed of a diagonal matrix and a product of carefully chosen reflection matrices. On various numerical tests we show that it outperforms diagonal preconditioning in terms of absolute performance, and that it outperforms traditional dense preconditioning and multiple diagonal plus low-rank alternatives in terms of time-normalised performance.
Markovian Flow Matching: Accelerating MCMC with Continuous Normalizing Flows
Continuous normalizing flows (CNFs) learn the probability path between a reference distribution and a target distribution by modeling the vector field generating said path using neural networks. Recently, Lipman et al. (2022) introduced a simple and inexpensive method for training CNFs in generative modeling, termed flow matching (FM). In this paper, we repurpose this method for probabilistic inference by incorporating Markovian sampling methods in evaluating the FM objective, and using the learned CNF to improve Monte Carlo sampling. Specifically, we propose an adaptive Markov chain Monte Carlo (MCMC) algorithm, which combines a local Markov transition kernel with a non-local, flow-informed transition kernel, defined using a CNF. This CNF is adapted on-the-fly using samples from the Markov chain, which are used to specify the probability path for the FM objective. Our method also includes an adaptive tempering mechanism that allows the discovery of multiple modes in the target distribution. Under mild assumptions, we establish convergence of our method to a local optimum of the FM objective.
A-NICE-MC: Adversarial Training for MCMC
Existing Markov Chain Monte Carlo (MCMC) methods are either based on general-purpose and domain-agnostic schemes, which can lead to slow convergence, or require hand-crafting of problem-specific proposals by an expert. We propose A-NICE-MC, a novel method to train flexible parametric Markov chain kernels to produce samples with desired properties. First, we propose an efficient likelihood-free adversarial training method to train a Markov chain and mimic a given data distribution. Then, we leverage flexible volume preserving flows to obtain parametric kernels for MCMC. Using a bootstrap approach, we show how to train efficient Markov Chains to sample from a prescribed posterior distribution by iteratively improving the quality of both the model and the samples. A-NICE-MC provides the first framework to automatically design efficient domain-specific MCMC proposals. Empirical results demonstrate that A-NICE-MC combines the strong guarantees of MCMC with the expressiveness of deep neural networks, and is able to significantly outperform competing methods such as Hamiltonian Monte Carlo.
Mental Sampling in Multimodal Representations
Both resources in the natural environment and concepts in a semantic space are distributed patchily, with large gaps in between the patches. To describe people's internal and external foraging behavior, various random walk models have been proposed. In particular, internal foraging has been modeled as sampling: in order to gather relevant information for making a decision, people draw samples from a mental representation using random-walk algorithms such as Markov chain Monte Carlo (MCMC). However, two common empirical observations argue against people using simple sampling algorithms such as MCMC for internal foraging. First, the distance between samples is often best described by a Levy flight distribution: the probability of the distance between two successive locations follows a power-law on the distances.
Preconditioned One-Step Generative Modeling for Bayesian Inverse Problems in Function Spaces
Cheng, Zilan, Wang, Li-Lian, Wang, Zhongjian
We propose a machine-learning algorithm for Bayesian inverse problems in the function-space regime based on one-step generative transport. Building on the Mean Flows, we learn a fully conditional amortized sampler with a neural-operator backbone that maps a reference Gaussian noise to approximate posterior samples. We show that while white-noise references may be admissible at fixed discretization, they become incompatible with the function-space limit, leading to instability in inference for Bayesian problems arising from PDEs. To address this issue, we adopt a prior-aligned anisotropic Gaussian reference distribution and establish the Lipschitz regularity of the resulting transport. Our method is not distilled from MCMC: training relies only on prior samples and simulated partial and noisy observations. Once trained, it generates a $64\times64$ posterior sample in $\sim 10^{-3}$s, avoiding the repeated PDE solves of MCMC while matching key posterior summaries.