 Directed Networks


Bayesian inference with sources of uncertainty: from confidence modelling to sparse estimation

arXiv.org Machine Learning

We introduce a general framework that extends Bayesian inference by allowing the researcher to explicitly encode confidence in each source of uncertainty within the model. This mechanism provides a new handle for model design and regularisation control. Building on this framework, we develop a general approach for inducing sparsity in statistical models and illustrate its use in linear and logistic regression, as well as in Bayesian neural networks.


Adaptive graph-based algorithms for conditional anomaly detection and semi-supervised learning

arXiv.org Machine Learning

We develop graph-based methods for semi-supervised learning based on label propagation on a data similarity graph. When data is abundant or arrives in a stream, the problems of computation and data storage arise for any graph-based method. We propose a fast approximate online algorithm that solves for the harmonic solution on an approximate graph. We show, both empirically and theoretically, that good behavior can be achieved by collapsing nearby points into a set of local representative points that minimize distortion. Moreover, we regularize the harmonic solution to achieve better stability properties. We also present graph-based methods for detecting conditional anomalies and apply them to the identification of unusual clinical actions in hospitals. Our hypothesis is that patient-management actions that are unusual with respect to past patients may be due to errors and that it is worthwhile to raise an alert if such a condition is encountered. Conditional anomaly detection extends the standard unconditional anomaly detection framework but also faces new problems known as fringe and isolated points. We devise novel nonparametric graph-based methods to tackle these problems. Our methods rely on graph connectivity analysis and the soft harmonic solution. Finally, we conduct an extensive human evaluation study of our conditional anomaly methods by 15 experts in critical care.
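As a rough illustration of the harmonic solution mentioned above, the sketch below runs plain label propagation on a small similarity graph: each unlabeled node is repeatedly set to the weighted average of its neighbours while labeled nodes stay fixed. This is the textbook exact formulation on a full graph, not the approximate online algorithm or the regularized variant the abstract proposes. The function name `harmonic_solution`, the dense weight-matrix representation, and the toy chain graph are assumptions made for the example.

```python
def harmonic_solution(weights, labels, n_iter=10_000, tol=1e-10):
    """Iterative harmonic solution on a weighted graph (illustrative sketch).

    weights: symmetric n x n list of lists with a zero diagonal (assumed).
    labels:  dict mapping labeled node index -> label value in [0, 1].
    Returns a list of soft labels, one per node.
    """
    n = len(weights)
    # Labeled nodes keep their labels; unlabeled nodes start at 0.5.
    f = [labels.get(i, 0.5) for i in range(n)]
    for _ in range(n_iter):
        max_change = 0.0
        for i in range(n):
            if i in labels:
                continue  # labeled nodes are clamped
            degree = sum(weights[i])
            if degree == 0:
                continue  # isolated node: no neighbours to average over
            # Harmonic property: value equals the weighted mean of neighbours.
            new_value = sum(weights[i][j] * f[j] for j in range(n)) / degree
            max_change = max(max_change, abs(new_value - f[i]))
            f[i] = new_value
        if max_change < tol:
            break
    return f


# Toy example: a 5-node chain with the two end points labeled 0 and 1.
chain = [[1.0 if abs(i - j) == 1 else 0.0 for j in range(5)] for i in range(5)]
soft_labels = harmonic_solution(chain, {0: 0.0, 4: 1.0})
```

On a chain with unit weights the harmonic solution interpolates linearly between the two labeled end points, which makes the toy case easy to check by hand.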


Amortized Variational Inference for Joint Posterior and Predictive Distributions in Bayesian Uncertainty Quantification

arXiv.org Machine Learning

Bayesian predictive inference propagates parameter uncertainty to quantities of interest through the posterior-predictive distribution. In practice, this is typically performed using a two-stage procedure: first approximating the posterior distribution of model parameters, and then propagating posterior samples through the predictive model via Monte Carlo simulation. This sequential workflow can be computationally demanding, particularly for high-fidelity models such as those governed by partial differential equations. We propose a variational Bayesian framework that directly targets the posterior-predictive distribution and jointly learns variational approximations of both the posterior and the corresponding predictive distribution. The formulation introduces a variational upper bound on the Kullback-Leibler divergence together with moment-based regularization terms. The variational distributions are trained in an amortized manner, shifting computational effort to an offline stage and enabling efficient online inference. Numerical experiments ranging from analytical benchmarks to a finite-element solid mechanics problem demonstrate that the proposed method achieves more accurate predictive distributions than conventional two-stage variational inference, while substantially reducing the cost of online predictive inference.
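For orientation, the conventional two-stage workflow described above can be sketched on a toy model. The example assumes a Gaussian likelihood with known noise variance and a Gaussian prior on the mean, so stage one is an exact conjugate update rather than a variational approximation; stage two pushes posterior draws through the predictive model by Monte Carlo. It does not implement the amortized method the abstract proposes, and all function names and numbers are hypothetical.

```python
import random


def posterior_normal_mean(data, prior_mean, prior_var, noise_var):
    """Stage 1: exact posterior N(post_mean, post_var) for a Gaussian mean."""
    n = len(data)
    post_var = 1.0 / (1.0 / prior_var + n / noise_var)
    post_mean = post_var * (prior_mean / prior_var + sum(data) / noise_var)
    return post_mean, post_var


def predictive_samples(post_mean, post_var, noise_var, n_samples, rng):
    """Stage 2: Monte Carlo draws from the posterior-predictive distribution."""
    draws = []
    for _ in range(n_samples):
        theta = rng.gauss(post_mean, post_var ** 0.5)      # parameter draw
        draws.append(rng.gauss(theta, noise_var ** 0.5))   # new observation
    return draws


rng = random.Random(0)
data = [1.2, 0.8, 1.1, 0.9, 1.0]
post_mean, post_var = posterior_normal_mean(
    data, prior_mean=0.0, prior_var=10.0, noise_var=0.25
)
samples = predictive_samples(post_mean, post_var, 0.25, 20_000, rng)
```

In this conjugate case the predictive distribution is known in closed form (mean `post_mean`, variance `post_var + noise_var`), so the Monte Carlo draws can be checked against it. For an expensive forward model, every draw in stage two costs one model evaluation, which is the online cost the abstract aims to reduce.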


Conditional Diffusion Sampling

arXiv.org Machine Learning

Sampling from unnormalized multimodal distributions with limited density evaluations remains a fundamental challenge in machine learning and natural sciences. Successful approaches construct a bridge between a tractable reference and the target distribution. Parallel Tempering (PT) serves as the gold standard, while recent diffusion-based approaches offer a continuous alternative at the cost of neural training. In this work, we introduce Conditional Diffusion Sampling (CDS), a framework that combines these two paradigms. To this end, we derive Conditional Interpolants, a class of stochastic processes whose transport dynamics are governed by an exact, closed-form stochastic differential equation (SDE), requiring no neural approximation. Although these dynamics require sampling from a non-trivial initialization distribution, we show both theoretically and empirically that the cost of this initialization diminishes for sufficiently short diffusion times. CDS leverages this by a two-stage procedure: (1) PT is used to efficiently sample the initial distribution, and then (2) samples are transported via the transport SDE. This combination couples the robust global exploration of PT with efficient local transport. Experiments suggest that CDS has the potential to achieve a superior trade-off between sample quality and density evaluation cost compared to state-of-the-art samplers.


Design, Cups, and Blankets. A Free-Energy-Principle-Based Approach to Product Design

arXiv.org Machine Learning

Classical design theory treats the type of an object as a given: the designer decides in advance that this will be a cup, then optimizes its parameters. This paper argues that object type is not a presupposition but an inference, something that can be determined from physical data and functional requirements jointly. We call this problem requirement-steered interface type inference and show that it is inexpressible within existing design frameworks. This paper makes two contributions that are jointly necessary and individually incomplete. The first is the problem itself, which classical design cannot pose because it presupposes the very thing our problem seeks to determine. The second is C-DMBD, a constrained extension of the Dynamic Markov Blanket Detection algorithm, which makes requirement-steered inference computationally tractable. Drawing on the free-energy principle and active inference, established frameworks in theoretical neuroscience and Bayesian mechanics, we model a product's surface as a Markov blanket: the minimal boundary through which all causal exchange between object and environment must pass. Different blanket structures correspond to different object types; different parameterizations of the same structure correspond to different functional modes of the same type. This paper is a proof of concept and a theoretical proposal. It reframes design as inference rather than optimization, and as a relation between generative models rather than a specification of parameters.


MIRA: A Score for Conditional Distribution Accuracy and Model Comparison

arXiv.org Machine Learning

We introduce Mira, a sample-based score for assessing the accuracy of a candidate conditional distribution using only joint samples from the true data-generating process. Relying on the principle that distributions coincide if they assign equal probability mass to all regions, we derive an analytic expression for the Mira statistic, whose average defines the Mira score. This formulation further allows us to compute theoretical reference values and uncertainty estimates when the candidate distribution matches the true one. This framework enables model comparison by quantifying the alignment between the conditional distribution of a candidate model and the true data generating process. Consequently, Mira enables Bayesian model comparison through direct posterior validation, bypassing the challenging evidence computation. We demonstrate its effectiveness across several toy problems and Bayesian inference tasks.


Measuring Differences between Conditional Distributions using Kernel Embeddings

arXiv.org Machine Learning

Comparing conditional distributions is a fundamental challenge in statistics and machine learning, with applications across a wide range of domains. While proposed methods for measuring discrepancies using kernel embeddings of distributions in a reproducing kernel Hilbert space (RKHS) provide powerful non-parametric techniques, the existing literature remains fragmented and lacks a unified theoretical treatment. This paper addresses this gap by establishing a coherent framework for studying kernel-based methods to measure divergence between conditional distributions through what we refer to as conditional maximum mean discrepancy (CMMD). The CMMD consists of a family of metrics which we call levels, with three special cases each using a different type of RKHS embedding: CMMD$_0$ (conditional mean operators), CMMD$_1$ (conditional mean embeddings), and CMMD$_2$ (joint mean embeddings). We additionally introduce a general level $s$ CMMD, clarifying the required assumptions, and establishing mathematical connections between the levels through the lens of operator-based smoothing. In addition to reviewing previously proposed estimators, we introduce a novel doubly robust estimator for the CMMD that maintains consistency provided at least one of the underlying models is correctly specified. We provide numerical experiments demonstrating that the CMMD effectively captures complex conditional dependencies for statistical testing.
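As background for the kernel-embedding idea, here is a minimal sketch of the standard unbiased estimator of the squared maximum mean discrepancy between two unconditional samples with a Gaussian kernel. It is the ordinary MMD, not any of the CMMD levels or the doubly robust estimator described above. The restriction to scalar samples, the fixed bandwidth, and the function names are assumptions made for the example.

```python
import math
import random


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel between two scalars."""
    return math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2))


def mmd2_unbiased(xs, ys, bandwidth=1.0):
    """Unbiased estimate of MMD^2 between samples xs and ys.

    Within-sample terms exclude the diagonal (i == j), which is what makes
    the estimator unbiased; as a result it can be slightly negative.
    """
    m, n = len(xs), len(ys)
    k_xx = sum(
        gaussian_kernel(xs[i], xs[j], bandwidth)
        for i in range(m) for j in range(m) if i != j
    ) / (m * (m - 1))
    k_yy = sum(
        gaussian_kernel(ys[i], ys[j], bandwidth)
        for i in range(n) for j in range(n) if i != j
    ) / (n * (n - 1))
    k_xy = sum(
        gaussian_kernel(x, y, bandwidth) for x in xs for y in ys
    ) / (m * n)
    return k_xx + k_yy - 2.0 * k_xy


rng = random.Random(0)
same_a = [rng.gauss(0.0, 1.0) for _ in range(200)]
same_b = [rng.gauss(0.0, 1.0) for _ in range(200)]
shifted = [rng.gauss(3.0, 1.0) for _ in range(200)]

near = mmd2_unbiased(same_a, same_b)   # both samples from N(0, 1)
far = mmd2_unbiased(same_a, shifted)   # N(0, 1) versus N(3, 1)
```

Samples from the same distribution should give a value close to zero, while the shifted sample should give a clearly positive value. Conditional variants replace these plain sample averages with embeddings that depend on the conditioning variable.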


Online Generalised Predictive Coding

arXiv.org Machine Learning

Despite being confined within the interior darkness of the skull, the human brain possesses a remarkable ability to interpret, understand and analyse the world out there, plan for unseen futures, and make decisions that can alter the course of events. This extraordinary capability is conjectured to come from the brain's function as a predictive machine, constantly inferring the hidden causes of its sensory inputs to maintain a coherent model of its environment. This view, which dates back to Helmholtz's idea of "perception as unconscious inference" (von Helmholtz, 1866) and later evolved into the "Bayesian brain" hypothesis (Doya et al., 2007), suggests that the brain operates as a constructive statistical organ. It updates its beliefs about the external world based on incoming sensory data under a generative model (GM). The GM furnishes the brain with a structured representation that supports probabilistic beliefs over both the latent dynamical states of the external world, corresponding to the generative process (GP), and the observation mappings through which these states give rise to sensory signals. Essentially, the brain continually refines its probabilistic beliefs about both the latent states and the causal mechanisms of the world through a process of online triple estimation, jointly optimising beliefs over hidden states, model parameters, and their associated uncertainties in accordance with the principles of Bayesian inference (Eells, 2004; Parr et al., 2022). More technically, given a sensory observation $y_t$ at time $t$, perception can be formulated as an online triple estimation scheme, whose three components are: 1) online hidden state inference, 2) online parameter learning, and 3) online uncertainty estimation, all three of which are the core components of our proposed online generalised PC scheme and are elaborated in Section.


The Bayesian Reflex: Online Learning as the Autonomic Nervous System of Modern and Future AI

arXiv.org Machine Learning

This chapter introduces the Bayesian reflex -- an analogy with the autonomic nervous system -- as a unifying framework for online learning in AI. Bayesian online algorithms automatically maintain equilibrium in dynamic environments via three mechanisms: belief maintenance through probabilistic representations, sequential updating via Bayes' theorem, and uncertainty-driven action balancing exploration and exploitation. We survey online Bayesian methods, highlighting two computational principles: the look-up table principle for sequential inference in function space, and the ellipsoidal decomposition framework for nearly exact i.i.d. sampling from arbitrary posteriors. These principles are generalized across dynamic emulation, nonparametric state-space models, circular time series, inverse regression for climate model evaluation, and deep architectures via Recursive Gaussian Processes. Decision-making is explored via Thompson sampling and restless bandits. We extend the framework to assess infinite series convergence (applied to climate dynamics and the Riemann Hypothesis), model prime number distributions leading to the discovery of 184 strong Mersenne prime candidates, detect stationarity, and characterize point processes. The Bayesian reflex provides a foundational infrastructure for adaptive AI that continuously learns in a complex world.
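The three mechanisms listed above (belief maintenance, sequential Bayesian updating, and uncertainty-driven action) can be seen together in a minimal Thompson sampling loop. The sketch assumes a Bernoulli bandit with independent Beta(1, 1) priors on each arm, which is a standard textbook setting and not necessarily the one treated in the chapter. The function name and the reward rates are hypothetical.

```python
import random


def thompson_bernoulli(true_rates, n_rounds, rng):
    """Thompson sampling for a Bernoulli bandit with Beta(1, 1) priors.

    true_rates: success probability of each arm (unknown to the learner,
                used here only to simulate rewards).
    Returns per-arm success and failure counts after n_rounds pulls.
    """
    k = len(true_rates)
    successes = [0] * k
    failures = [0] * k
    for _ in range(n_rounds):
        # Belief maintenance: each arm's posterior is Beta(1 + s, 1 + f).
        sampled = [
            rng.betavariate(1 + successes[a], 1 + failures[a]) for a in range(k)
        ]
        # Uncertainty-driven action: play the arm with the largest draw.
        arm = max(range(k), key=sampled.__getitem__)
        reward = 1 if rng.random() < true_rates[arm] else 0
        # Sequential updating: the conjugate Bayes update is a count increment.
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


rng = random.Random(0)
successes, failures = thompson_bernoulli([0.2, 0.5, 0.8], 2000, rng)
pulls = [s + f for s, f in zip(successes, failures)]
```

Arms with wide posteriors occasionally produce large draws and get explored, while arms with concentrated, high posteriors are exploited. Over many rounds the pulls typically concentrate on the arm with the highest reward rate.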


Provable and scalable quantum Gaussian processes for quantum learning

arXiv.org Machine Learning

Despite rapid recent advances in quantum machine learning, the field is in many ways stuck. Existing approaches can exhibit serious limitations, and we still lack learning frameworks that are simple, interpretable, scalable, and naturally suited to quantum data. To address this, here we introduce quantum Gaussian processes, a Bayesian framework for learning from quantum systems through priors over unknown quantum transformations. We show that, under suitable conditions, unitary quantum stochastic processes define Gaussian processes, thereby enabling regression, classification, and Bayesian optimization directly on quantum data. The key ingredient in this framework is sufficient knowledge of a quantum process's structure and symmetries to define an informative prior through its corresponding quantum kernel, effectively injecting a strong, physics-informed inductive bias into the learning model. We then prove that matchgate, or free-fermionic, evolutions give rise to provable and scalable quantum Gaussian processes, providing the first family in our framework where the unknown unitary acts non-trivially on all qubits. Finally, we demonstrate accurate long-range extrapolation, phase-diagram learning in many-body systems, and sample-efficient Bayesian optimization in a quantum sensing task. Our results identify quantum Gaussian processes as a promising route toward simpler and more structured forms of quantum learning.