Girolami, Mark
Probabilistic Super-Resolution for High-Fidelity Physical System Simulations with Uncertainty Quantification
Zhang, Pengyu, Duffin, Connor, Glyn-Davies, Alex, Vadeboncoeur, Arnaud, Girolami, Mark
A long-standing challenge in the engineering sciences is the accurate modelling of physical systems, most notably those described by partial differential equations (PDEs). High-resolution simulations are critical in fields such as automotive and structural engineering [1, 2], where precise modelling of subtle physical behaviours is essential to inform engineering decisions. However, repeated evaluation of high-fidelity simulations using traditional numerical solvers, such as the Finite Element Method (FEM), incurs high computational costs and significant time requirements. This limitation poses challenges in applications like optimal design, where iterative simulations across varied parameter sets are necessary to reach optimal configurations, making the process both slow and resource-intensive [3]. With the growing reliance on simulation-based predictions, ensuring computational efficiency alongside accuracy in high-fidelity simulations is paramount. To address some of these challenges, researchers have proposed using super-resolution (SR) techniques from the field of computer vision [4, 5] to learn a mapping from low-resolution (LR) images to high-resolution (HR) images.
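To make the LR-to-HR mapping concrete, here is a minimal sketch (not the paper's architecture) of a convolutional super-resolution network that upsamples a coarse solution field to a finer grid, in the spirit of image SR. The layer sizes and the 4x upscaling factor are illustrative assumptions.

```python
import torch
import torch.nn as nn

class FieldSuperResolver(nn.Module):
    """Maps a coarse (low-resolution) field to a refined (high-resolution) one."""
    def __init__(self, channels=1, hidden=64, scale=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Upsample(scale_factor=scale, mode="bilinear", align_corners=False),
            nn.Conv2d(hidden, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(hidden, channels, kernel_size=3, padding=1),
        )

    def forward(self, coarse_field):
        # coarse_field: (batch, channels, H, W) low-resolution solution
        return self.net(coarse_field)

model = FieldSuperResolver()
lr_field = torch.randn(8, 1, 16, 16)   # stand-in for coarse FEM solutions
hr_field = model(lr_field)             # (8, 1, 64, 64) refined fields
```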
Generating Origin-Destination Matrices in Neural Spatial Interaction Models
Zachos, Ioannis, Girolami, Mark, Damoulas, Theodoros
Agent-based models (ABMs) are proliferating as decision-making tools across policy areas in transportation, economics, and epidemiology. In these models, a central object of interest is the discrete origin-destination matrix which captures spatial interactions and agent trip counts between locations. Existing approaches resort to continuous approximations of this matrix and subsequent ad-hoc discretisations in order to perform ABM simulation and calibration. This impedes conditioning on partially observed summary statistics, fails to explore the multimodal matrix distribution over a discrete combinatorial support, and incurs discretisation errors. To address these challenges, we introduce a computationally efficient framework that scales linearly with the number of origin-destination pairs, operates directly on the discrete combinatorial space, and learns the agents' trip intensity through a neural differential equation that embeds spatial interactions. Our approach outperforms the prior art in terms of reconstruction error and ground truth matrix coverage, at a fraction of the computational cost. We demonstrate these benefits in large-scale spatial mobility ABMs in Cambridge, UK and Washington, DC, USA.
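As an illustrative sketch only: a gravity-style trip intensity over origin-destination pairs, with a discrete OD matrix drawn from a Poisson model on the combinatorial support. The paper learns the intensity with a neural differential equation; a fixed closed form stands in for it here, and all sizes and parameters are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_origins, n_destinations = 5, 7
origin_mass = rng.uniform(1.0, 5.0, n_origins)             # e.g. residents
attraction = rng.uniform(1.0, 5.0, n_destinations)         # e.g. jobs
cost = rng.uniform(0.5, 3.0, (n_origins, n_destinations))  # travel costs
beta = 1.0                                                 # cost sensitivity

# Spatial interaction intensity: mass * attraction * exp(-beta * cost),
# one value per OD pair, so the cost scales linearly in the pairs.
intensity = np.outer(origin_mass, attraction) * np.exp(-beta * cost)

# Discrete trip counts: one Poisson draw per origin-destination pair.
od_matrix = rng.poisson(intensity)
print(od_matrix.sum(), "trips across", od_matrix.size, "OD pairs")
```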
Efficient Prior Calibration From Indirect Data
Akyildiz, O. Deniz, Girolami, Mark, Stuart, Andrew M., Vadeboncoeur, Arnaud
Bayesian inversion is central to the quantification of uncertainty within problems arising from numerous applications in science and engineering. To formulate the approach, four ingredients are required: a forward model mapping the unknown parameter to an element of a solution space, often the solution space for a differential equation; an observation operator mapping an element of the solution space to the data space; a noise model describing how noise pollutes the observations; and a prior model describing knowledge about the unknown parameter before the data is acquired. This paper is concerned with learning the prior model from data; in particular, learning the prior from multiple realizations of indirect data obtained through the noisy observation process. The prior is represented, using a generative model, as the pushforward of a Gaussian in a latent space; the pushforward map is learned by minimizing an appropriate loss function. To make the methodology implementable, the loss function for the pushforward map is defined using a metric that is well-defined under empirical approximation. Furthermore, an efficient residual-based neural operator approximation of the forward model is proposed and it is shown that this may be learned concurrently with the pushforward map, using a bilevel optimization formulation of the problem; this use of neural operator approximation has the potential to make prior learning from indirect data more computationally efficient, especially when the observation process is expensive, non-smooth, or not known. The ideas are illustrated with the Darcy flow inverse problem of finding permeability from piezometric head measurements.
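A minimal sketch of the pushforward construction, under stated assumptions: G is a small MLP standing in for the learned pushforward map, a fixed linear map stands in for the PDE solve plus observation operator, and a Gaussian-kernel MMD stands in for the paper's empirically well-defined loss metric.

```python
import torch
import torch.nn as nn

latent_dim, param_dim, obs_dim = 4, 16, 8
G = nn.Sequential(nn.Linear(latent_dim, 32), nn.Tanh(), nn.Linear(32, param_dim))
forward_op = torch.randn(param_dim, obs_dim) / param_dim ** 0.5  # placeholder forward model + observation operator

def mmd_loss(x, y, ls=1.0):
    # Biased squared MMD between samples x, y under a Gaussian kernel.
    def k(a, b):
        sq = (a * a).sum(-1, keepdim=True) - 2 * a @ b.T + (b * b).sum(-1)
        return torch.exp(-sq / (2 * ls ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

data = torch.randn(256, obs_dim)          # multiple realizations of indirect data
opt = torch.optim.Adam(G.parameters(), lr=1e-3)
for _ in range(500):
    z = torch.randn(256, latent_dim)      # Gaussian in the latent space
    obs = G(z) @ forward_op               # pushforward, then observe
    obs = obs + 0.1 * torch.randn_like(obs)   # noise model
    loss = mmd_loss(obs, data)            # match simulated to observed data
    opt.zero_grad(); loss.backward(); opt.step()
```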
Towards Multilevel Modelling of Train Passing Events on the Staffordshire Bridge
Bull, Lawrence A., Jeon, Chiho, Girolami, Mark, Duncan, Andrew, Schooling, Jennifer, Haro, Miguel Bravo
Since bridges constitute critical infrastructure within modern transportation networks, it is vital that we develop appropriate statistical models to represent and extract valuable insights from the large datasets collected when monitoring them. The process of monitoring engineered systems via streaming data is typically referred to as Structural Health Monitoring (SHM), and while successful applications have been emerging in recent years, a number of challenges remain for practical implementation [5]. During model design, these concerns usually centre on low-variance data: that is, measurements are not available for the entire range of expected operational, environmental, and damage conditions. Consider a bridge following construction: it will have a relatively small dataset that should only be associated with normal operation. On the other hand, a structure with historical data might still not have experienced low-probability events, such as extreme weather or landslides. An obvious solution considers sharing data (or information) between structures; this has been the focus of a large body of recent work [6-8].
Targeted Separation and Convergence with Kernel Discrepancies
Barp, Alessandro, Simon-Gabriel, Carl-Johann, Girolami, Mark, Mackey, Lester
Maximum mean discrepancies (MMDs) like the kernel Stein discrepancy (KSD) have grown central to a wide range of applications, including hypothesis testing, sampler selection, distribution approximation, and variational inference. In each setting, these kernel-based discrepancy measures are required to (i) separate a target P from other probability measures or even (ii) control weak convergence to P. In this article we derive new sufficient and necessary conditions to ensure (i) and (ii). For MMDs on separable metric spaces, we characterize those kernels that separate Bochner embeddable measures and introduce simple conditions for separating all measures with unbounded kernels and for controlling convergence with bounded kernels. We use these results on $\mathbb{R}^d$ to substantially broaden the known conditions for KSD separation and convergence control and to develop the first KSDs known to exactly metrize weak convergence to P. Along the way, we highlight the implications of our results for hypothesis testing, measuring and improving sample quality, and sampling with Stein variational gradient descent.
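A small numerical illustration (separate from the paper's theory): the biased empirical estimate of the squared maximum mean discrepancy under a Gaussian (RBF) kernel, a bounded kernel of the kind discussed for convergence control. Sample sizes and lengthscale are arbitrary choices.

```python
import numpy as np

def mmd_squared(x, y, lengthscale=1.0):
    """Biased empirical MMD^2 with k(a, b) = exp(-||a - b||^2 / (2 l^2))."""
    def gram(a, b):
        sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-sq / (2 * lengthscale ** 2))
    return gram(x, x).mean() + gram(y, y).mean() - 2 * gram(x, y).mean()

rng = np.random.default_rng(1)
p1 = rng.normal(0.0, 1.0, (400, 2))   # draws from the target P
p2 = rng.normal(0.0, 1.0, (400, 2))   # independent draws from P
q = rng.normal(0.5, 1.0, (400, 2))    # draws from a shifted measure Q
print(mmd_squared(p1, p2))  # small: two samples from the same distribution
print(mmd_squared(p1, q))   # larger: the discrepancy separates P and Q
```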
Improving embedding of graphs with missing data by soft manifolds
Marinoni, Andrea, Lio', Pietro, Barp, Alessandro, Jutten, Christian, Girolami, Mark
Embedding graphs in continuous spaces is a key factor in designing and developing algorithms for automatic information extraction, to be applied in diverse tasks (e.g., learning, inferring, predicting). The reliability of graph embeddings directly depends on how well the geometry of the continuous space matches the graph structure. Manifolds are mathematical structures whose topological spaces can incorporate graph characteristics, in particular distances between nodes. State-of-the-art manifold-based graph embedding algorithms exploit the assumption that the projection onto a tangent space at each point of the manifold (corresponding to a node in the graph) locally resembles a Euclidean space. Although this condition helps in achieving efficient analytical solutions to the embedding problem, it is not an adequate set-up for modern real-life graphs, which are characterized by weighted connections across nodes, often computed over sparse datasets with missing records. In this work, we introduce a new class of manifolds, named soft manifolds, that can handle this situation. In particular, soft manifolds are mathematical structures with spherical symmetry in which the tangent spaces at each point are hypocycloids whose shape is defined according to the velocity of information propagation across the data points. Using soft manifolds for graph embedding, we can provide continuous spaces for pursuing any data analysis task over complex datasets. Experimental results on reconstruction tasks with synthetic and real datasets show how the proposed approach enables more accurate and reliable characterization of graphs in continuous spaces with respect to the state of the art.
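For context, here is a generic illustration of the embedding objective the abstract builds on (matching observed graph distances in a continuous space), not the soft manifold construction itself: stress minimisation over observed node pairs only, with missing records masked out. All sizes and the 60% observation rate are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n, dim = 20, 2
truth = rng.normal(size=(n, dim))
target = np.linalg.norm(truth[:, None] - truth[None, :], axis=-1)
mask = np.triu(rng.uniform(size=(n, n)) < 0.6, k=1)
mask = mask | mask.T                          # symmetric set of observed pairs

emb = rng.normal(scale=0.1, size=(n, dim))
lr = 0.01
for _ in range(3000):
    diff = emb[:, None] - emb[None, :]
    d = np.linalg.norm(diff, axis=-1) + 1e-9
    resid = np.where(mask, d - target, 0.0)   # stress on observed pairs only
    grad = 2 * ((resid / d)[:, :, None] * diff).sum(axis=1)  # proportional to the stress gradient
    emb -= lr * grad
print("final masked stress:", (resid ** 2).sum())
```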
Riemannian Laplace Approximation with the Fisher Metric
Yu, Hanlin, Hartmann, Marcelo, Williams, Bernardo, Girolami, Mark, Klami, Arto
Laplace's method approximates a target density with a Gaussian distribution at its mode. It is computationally efficient and asymptotically exact for Bayesian inference due to the Bernstein-von Mises theorem, but for complex targets and finite-data posteriors it is often too crude an approximation. A recent generalization of the Laplace approximation transforms the Gaussian approximation according to a chosen Riemannian geometry, providing a richer approximation family while still retaining computational efficiency. However, as shown here, its properties depend heavily on the chosen metric; indeed, the metric adopted in previous work results in approximations that are overly narrow and biased even in the limit of infinite data. We correct this shortcoming by developing the approximation family further, deriving two alternative variants that are exact in the limit of infinite data, extending the theoretical analysis of the method, and demonstrating practical improvements in a range of experiments.
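The baseline that the Riemannian variants generalise is the standard Laplace approximation: a Gaussian centred at the mode, with covariance given by the inverse of the negative log-density Hessian. A minimal sketch follows; the curved toy target is an illustrative assumption.

```python
import torch

def log_target(x):
    # An anisotropic, banana-shaped toy posterior (assumed for illustration).
    return -0.5 * (x[0] ** 2 + 4.0 * (x[1] - x[0] ** 2) ** 2)

# 1) Find the mode by gradient ascent on the log-density.
x = torch.tensor([1.0, -1.0], requires_grad=True)
opt = torch.optim.Adam([x], lr=0.05)
for _ in range(1000):
    loss = -log_target(x)
    opt.zero_grad(); loss.backward(); opt.step()

# 2) Gaussian with covariance = inverse negative Hessian at the mode.
mode = x.detach()
H = torch.autograd.functional.hessian(log_target, mode)
cov = torch.linalg.inv(-H)
laplace = torch.distributions.MultivariateNormal(mode, covariance_matrix=cov)
samples = laplace.sample((1000,))  # too crude where the target is strongly curved
```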
Inferring networks from time series: a neural approach
Gaskin, Thomas, Pavliotis, Grigorios A., Girolami, Mark
Network structures underlie the dynamics of many complex phenomena, from gene regulation and food webs to power grids and social media. Yet, as they often cannot be observed directly, their connectivities must be inferred from observations of the dynamics to which they give rise. In this work we present a powerful computational method to infer large network adjacency matrices from time series data using a neural network, providing uncertainty quantification on the prediction in a manner that reflects both the degree to which the inference problem is underdetermined and the noise on the data. This is a feature that other approaches have hitherto lacked. We demonstrate our method's capabilities by inferring line failure locations in the British power grid from its response to a power cut, providing probability densities on each edge and allowing the use of hypothesis testing to make meaningful probabilistic statements about the location of the cut. Our method is significantly more accurate than both Markov chain Monte Carlo sampling and least squares regression on noisy data and when the problem is underdetermined, and it naturally extends to the case of non-linear dynamics, which we demonstrate by learning an entire cost matrix for a non-linear model of economic activity in Greater London. Not having been specifically engineered for network inference, this method in fact represents a general parameter estimation scheme that is applicable to any high-dimensional parameter space.
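A stripped-down version of the estimation problem (without the paper's neural parametrisation and uncertainty quantification): recover an adjacency matrix A from time series generated by linear dynamics dx/dt = -Ax, by gradient descent on the one-step prediction error. Sizes, noise level, and dynamics are illustrative assumptions.

```python
import torch

torch.manual_seed(0)
n, dt = 10, 0.01
A_true = (torch.rand(n, n) < 0.2).float() * torch.rand(n, n) * 0.3
A_true += torch.eye(n)                      # keep the dynamics stable

trajectories = []
for _ in range(20):                         # several noisy excitations
    x = torch.randn(n)
    traj = [x]
    for _ in range(30):
        x = x + dt * (-A_true @ x) + 1e-3 * torch.randn(n)
        traj.append(x)
    trajectories.append(torch.stack(traj))
data = torch.stack(trajectories)            # (20, 31, n)

A = torch.zeros(n, n, requires_grad=True)
opt = torch.optim.Adam([A], lr=0.05)
for _ in range(2000):
    pred = data[:, :-1] + dt * (data[:, :-1] @ (-A).T)  # Euler one-step prediction
    loss = ((pred - data[:, 1:]) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
print("mean absolute error in A:", (A - A_true).abs().mean().item())
```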
Interacting Particle Langevin Algorithm for Maximum Marginal Likelihood Estimation
Akyildiz, O. Deniz, Crucinio, Francesca Romana, Girolami, Mark, Johnston, Tim, Sabanis, Sotirios
We develop a class of interacting particle systems for implementing a maximum marginal likelihood estimation (MMLE) procedure to estimate the parameters of a latent variable model. We achieve this by formulating a continuous-time interacting particle system which can be seen as a Langevin diffusion over an extended state space of parameters and latent variables. In particular, we prove that the parameter marginal of the stationary measure of this diffusion has the form of a Gibbs measure in which the number of particles acts as the inverse temperature parameter, as in classical settings for global optimisation. Using a particular rescaling, we then prove geometric ergodicity of this system and bound the discretisation error in a manner that is uniform in time and does not increase with the number of particles. The discretisation results in an algorithm, termed the Interacting Particle Langevin Algorithm (IPLA), which can be used for MMLE. We further prove nonasymptotic bounds for the optimisation error of our estimator in terms of key parameters of the problem, and extend this result to the case of stochastic gradients, covering practical scenarios. We provide numerical experiments to illustrate the empirical behaviour of our algorithm in the context of logistic regression with verifiable assumptions. Our setting provides a straightforward way to implement a diffusion-based optimisation routine compared to more classical approaches such as the Expectation Maximisation (EM) algorithm, and allows for especially explicit nonasymptotic bounds.
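A toy instance of the interacting particle system sketched from the abstract: N particles follow Langevin dynamics in the latent space while the parameter follows the particle-averaged gradient with 1/sqrt(N)-scaled noise. The Gaussian latent variable model (y = x + noise, x ~ N(theta, 1)) is an illustrative assumption, chosen because its MMLE is the sample mean and so the output is easy to check.

```python
import numpy as np

rng = np.random.default_rng(3)
theta_true, sigma = 2.0, 0.5
y = rng.normal(theta_true, np.sqrt(1 + sigma ** 2), size=100)  # marginal of y

N, gamma, steps = 50, 0.005, 5000
X = rng.normal(size=(N, y.size))     # one latent per datum, per particle
theta = 0.0
for _ in range(steps):
    grad_x = -(X - theta) - (X - y) / sigma ** 2      # d/dx log p(theta, x, y)
    grad_theta = (X - theta).sum() / N                # particle-averaged d/dtheta
    X = X + gamma * grad_x + np.sqrt(2 * gamma) * rng.normal(size=X.shape)
    theta = theta + gamma * grad_theta + np.sqrt(2 * gamma / N) * rng.normal()
print("IPLA estimate:", theta, "  MMLE (= sample mean):", y.mean())
```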
Warped geometric information on the optimisation of Euclidean functions
Hartmann, Marcelo, Williams, Bernardo, Yu, Hanlin, Girolami, Mark, Barp, Alessandro, Klami, Arto
We consider the fundamental task of optimising a real-valued function defined on a potentially high-dimensional Euclidean space, such as the loss function in many machine-learning tasks or the logarithm of the probability distribution in statistical inference. We use notions from warped Riemannian geometry to recast the optimisation of a function on Euclidean space as optimisation on a Riemannian manifold with a warped metric, and then find the function's optimum along this manifold. The warped metric chosen for the search domain induces a computationally friendly metric tensor for which the optimal search directions, which are associated with geodesic curves on the manifold, become easier to compute. Performing optimisation along geodesics is generally infeasible, yet we show that on this specific manifold we can analytically derive Taylor approximations of the geodesics up to third order. In general these approximations will not lie on the manifold, so we construct suitable retraction maps to pull them back onto it. We can therefore optimise efficiently along the approximate geodesic curves. We cover the related theory, describe a practical optimisation algorithm, and empirically evaluate it on a collection of challenging optimisation benchmarks. Our proposed algorithm, using a third-order approximation of geodesics, outperforms standard Euclidean gradient-based counterparts, as well as an alternative Hessian-based optimisation routine, in terms of the number of iterations until convergence.
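A first-order flavour of the idea (the paper derives third-order geodesic approximations with retractions): gradient descent preconditioned by a warped metric of the assumed form G(x) = I + grad f(x) grad f(x)^T, whose action on the gradient has a closed form via the Sherman-Morrison identity. The test function and step size are illustrative choices, not the paper's benchmarks.

```python
import numpy as np

def f(x):                     # an ill-conditioned quadratic bowl
    return 0.5 * (x[0] ** 2 + 100.0 * x[1] ** 2)

def grad_f(x):
    return np.array([x[0], 100.0 * x[1]])

x, lr = np.array([2.0, 0.5]), 0.015
for _ in range(5000):
    g = grad_f(x)
    # (I + g g^T)^{-1} g = g / (1 + ||g||^2) by Sherman-Morrison, so the
    # warped-metric search direction is a damped, rescaled gradient.
    x = x - lr * g / (1.0 + g @ g)
print(x, f(x))                # x approaches the optimum at the origin
```

The damping factor 1 / (1 + ||g||^2) shrinks steps where the gradient is large and recovers plain gradient descent near the optimum, which is one way a gradient-dependent warped metric reshapes the search.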