 Bayesian Inference


A Bayesian Framework for learning governing Partial Differential Equation from Data

arXiv.org Artificial Intelligence

The discovery of partial differential equations (PDEs) is a challenging task that involves both theoretical and empirical methods. Machine learning approaches have been developed for this problem; however, existing methods often struggle to identify the underlying equation accurately in the presence of noise. In this study, we present a new approach to discovering PDEs that combines variational Bayes and sparse linear regression. We pose PDE discovery as the problem of selecting the relevant basis functions from a predefined dictionary of basis functions. To accelerate the overall process, a variational Bayes-based approach for discovering partial differential equations is proposed. To ensure sparsity, we employ a spike-and-slab prior. We illustrate the efficacy of our strategy on several examples, including the Burgers, Korteweg-de Vries, Kuramoto-Sivashinsky, wave, and heat equations (1D as well as 2D). Our method offers a promising avenue for discovering PDEs from data and has potential applications in fields such as physics, engineering, and biology.
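As a rough illustration of the setup this abstract describes, the dictionary regression and a spike-and-slab prior are commonly written as follows. The symbols $\Theta$, $\theta$, $z_k$, $p_0$, $\sigma^2$, and $\sigma_s^2$ are assumptions made for this sketch; they are not notation taken from the paper, whose exact prior may differ.

```latex
\begin{align}
u_t &= \Theta(u)\,\theta + \varepsilon, & \varepsilon &\sim \mathcal{N}(0, \sigma^2 I), \\
\theta_k \mid z_k &\sim (1 - z_k)\,\delta_0 + z_k\,\mathcal{N}(0, \sigma_s^2), & z_k &\sim \mathrm{Bernoulli}(p_0).
\end{align}
```

Here each column of $\Theta(u)$ is one candidate term (for example $u$, $u_x$, $u u_x$, $u_{xx}$), and $z_k = 0$ switches the $k$-th term off exactly, which is what produces a sparse equation.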


Meta-learning Control Variates: Variance Reduction with Limited Data

arXiv.org Artificial Intelligence

Control variates can be a powerful tool to reduce the variance of Monte Carlo estimators, but constructing effective control variates can be challenging when the number of samples is small. In this paper, we show that when a large number of related integrals need to be computed, it is possible to leverage the similarity between these integration tasks to improve performance even when the number of samples per task is very small. Our approach, called meta learning CVs (Meta-CVs), can be used for up to hundreds or thousands of tasks. Our empirical assessment indicates that Meta-CVs can lead to significant variance reduction in such settings, and our theoretical analysis establishes general conditions under which Meta-CVs can be successfully trained.
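For readers unfamiliar with the underlying idea, here is a minimal sketch of a classical single-task control variate. It is not the Meta-CV method from the abstract. The integrand `exp(U)`, the control variate `U`, and the function name `cv_estimate` are all assumptions chosen for illustration.

```python
import math
import random
import statistics


def cv_estimate(n, seed=0):
    """Estimate E[exp(U)], U ~ Uniform(0, 1), with and without a control variate.

    Returns (plain Monte Carlo estimate, control-variate estimate).
    """
    rng = random.Random(seed)
    u = [rng.random() for _ in range(n)]
    f = [math.exp(x) for x in u]  # integrand samples
    g = u                         # control variate; its mean 0.5 is known exactly
    f_bar = statistics.fmean(f)
    g_bar = statistics.fmean(g)
    # Estimated optimal coefficient beta = Cov(f, g) / Var(g).
    cov = sum((a - f_bar) * (b - g_bar) for a, b in zip(f, g)) / (n - 1)
    beta = cov / statistics.variance(g)
    # Subtract the scaled, mean-zero correction from the plain estimate.
    return f_bar, f_bar - beta * (g_bar - 0.5)
```

Because `exp(U)` and `U` are strongly correlated, the corrected estimator typically has much lower variance than the plain average at the same sample size. The true value of the integral is `e - 1`.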


Differentially Private Distributed Bayesian Linear Regression with MCMC

arXiv.org Artificial Intelligence

We propose a novel Bayesian inference framework for distributed differentially private linear regression. We consider a distributed setting where multiple parties hold parts of the data and share certain summary statistics of their portions, perturbed with privacy-preserving noise. We develop a novel generative statistical model for privately shared statistics, which exploits a useful distributional relation between the summary statistics of linear regression. Bayesian estimation of the regression coefficients is conducted mainly using Markov chain Monte Carlo algorithms, while we also provide a fast version to perform Bayesian estimation in one iteration. The proposed methods have computational advantages over their competitors. We provide numerical results on both real and simulated data, which demonstrate that the proposed algorithms provide well-rounded estimation and prediction.


A Unified Initial Alignment Method of SINS Based on FGO

arXiv.org Artificial Intelligence

The initial alignment provides an accurate attitude for SINS (strapdown inertial navigation system). By further estimating the IMU's bias and misalignment angle, the recursive Bayesian filter is accurate. However, the prior heading error has a significant influence on the convergence speed and accuracy. In addition, the accuracy will be limited by its iteration at a single time-step. The coarse alignment method OBA (optimization-based alignment) uses MLE (maximum likelihood estimation) to find the optimal attitude quickly. However, few methods consider the IMU bias and misalignment angle, which will reduce the attitude accuracy. In this paper, a unified method based on FGO (factor graph optimization) and IBF (inertial base frame) is proposed. The attitude is estimated by MLE, while the IMU bias and misalignment angle are estimated by MAP (maximum a posteriori) estimation. The states of all time steps are optimized together to further improve the accuracy. Physical experiments on the rotation MEMS SINS show that the heading accuracy of this method is improved in limited alignment time.


Stochastic Marginal Likelihood Gradients using Neural Tangent Kernels

arXiv.org Artificial Intelligence

Selecting hyperparameters in deep learning greatly impacts its effectiveness but requires manual effort and expertise. Recent works show that Bayesian model selection with Laplace approximations makes it possible to optimize such hyperparameters just like standard neural network parameters, using gradients and the training data. However, estimating a single hyperparameter gradient requires a pass through the entire dataset, limiting the scalability of such algorithms. In this work, we overcome this issue by introducing lower bounds to the linearized Laplace approximation of the marginal likelihood. In contrast to previous estimators, these bounds are amenable to stochastic-gradient-based optimization and allow trading off estimation accuracy against computational complexity. We derive them using the function-space form of the linearized Laplace, which can be estimated using the neural tangent kernel. Experimentally, we show that the estimators can significantly accelerate gradient-based hyperparameter optimization.


Rao-Blackwellized Particle Smoothing for Simultaneous Localization and Mapping

arXiv.org Artificial Intelligence

Simultaneous localization and mapping (SLAM) is the task of building a map representation of an unknown environment while simultaneously using it for positioning. A probabilistic interpretation of the SLAM task allows for incorporating prior knowledge and for operation under uncertainty. Contrary to the common practice of computing point estimates of the system states, we capture the full posterior density through approximate Bayesian inference. This dynamic learning task falls under state estimation, where the state of the art lies in sequential Monte Carlo methods that tackle the forward filtering problem. In this paper, we introduce a framework for probabilistic SLAM using particle smoothing that not only incorporates observed data in current state estimates, but also back-tracks the updated knowledge to correct for past drift and ambiguities in both the map and the states. Our solution can efficiently handle both dense and sparse map representations by Rao-Blackwellization of conditionally linear and conditionally linearized models. We show through simulations and real-world experiments how the principles apply to radio (BLE/Wi-Fi), magnetic field, and visual SLAM. The proposed solution is general, efficient, and works well under confounding noise.


Correction of Errors in Preference Ratings from Automated Metrics for Text Generation

arXiv.org Artificial Intelligence

A major challenge in the field of Text Generation is evaluation: Human evaluations are cost-intensive, and automated metrics often display considerable disagreement with human judgments. In this paper, we propose a statistical model of Text Generation evaluation that accounts for the error-proneness of automated metrics when used to generate preference rankings between system outputs. We show that existing automated metrics are generally over-confident in assigning significant differences between systems in this setting. However, our model enables an efficient combination of human and automated ratings to remedy the error-proneness of the automated metrics. We show that using this combination, we only require about 50% of the human annotations typically used in evaluations to arrive at robust and statistically significant results while yielding the same evaluation outcome as the pure human evaluation in 95% of cases. We showcase the benefits of our approach for three text generation tasks: dialogue systems, machine translation, and text summarization.


Physics Inspired Approaches To Understanding Gaussian Processes

arXiv.org Artificial Intelligence

Prior beliefs about the latent function to shape inductive biases can be incorporated into a Gaussian Process (GP) via the kernel. However, beyond kernel choices, the decision-making process of GP models remains poorly understood. In this work, we contribute an analysis of the loss landscape for GP models using methods from physics. We demonstrate $\nu$-continuity for Matern kernels and outline aspects of catastrophe theory at critical points in the loss landscape. By directly including $\nu$ in the hyperparameter optimisation for Matern kernels, we find that typical values of $\nu$ are far from optimal in terms of performance, yet prevail in the literature due to the increased computational speed. We also provide an a priori method for evaluating the effect of GP ensembles and discuss various voting approaches based on physical properties of the loss landscape. The utility of these approaches is demonstrated for various synthetic and real datasets. Our findings provide an enhanced understanding of the decision-making process behind GPs and offer practical guidance for improving their performance and interpretability in a range of applications.
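To make the role of $\nu$ concrete, the sketch below evaluates the Matern kernel at the three half-integer values that are commonly used because they have cheap closed forms. The abstract does not say which values it means by "typical", so that choice is an assumption here, as are the function name `matern` and the unit length scale.

```python
import math


def matern(r, nu, length_scale=1.0):
    """Matern kernel value at distance r >= 0 for nu in {0.5, 1.5, 2.5}.

    Only the closed-form half-integer cases are handled. General nu requires
    a modified Bessel function, which is omitted from this sketch.
    """
    s = r / length_scale
    if nu == 0.5:
        return math.exp(-s)
    if nu == 1.5:
        a = math.sqrt(3.0) * s
        return (1.0 + a) * math.exp(-a)
    if nu == 2.5:
        a = math.sqrt(5.0) * s
        return (1.0 + a + a * a / 3.0) * math.exp(-a)
    raise ValueError("this sketch supports nu in {0.5, 1.5, 2.5} only")


# Larger nu gives a smoother kernel that decays more slowly near r = 0.
values_at_one = [matern(1.0, nu) for nu in (0.5, 1.5, 2.5)]
```

Treating $\nu$ as a continuous hyperparameter, as the abstract proposes, means leaving these closed forms behind, which is where the extra computational cost comes from.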


Deep surrogate accelerated delayed-acceptance HMC for Bayesian inference of spatio-temporal heat fluxes in rotating disc systems

arXiv.org Artificial Intelligence

We introduce a deep learning accelerated methodology to solve PDE-based Bayesian inverse problems with guaranteed accuracy. This is motivated by the ill-posed problem of inferring a spatio-temporal heat-flux parameter known as the Biot number given temperature data; however, the methodology is generalisable to other settings. To accelerate Bayesian inference, we develop a novel training scheme that uses data to adaptively train a neural-network surrogate simulating the parametric forward model. By simultaneously identifying an approximate posterior distribution over the Biot number, and weighting a physics-informed training loss according to this, our approach approximates the forward and inverse solutions together without any need for external solves. Using a random Chebyshev series, we outline how to approximate a Gaussian process prior, and using the surrogate we apply Hamiltonian Monte Carlo (HMC) to sample from the posterior distribution. We derive convergence of the surrogate posterior to the true posterior distribution in the Hellinger metric as our adaptive loss approaches zero. Additionally, we describe how this surrogate-accelerated HMC approach can be combined with traditional PDE solvers in a delayed-acceptance scheme to control the posterior accuracy a priori. This overcomes a major limitation of deep learning-based surrogate approaches, which do not achieve guaranteed accuracy a priori due to their non-convex training. Biot number calculations are involved in turbo-machinery design, which is safety critical and highly regulated; it is therefore important that our results have such mathematical guarantees. Our approach achieves fast mixing in high dimensions whilst retaining the convergence guarantees of a traditional PDE solver, and without the burden of evaluating this solver for proposals that are likely to be rejected. Numerical results are given using real and simulated data.
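The delayed-acceptance idea can be sketched in a few lines. This is a minimal illustration that swaps in a random-walk Metropolis proposal for HMC and uses toy one-dimensional densities in place of a PDE solver and a neural surrogate. All names (`delayed_acceptance_mh`, `log_target`, `log_surrogate`) and the chosen densities are assumptions for the example.

```python
import math
import random


def delayed_acceptance_mh(log_target, log_surrogate, x0, n_steps, step=1.5, seed=0):
    """Two-stage Metropolis sampler with a symmetric random-walk proposal.

    Stage 1 screens each proposal with the cheap surrogate. Only proposals
    that pass are evaluated with the expensive target in stage 2, whose
    correction keeps the exact target as the invariant distribution.
    """
    rng = random.Random(seed)
    x = x0
    lt_x, ls_x = log_target(x), log_surrogate(x)
    samples, expensive_calls = [], 1
    for _ in range(n_steps):
        y = x + rng.gauss(0.0, step)
        ls_y = log_surrogate(y)
        # Stage 1: accept or reject using the surrogate alone.
        if math.log(1.0 - rng.random()) < ls_y - ls_x:
            lt_y = log_target(y)
            expensive_calls += 1
            # Stage 2: correct for the surrogate's error.
            if math.log(1.0 - rng.random()) < (lt_y - lt_x) - (ls_y - ls_x):
                x, lt_x, ls_x = y, lt_y, ls_y
        samples.append(x)
    return samples, expensive_calls


def log_target(x):
    """Toy 'expensive' model: standard normal log-density, up to a constant."""
    return -0.5 * x * x


def log_surrogate(x):
    """Toy surrogate: a deliberately mis-scaled normal log-density."""
    return -0.5 * (x / 1.3) ** 2


samples, expensive_calls = delayed_acceptance_mh(log_target, log_surrogate, 0.0, 20000)
```

Proposals rejected in stage 1 never reach the expensive model, so `expensive_calls` stays below the number of iterations even though the chain still targets the exact posterior.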


Probabilistic Unrolling: Scalable, Inverse-Free Maximum Likelihood Estimation for Latent Gaussian Models

arXiv.org Artificial Intelligence

Latent Gaussian models have a rich history in statistics and machine learning, with applications ranging from factor analysis to compressed sensing to time series analysis. The classical method for maximizing the likelihood of these models is the expectation-maximization (EM) algorithm. For problems with high-dimensional latent variables and large datasets, EM scales poorly because it needs to invert as many large covariance matrices as the number of data points. We introduce probabilistic unrolling, a method that combines Monte Carlo sampling with iterative linear solvers to circumvent matrix inversion. Our theoretical analyses reveal that unrolling and backpropagation through the iterations of the solver can accelerate gradient estimation for maximum likelihood estimation. In experiments on simulated and real data, we demonstrate that probabilistic unrolling learns latent Gaussian models up to an order of magnitude faster than gradient EM, with minimal losses in model performance.
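The "inverse-free" ingredient can be illustrated with a generic iterative solver. The abstract does not name the solver used, so the conjugate gradient routine below is an assumption, as are the function names and the small example matrix. It solves a symmetric positive-definite system using only matrix-vector products.

```python
def conjugate_gradient(matvec, b, tol=1e-10, max_iter=100):
    """Solve A x = b for symmetric positive-definite A without forming A^{-1}.

    `matvec(v)` must return the product A v as a list of floats.
    """
    x = [0.0] * len(b)
    r = list(b)  # residual b - A x, with x = 0 initially
    p = list(r)  # search direction
    rs_old = sum(v * v for v in r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = matvec(p)
        alpha = rs_old / sum(pi * ai for pi, ai in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, Ap)]
        rs_new = sum(v * v for v in r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Hypothetical 2x2 covariance matrix and right-hand side for illustration.
A = [[4.0, 1.0], [1.0, 3.0]]


def A_matvec(v):
    return [sum(a * vj for a, vj in zip(row, v)) for row in A]


x = conjugate_gradient(A_matvec, [1.0, 2.0])
```

Each iteration costs one matrix-vector product, so structured or sparse covariances can be handled without ever storing or inverting the full matrix.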