Bayesian Learning
Transport meets Variational Inference: Controlled Monte Carlo Diffusions
Vargas, Francisco, Padhy, Shreyas, Blessing, Denis, Nüsken, Nikolas
Connecting optimal transport and variational inference, we present a principled and systematic framework for sampling and generative modelling centred around divergences on path space. Our work culminates in the development of the \emph{Controlled Monte Carlo Diffusion} sampler (CMCD) for Bayesian computation, a score-based annealing technique that crucially adapts both forward and backward dynamics in a diffusion model. On the way, we clarify the relationship between the EM-algorithm and iterative proportional fitting (IPF) for Schr{\"o}dinger bridges, deriving as well a regularised objective that bypasses the iterative bottleneck of standard IPF-updates. Finally, we show that CMCD has a strong foundation in the Jarzinsky and Crooks identities from statistical physics, and that it convincingly outperforms competing approaches across a wide array of experiments.
Numerically Stable Sparse Gaussian Processes via Minimum Separation using Cover Trees
Terenin, Alexander, Burt, David R., Artemev, Artem, Flaxman, Seth, van der Wilk, Mark, Rasmussen, Carl Edward, Ge, Hong
Gaussian processes are frequently deployed as part of larger machine learning and decision-making systems, for instance in geospatial modeling, Bayesian optimization, or in latent Gaussian models. Within a system, the Gaussian process model needs to perform in a stable and reliable manner to ensure it interacts correctly with other parts of the system. In this work, we study the numerical stability of scalable sparse approximations based on inducing points. To do so, we first review numerical stability, and illustrate typical situations in which Gaussian process models can be unstable. Building on stability theory originally developed in the interpolation literature, we derive sufficient and in certain cases necessary conditions on the inducing points for the computations performed to be numerically stable. For low-dimensional tasks such as geospatial modeling, we propose an automated method for computing inducing points satisfying these conditions. This is done via a modification of the cover tree data structure, which is of independent interest. We additionally propose an alternative sparse approximation for regression with a Gaussian likelihood which trades off a small amount of performance to further improve stability. We provide illustrative examples showing the relationship between stability of calculations and predictive performance of inducing point methods on spatial tasks.
Deep Bayesian Reinforcement Learning for Spacecraft Proximity Maneuvers and Docking
Du, Desong, Qi, Naiming, Liu, Yanfang, Pan, Wei
In the pursuit of autonomous spacecraft proximity maneuvers and docking(PMD), we introduce a novel Bayesian actor-critic reinforcement learning algorithm to learn a control policy with the stability guarantee. The PMD task is formulated as a Markov decision process that reflects the relative dynamic model, the docking cone and the cost function. Drawing from the principles of Lyapunov theory, we frame the temporal difference learning as a constrained Gaussian process regression problem. This innovative approach allows the state-value function to be expressed as a Lyapunov function, leveraging the Gaussian process and deep kernel learning. We develop a novel Bayesian quadrature policy optimization procedure to analytically compute the policy gradient while integrating Lyapunov-based stability constraints. This integration is pivotal in satisfying the rigorous safety demands of spaceflight missions. The proposed algorithm has been experimentally evaluated on a spacecraft air-bearing testbed and shows impressive and promising performance.
A multi-modal approach to continuous material identification through tactile sensing
Eguíluz, Augusto Gómez, Rañó, Ignacio, Coleman, Sonya A., McGinnity, T. Martin
Tactile sensing has recently been used in robotics for object identification, grasping, and material recognition. Most material recognition approaches use vibration information from a tactile exploration, typically above one second long, to identify the material. This work proposes a tactile multi-modal (vibration and thermal) material identification approach based on recursive Bayesian estimation. Through the frequency response of the vibration induced by the material and thermal features, like an estimate of the thermal power loss of the finger, we show that it is possible to identify materials in less than half a second. Moreover, a comparison between the use of vibration only and multi-modal identification shows that both recognition time and classification errors are reduced by adding thermal information.
Quantifying the value of information transfer in population-based SHM
Hughes, Aidan J., Poole, Jack, Dervilis, Nikolaos, Gardner, Paul, Worden, Keith
Population-based structural health monitoring (PBSHM), seeks to address some of the limitations associated with data scarcity that arise in traditional SHM. A tenet of the population-based approach to SHM is that information can be shared between sufficiently-similar structures in order to improve predictive models. Transfer learning techniques, such as domain adaptation, have been shown to be a highly-useful technology for sharing information between structures when developing statistical classifiers for PBSHM. Nonetheless, transfer-learning techniques are not without their pitfalls. In some circumstances, for example if the data distributions associated with the structures within a population are dissimilar, applying transfer-learning methods can be detrimental to classification performance -- this phenomenon is known as negative transfer. Given the potentially-severe consequences of negative transfer, it is prudent for engineers to ask the question `when, what, and how should one transfer between structures?'. The current paper aims to demonstrate a transfer-strategy decision process for a classification task for a population of simulated structures in the context of a representative SHM maintenance problem, supported by domain adaptation. The transfer decision framework is based upon the concept of expected value of information transfer. In order to compute the expected value of information transfer, predictions must be made regarding the classification (and decision performance) in the target domain following information transfer. In order to forecast the outcome of transfers, a probabilistic regression is used here to predict classification performance from a proxy for structural similarity based on the modal assurance criterion.
Multi-view 3D Object Reconstruction and Uncertainty Modelling with Neural Shape Prior
Liao, Ziwei, Waslander, Steven L.
3D object reconstruction is important for semantic scene understanding. It is challenging to reconstruct detailed 3D shapes from monocular images directly due to a lack of depth information, occlusion and noise. Most current methods generate deterministic object models without any awareness of the uncertainty of the reconstruction. We tackle this problem by leveraging a neural object representation which learns an object shape distribution from large dataset of 3d object models and maps it into a latent space. We propose a method to model uncertainty as part of the representation and define an uncertainty-aware encoder which generates latent codes with uncertainty directly from individual input images. Further, we propose a method to propagate the uncertainty in the latent code to SDF values and generate a 3d object mesh with local uncertainty for each mesh component. Finally, we propose an incremental fusion method under a Bayesian framework to fuse the latent codes from multi-view observations. We evaluate the system in both synthetic and real datasets to demonstrate the effectiveness of uncertainty-based fusion to improve 3D object reconstruction accuracy.
Spatial Process Approximations: Assessing Their Necessity
In spatial statistics and machine learning, the kernel matrix plays a pivotal role in prediction, classification, and maximum likelihood estimation. A thorough examination reveals that for large sample sizes, the kernel matrix becomes ill-conditioned, provided the sampling locations are fairly evenly distributed. This condition poses significant challenges to numerical algorithms used in prediction and estimation computations and necessitates an approximation to prediction and the Gaussian likelihood. A review of current methodologies for managing large spatial data indicates that some fail to address this ill-conditioning problem. Such ill-conditioning often results in low-rank approximations of the stochastic processes. This paper introduces various optimality criteria and provides solutions for each.
Looping in the Human: Collaborative and Explainable Bayesian Optimization
Adachi, Masaki, Planden, Brady, Howey, David A., Muandet, Krikamol, Osborne, Michael A., Chau, Siu Lun
Like many optimizers, Bayesian optimization often falls short of gaining user trust due to opacity. While attempts have been made to develop human-centric optimizers, they typically assume user knowledge is well-specified and error-free, employing users mainly as supervisors of the optimization process. We relax these assumptions and propose a more balanced human-AI partnership with our Collaborative and Explainable Bayesian Optimization (CoExBO) framework. Instead of explicitly requiring a user to provide a knowledge model, CoExBO employs preference learning to seamlessly integrate human insights into the optimization, resulting in algorithmic suggestions that resonate with user preference. CoExBO explains its candidate selection every iteration to foster trust, empowering users with a clearer grasp of the optimization. Furthermore, CoExBO offers a no-harm guarantee, allowing users to make mistakes; even with extreme adversarial interventions, the algorithm converges asymptotically to a vanilla Bayesian optimization. We validate CoExBO's efficacy through human-AI teaming experiments in lithium-ion battery design, highlighting substantial improvements over conventional methods.
Exact Bayesian Inference on Discrete Models via Probability Generating Functions: A Probabilistic Programming Approach
Zaiser, Fabian, Murawski, Andrzej S., Ong, Luke
We present an exact Bayesian inference method for discrete statistical models, which can find exact solutions to a large class of discrete inference problems, even with infinite support and continuous priors. To express such models, we introduce a probabilistic programming language that supports discrete and continuous sampling, discrete observations, affine functions, (stochastic) branching, and conditioning on discrete events. Our key tool is probability generating functions: they provide a compact closed-form representation of distributions that are definable by programs, thus enabling the exact computation of posterior probabilities, expectation, variance, and higher moments. Our inference method is provably correct and fully automated in a tool called Genfer, which uses automatic differentiation (specifically, Taylor polynomials), but does not require computer algebra. Our experiments show that Genfer is often faster than the existing exact inference tools PSI, Dice, and Prodigy. On a range of real-world inference problems that none of these exact tools can solve, Genfer's performance is competitive with approximate Monte Carlo methods, while avoiding approximation errors.
Optimizing Implicit Neural Representations from Point Clouds via Energy-Based Models
Yamauchi, Ryutaro, Sakurai, Jinya, Furukawa, Ryo, Matsubayashi, Tatsushi
Reconstructing a continuous surface from an unoritented 3D point cloud is a fundamental task in 3D shape processing. In recent years, several methods have been proposed to address this problem using implicit neural representations (INRs). In this study, we propose a method to optimize INRs using energy-based models (EBMs). By employing the absolute value of the coordinate-based neural networks as the energy function, the INR can be optimized through the estimation of the point cloud distribution by the EBM. In addition, appropriate parameter settings of the EBM enable the model to consider the magnitude of point cloud noise. Our experiments confirmed that the proposed method is more robust against point cloud noise than conventional surface reconstruction methods.