 Bayesian Inference


On the hierarchical Bayesian modelling of frequency response functions

arXiv.org Artificial Intelligence

For situations that may benefit from information sharing among datasets, e.g., population-based SHM of similar structures, the hierarchical Bayesian approach provides a useful modelling structure. Hierarchical Bayesian models learn statistical distributions at the population (or parent) and the domain levels simultaneously, to bolster statistical strength among the parameters. As a result, variance is reduced among the parameter estimates, particularly when data are limited. In this paper, a combined probabilistic FRF model is developed for a small population of nominally identical helicopter blades, using a hierarchical Bayesian structure, to support information transfer in the context of sparse data. The modelling approach is also demonstrated in a traditional SHM context, for a single helicopter blade exposed to varying temperatures, to show how the inclusion of physics-based knowledge can improve generalisation beyond the training data when data are scarce. These models address critical challenges in SHM by accommodating benign variations that present as differences in the underlying dynamics, while also considering (and utilising) the similarities among the domains.
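The variance reduction described in this abstract can be illustrated with a minimal sketch of partial pooling. This is not the paper's FRF model: it assumes a simple normal-normal hierarchy in which the population mean `mu`, the population spread `tau`, and the noise level `sigma` are all known, whereas a full hierarchical model would learn them jointly. The function name and the numbers are hypothetical.

```python
import numpy as np

def partial_pool(group_means, group_sizes, sigma, mu, tau):
    """Posterior mean and variance of each domain-level parameter.

    Assumes y_ij ~ N(theta_j, sigma^2) and theta_j ~ N(mu, tau^2),
    with sigma, mu and tau treated as known (an illustrative simplification).
    """
    group_means = np.asarray(group_means, dtype=float)
    group_sizes = np.asarray(group_sizes, dtype=float)
    data_prec = group_sizes / sigma**2   # precision contributed by each domain's data
    prior_prec = 1.0 / tau**2            # precision contributed by the population level
    post_var = 1.0 / (data_prec + prior_prec)
    post_mean = post_var * (data_prec * group_means + prior_prec * mu)
    return post_mean, post_var

# Three hypothetical domains; the last one has only two observations.
means = [9.0, 10.0, 12.5]
sizes = [50, 20, 2]
sigma = 2.0
post_mean, post_var = partial_pool(means, sizes, sigma=sigma, mu=10.0, tau=1.0)
no_pool_var = sigma**2 / np.asarray(sizes, dtype=float)
```

Under these assumptions the data-poor domain is pulled most strongly towards the population mean, and every posterior variance is smaller than the corresponding unpooled variance `sigma**2 / n`.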


SLEM: Machine Learning for Path Modeling and Causal Inference with Super Learner Equation Modeling

arXiv.org Machine Learning

Causal inference is a crucial goal of science, enabling researchers to arrive at meaningful conclusions regarding the predictions of hypothetical interventions using observational data. Path models, Structural Equation Models (SEMs), and, more generally, Directed Acyclic Graphs (DAGs), provide a means to unambiguously specify assumptions regarding the causal structure underlying a phenomenon. Unlike DAGs, which make very few assumptions about the functional and parametric form, SEM assumes linearity. This can result in functional misspecification which prevents researchers from undertaking reliable effect size estimation. In contrast, we propose Super Learner Equation Modeling, a path modeling technique integrating machine learning Super Learner ensembles. We empirically demonstrate its ability to provide consistent and unbiased estimates of causal effects, its competitive performance for linear models when compared with SEM, and highlight its superiority over SEM when dealing with non-linear relationships. We provide open-source code, and a tutorial notebook with example usage, accentuating the easy-to-use nature of the method.


PAC-Bayesian Domain Adaptation Bounds for Multi-view learning

arXiv.org Machine Learning

This paper presents a series of new results for domain adaptation in the multi-view learning setting. The incorporation of multiple views in domain adaptation has received little attention in previous studies. We therefore propose an analysis of generalization bounds with PAC-Bayesian theory to consolidate the two paradigms, which are currently treated separately. Firstly, building on previous work by Germain et al. [7,8], we adapt the distance between distributions proposed by Germain et al. for domain adaptation to the concept of multi-view learning. Thus, we introduce a novel distance that is tailored for the multi-view domain adaptation setting. Then, we give PAC-Bayesian bounds for estimating the introduced divergence. Finally, we compare the different new bounds with the previous studies.


Data-driven Modeling and Inference for Bayesian Gaussian Process ODEs via Double Normalizing Flows

arXiv.org Machine Learning

Recently, Gaussian processes have been used to model the vector field of continuous dynamical systems, referred to as GPODEs, which are characterized by a probabilistic ODE. Bayesian inference for these models has been extensively studied and applied in tasks such as time series prediction. However, the use of standard GPs with basic kernels like squared exponential kernels has been common in GPODE research, limiting the model's ability to represent complex scenarios. To address this limitation, we introduce normalizing flows to reparameterize the ODE vector field, resulting in a data-driven prior distribution, thereby increasing flexibility and expressive power. We develop a data-driven variational learning algorithm that utilizes analytically tractable probability density functions of normalizing flows, enabling simultaneous learning and inference of unknown continuous dynamics. We also apply normalizing flows to the posterior inference of GPODEs to resolve the issue of strong mean-field assumptions in posterior inference. By applying normalizing flows in both these ways, our model improves accuracy and uncertainty estimates for Bayesian Gaussian Process ODEs. We validate the effectiveness of our approach on simulated dynamical systems and real-world human motion data, including time series prediction and missing data recovery tasks. Experimental results show that our proposed method effectively captures model uncertainty while improving accuracy.


PAC-Bayes-Chernoff bounds for unbounded losses

arXiv.org Machine Learning

We present a new high-probability PAC-Bayes oracle bound for unbounded losses. This result can be understood as a PAC-Bayes version of the Chernoff bound. The proof technique relies on uniformly bounding the tail of a certain random variable based on the Cramér transform of the loss. We highlight two applications of our main result. First, we show that our bound solves the open problem of optimizing the free parameter on many PAC-Bayes bounds. Second, we show that our approach allows working with flexible assumptions on the loss function, resulting in novel bounds that generalize previous ones and can be minimized to obtain Gibbs-like posteriors.
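For orientation, the classical objects named in this abstract can be written as follows. This is the textbook Chernoff bound for a single random variable $X$ with a finite cumulant generating function, not the paper's PAC-Bayes result; the symbols $\Lambda$, $\Lambda^*$, $\lambda$ and $a$ are notation assumed here.

```latex
% Cumulant generating function of the centred variable
\Lambda(\lambda) = \log \mathbb{E}\!\left[ e^{\lambda (X - \mathbb{E}[X])} \right],
\qquad \lambda \ge 0 .

% Cramér transform (convex conjugate of \Lambda)
\Lambda^{*}(a) = \sup_{\lambda \ge 0} \bigl\{ \lambda a - \Lambda(\lambda) \bigr\},
\qquad a \ge 0 .

% Chernoff bound on the upper tail
\mathbb{P}\bigl( X - \mathbb{E}[X] \ge a \bigr) \le e^{-\Lambda^{*}(a)} .
```

The supremum over $\lambda$ is what makes the bound free of a tuning parameter, which is the sense in which optimizing the free parameter matters in the PAC-Bayes setting.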


Whole-examination AI estimation of fetal biometrics from 20-week ultrasound scans

arXiv.org Artificial Intelligence

The current approach to fetal anomaly screening is based on biometric measurements derived from individually selected ultrasound images. In this paper, we introduce a paradigm shift that attains human-level performance in biometric measurement by aggregating automatically extracted biometrics from every frame across an entire scan, with no need for operator intervention. We use a convolutional neural network to classify each frame of an ultrasound video recording. We then measure fetal biometrics in every frame where appropriate anatomy is visible. We use a Bayesian method to estimate the true value of each biometric from a large number of measurements and probabilistically reject outliers. We performed a retrospective experiment on 1457 recordings (comprising 48 million frames) of 20-week ultrasound scans, estimated fetal biometrics in those scans and compared our estimates to the measurements sonographers took during the scan. Our method achieves human-level performance in estimating fetal biometrics and estimates well-calibrated credible intervals in which the true biometric value is expected to lie.


AI Alignment: A Comprehensive Survey

arXiv.org Artificial Intelligence

AI alignment aims to make AI systems behave in line with human intentions and values. As AI systems grow more capable, so do risks from misalignment. To provide a comprehensive and up-to-date overview of the alignment field, in this survey, we delve into the core concepts, methodology, and practice of alignment. First, we identify four principles as the key objectives of AI alignment: Robustness, Interpretability, Controllability, and Ethicality (RICE). Guided by these four principles, we outline the landscape of current alignment research and decompose them into two key components: forward alignment and backward alignment. The former aims to make AI systems aligned via alignment training, while the latter aims to gain evidence about the systems' alignment and govern them appropriately to avoid exacerbating misalignment risks. On forward alignment, we discuss techniques for learning from feedback and learning under distribution shift. On backward alignment, we discuss assurance techniques and governance practices. We also release and continually update the website (www.alignmentsurvey.com) which features tutorials, collections of papers, blog posts, and other resources.


Transfer Learning for Causal Effect Estimation

arXiv.org Machine Learning

We present a Transfer Causal Learning (TCL) framework for the case where target and source domains share the same covariate/feature spaces, aiming to improve causal effect estimation accuracy when data are limited. Limited data is very common in medical applications, where some rare medical conditions, such as sepsis, are of interest. Our proposed method, named $\ell_1$-TCL, incorporates $\ell_1$ regularized TL for nuisance models (e.g., propensity score model); the TL estimator of the nuisance parameters is plugged into downstream average causal/treatment effect estimators (e.g., inverse probability weighted estimator). We establish non-asymptotic recovery guarantees for $\ell_1$-TCL with generalized linear model (GLM) under the sparsity assumption in the high-dimensional setting, and demonstrate the empirical benefits of $\ell_1$-TCL through extensive numerical simulation for GLM and recent neural network nuisance models. Our method is subsequently extended to real data and generates meaningful insights consistent with medical literature, a case where all baseline methods fail.
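The plug-in step mentioned in this abstract, where an estimated propensity score feeds an inverse probability weighted (IPW) estimator, can be sketched as below. This is only an illustration of a standard IPW estimator of the average treatment effect on simulated data: it uses the true propensity score in place of a fitted nuisance model, and it does not implement the transfer learning or the $\ell_1$ regularization of the paper. All names and the data-generating process are hypothetical.

```python
import numpy as np

def ipw_ate(y, t, e):
    """IPW estimate of the average treatment effect.

    y: outcomes, t: binary treatment indicators, e: propensity scores P(T=1 | X).
    """
    y, t, e = (np.asarray(a, dtype=float) for a in (y, t, e))
    return np.mean(t * y / e - (1.0 - t) * y / (1.0 - e))

# Simulated confounded data: x drives both treatment assignment and outcome.
rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(size=n)
e = 1.0 / (1.0 + np.exp(-x))              # propensity score, known here by construction
t = rng.binomial(1, e)
y = 2.0 * t + x + rng.normal(size=n)      # true average treatment effect is 2.0

ate_hat = ipw_ate(y, t, e)
naive = y[t == 1].mean() - y[t == 0].mean()  # ignores confounding by x
```

In this simulation the weighted estimate lands close to the true effect, while the naive difference in means is biased upward because treated units tend to have larger `x`. In practice `e` would come from a fitted nuisance model, which is where the quality of that model matters.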


Decision-Focused Model-based Reinforcement Learning for Reward Transfer

arXiv.org Artificial Intelligence

Decision-focused (DF) model-based reinforcement learning has recently been introduced as a powerful algorithm that can focus on learning the MDP dynamics that are most relevant for obtaining high returns. While this approach increases the agent's performance by directly optimizing the reward, it does so by learning less accurate dynamics from a maximum likelihood perspective. We demonstrate that when the reward function is defined by preferences over multiple objectives, the DF model may be sensitive to changes in the objective preferences. In this work, we develop the robust decision-focused (RDF) algorithm, which leverages the non-identifiability of DF solutions to learn models that maximize expected returns while simultaneously learning models that transfer to changes in the preference over multiple objectives. We demonstrate the effectiveness of RDF on two synthetic domains and two healthcare simulators, showing that it significantly improves the robustness of DF model learning to changes in the reward function without compromising training-time return.


A Compact Representation for Bayesian Neural Networks By Removing Permutation Symmetry

arXiv.org Machine Learning

Bayesian neural networks (BNNs) are a principled approach to modeling predictive uncertainties in deep learning, which are important in safety-critical applications. Since exact Bayesian inference over the weights in a BNN is intractable, various approximate inference methods exist, among which sampling methods such as Hamiltonian Monte Carlo (HMC) are often considered the gold standard. While HMC provides high-quality samples, it lacks interpretable summary statistics because its sample mean and variance are meaningless in neural networks due to permutation symmetry. In this paper, we first show that the role of permutations can be meaningfully quantified by a number-of-transpositions metric. We then show that the recently proposed rebasin method [1] allows us to summarize HMC samples into a compact representation that provides a meaningful explicit uncertainty estimate for each weight in a neural network, thus unifying sampling methods with variational inference. We show that this compact representation allows us to compare trained BNNs directly in weight space across sampling methods and variational inference, and to efficiently prune neural networks trained without explicit Bayesian frameworks by exploiting uncertainty estimates from HMC.