 Bayesian Inference


A Full DAG Score-Based Algorithm for Learning Causal Bayesian Networks with Latent Confounders

arXiv.org Artificial Intelligence

Causal Bayesian networks (CBN) are popular graphical probabilistic models that encode causal relations among variables. Learning their graphical structure from observational data has received a lot of attention in the literature. When there exists no latent (unobserved) confounder, i.e., no unobserved direct common cause of some observed variables, learning algorithms can be divided essentially into two classes: constraint-based and score-based approaches. The latter are often thought to be more robust than the former and to produce better results. However, to the best of our knowledge, when variables are discrete, no score-based algorithm is capable of dealing with latent confounders. This paper introduces the first fully score-based structure learning algorithm searching the space of DAGs (directed acyclic graphs) that is capable of identifying the presence of some latent confounders. It is justified mathematically and experiments highlight its effectiveness.


No Need to Sacrifice Data Quality for Quantity: Crowd-Informed Machine Annotation for Cost-Effective Understanding of Visual Data

arXiv.org Artificial Intelligence

Labeling visual data is expensive and time-consuming. Crowdsourcing systems promise to enable highly parallelizable annotations through the participation of monetarily or otherwise motivated workers, but even this approach has its limits. The solution: replace manual work with machine work. But how reliable are machine annotators? Sacrificing data quality for high throughput cannot be acceptable, especially in safety-critical applications such as autonomous driving. In this paper, we present a framework that enables quality checking of visual data at large scales without sacrificing the reliability of the results. We ask annotators simple questions with discrete answers, which can be highly automated using a convolutional neural network trained to predict crowd responses. Unlike the methods of previous work, which aim to directly predict soft labels to address human uncertainty, we use per-task posterior distributions over soft labels as our training objective, leveraging a Dirichlet prior for analytical accessibility. We demonstrate our approach on two challenging real-world automotive datasets, showing that our model can fully automate a significant portion of tasks, saving costs in the high double-digit percentage range. Our model reliably predicts human uncertainty, allowing for more accurate inspection and filtering of difficult examples. Additionally, we show that the posterior distributions over soft labels predicted by our model can be used as priors in further inference processes, reducing the need for numerous human labelers to approximate true soft labels accurately. This results in further cost reductions and more efficient use of human resources in the annotation process.
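
The appeal of a Dirichlet prior here is conjugacy: with categorical crowd answers, the posterior over the soft label is again a Dirichlet whose parameters are the prior parameters plus the vote counts. The following is a minimal sketch of that update only, not the paper's model (which trains a network to predict such posteriors); the function names, the uniform prior, and the vote counts are illustrative assumptions.

```python
def dirichlet_posterior(prior_alpha, vote_counts):
    """Conjugate update: Dirichlet prior + categorical votes -> Dirichlet posterior."""
    if len(prior_alpha) != len(vote_counts):
        raise ValueError("prior and counts must have the same number of classes")
    return [a + c for a, c in zip(prior_alpha, vote_counts)]


def posterior_mean(alpha):
    """Expected soft label under a Dirichlet with parameters alpha."""
    total = sum(alpha)
    return [a / total for a in alpha]


# Hypothetical yes/no task: uniform prior, 7 "yes" votes and 3 "no" votes.
prior = [1.0, 1.0]
votes = [7, 3]
post = dirichlet_posterior(prior, votes)
mean = posterior_mean(post)
print(post, mean)
```

A predicted posterior could be passed back in as `prior_alpha` for a later round of voting, which is one way to read the abstract's remark about using the predictions as priors.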


Approximate Estimation of High-dimension Execution Skill for Dynamic Agents in Continuous Domains

arXiv.org Artificial Intelligence

In many real-world continuous action domains, human agents must decide which actions to attempt and then execute those actions to the best of their ability. However, humans cannot execute actions without error. Human performance in these domains can potentially be improved by the use of AI to aid in decision-making. One requirement for an AI to correctly reason about what actions a human agent should attempt is a correct model of that human's execution error, or skill. Recent work has demonstrated successful techniques for estimating this execution error with various types of agents across different domains. However, this previous work made several assumptions that limit the application of these ideas to real-world settings. First, previous work assumed that the error distributions were symmetric normal, which meant that only a single parameter had to be estimated. In reality, agent error distributions might exhibit arbitrary shapes and should be modeled more flexibly. Second, it was assumed that the execution error of the agent remained constant across all observations. Especially for human agents, execution error changes over time, and this must be taken into account to obtain effective estimates. To overcome both of these shortcomings, we propose a novel particle-filter-based estimator for this problem. After describing the details of this approximate estimator, we experimentally explore various design decisions and compare performance with previous skill estimators in a variety of settings to showcase the improvements. The outcome is an estimator capable of generating more realistic, time-varying execution skill estimates of agents, which can then be used to assist agents in making better decisions and improve their overall performance.
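
To make the idea of a particle-filter skill estimate concrete, here is a deliberately simplified sketch. It is not the authors' estimator: it assumes a one-dimensional, zero-mean Gaussian execution error whose standard deviation drifts over time, whereas the paper targets higher-dimensional and more flexible error models. All names and constants (`particle_filter_step`, `drift`, the particle count) are hypothetical.

```python
import math
import random


def gaussian_pdf(x, sigma):
    """Density of a zero-mean normal with standard deviation sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def particle_filter_step(particles, observed_error, rng, drift=0.05):
    """One propagate / weight / resample cycle over candidate skill values."""
    # Propagate: random walk in log-space keeps every sigma positive.
    moved = [p * math.exp(rng.gauss(0.0, drift)) for p in particles]
    # Weight: likelihood of the observed execution error under each sigma.
    weights = [gaussian_pdf(observed_error, p) for p in moved]
    if sum(weights) == 0.0:
        return moved  # degenerate case: skip resampling for this step
    # Resample: draw particles in proportion to their weights.
    return rng.choices(moved, weights=weights, k=len(moved))


rng = random.Random(0)
particles = [rng.uniform(0.1, 5.0) for _ in range(2000)]
true_sigma = 1.0
for _ in range(200):
    true_sigma *= 1.005  # simulated agent whose skill slowly degrades
    observation = rng.gauss(0.0, true_sigma)
    particles = particle_filter_step(particles, observation, rng)

skill_estimate = sum(particles) / len(particles)
print(round(true_sigma, 2), round(skill_estimate, 2))
```

The random-walk propagation step is what lets the estimate follow a changing skill level; a fixed-parameter estimator would average over the whole history instead.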


Value-Enriched Population Synthesis: Integrating a Motivational Layer

arXiv.org Artificial Intelligence

In recent years, computational improvements have allowed for more nuanced, data-driven and geographically explicit agent-based simulations. So far, simulations have struggled to adequately represent the attributes that motivate the actions of the agents. In fact, existing population synthesis frameworks generate agent profiles limited to socio-demographic attributes. In this paper, we introduce a novel value-enriched population synthesis framework that integrates a motivational layer with the traditional individual and household socio-demographic layers. Our research highlights the significance of extending the profile of agents in synthetic populations by incorporating data on values, ideologies, opinions and vital priorities, which motivate the agents' behaviour. This motivational layer can help us develop a more nuanced decision-making mechanism for the agents in social simulation settings. Our methodology integrates microdata and macrodata within different Bayesian network structures. This contribution makes it possible to generate synthetic populations with integrated value systems that preserve the inherent socio-demographic distributions of the real population in any specific region.


A Likelihood-Free Approach to Goal-Oriented Bayesian Optimal Experimental Design

arXiv.org Machine Learning

Conventional Bayesian optimal experimental design seeks to maximize the expected information gain (EIG) on model parameters. However, the end goal of the experiment often is not to learn the model parameters, but to predict downstream quantities of interest (QoIs) that depend on the learned parameters. Moreover, designs that offer high EIG for parameters may not translate to high EIG for QoIs. Goal-oriented optimal experimental design (GO-OED) thus aims directly to maximize the EIG of QoIs. We introduce LF-GO-OED (likelihood-free goal-oriented optimal experimental design), a computational method for conducting GO-OED with nonlinear observation and prediction models. LF-GO-OED is specifically designed to accommodate implicit models, where the likelihood is intractable. In particular, it builds a density ratio estimator from samples generated from approximate Bayesian computation (ABC), thereby sidestepping the need for likelihood evaluations or density estimations. The overall method is validated against existing methods on benchmark problems, and demonstrated on scientific applications of epidemiology and neural science.
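
The distinction between the two objectives can be written out using the standard definition of expected information gain. The notation is an assumption, not taken from the abstract: $d$ is the design, $y$ the observation, $\theta$ the model parameters, and $z$ the quantity of interest.

```latex
\mathrm{EIG}_{\theta}(d)
  = \mathbb{E}_{p(\theta)\,p(y \mid \theta, d)}
    \left[ \log \frac{p(\theta \mid y, d)}{p(\theta)} \right],
\qquad
\mathrm{EIG}_{z}(d)
  = \mathbb{E}_{p(z, y \mid d)}
    \left[ \log \frac{p(z \mid y, d)}{p(z)} \right]
```

The goal-oriented objective on the right depends on a ratio of densities over $z$, which is the kind of quantity a density ratio estimator can approximate from samples; the abstract does not state which ratio the method actually estimates.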


Improvement of Bayesian PINN Training Convergence in Solving Multi-scale PDEs with Noise

arXiv.org Artificial Intelligence

Bayesian Physics Informed Neural Networks (BPINN) have received considerable attention for inferring differential equations' system states and physical parameters according to noisy observations. However, in practice, Hamiltonian Monte Carlo (HMC) used to estimate the internal parameters of BPINN often runs into trouble, including poor performance and poor convergence for a given step size used to adjust the momentum of those parameters. To improve the efficacy of HMC convergence for the BPINN method and extend its application scope to multi-scale partial differential equations (PDE), we developed a robust multi-scale Bayesian PINN (dubbed MBPINN) method by integrating multi-scale deep neural networks (MscaleDNN) and Bayesian inference. In this newly proposed MBPINN method, we reframe HMC with Stochastic Gradient Descent (SGD) to ensure the most "likely" estimation is always provided, and we configure its solver as a Fourier feature mapping-induced MscaleDNN. The MBPINN method offers several key advantages: (1) it is more robust than HMC, (2) it incurs less computational cost than HMC, and (3) it is more flexible for complex problems. We demonstrate the applicability and performance of the proposed method through general Poisson and multi-scale elliptic problems in one- to three-dimensional spaces. Our findings indicate that the proposed method can avoid HMC failures and provide valid results. Additionally, our method can handle complex PDEs and produce comparable results for general PDEs. These findings suggest that our proposed approach has excellent potential for physics-informed machine learning for parameter estimation and solution recovery in the case of ill-posed problems.


Misclassification excess risk bounds for PAC-Bayesian classification via convexified loss

arXiv.org Machine Learning

PAC-Bayesian bounds have proven to be a valuable tool for deriving generalization bounds and for designing new learning algorithms in machine learning. However, they typically focus on providing generalization bounds with respect to a chosen loss function. In classification tasks, due to the non-convex nature of the 0-1 loss, a convex surrogate loss is often used, and thus current PAC-Bayesian bounds are primarily specified for this convex surrogate. This work shifts its focus to providing misclassification excess risk bounds for PAC-Bayesian classification when using a convex surrogate loss. Our key ingredient here is to leverage PAC-Bayesian relative bounds in expectation rather than relying on PAC-Bayesian bounds in probability. We demonstrate our approach in several important applications.
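
The gap between a surrogate loss and the misclassification loss is easy to see numerically. The sketch below uses the hinge loss as one common convex surrogate; the abstract does not say which surrogate the paper uses, so that choice is an assumption, as is the margin convention `margin = y * f(x)` with labels in {-1, +1}.

```python
def zero_one_loss(margin):
    """Misclassification loss: 1 when the signed margin is non-positive."""
    return 1.0 if margin <= 0 else 0.0


def hinge_loss(margin):
    """Hinge loss, a convex upper bound on the 0-1 loss."""
    return max(0.0, 1.0 - margin)


# Hypothetical margins, from confidently wrong to confidently right.
margins = [-2.0, -0.5, 0.0, 0.5, 1.0, 3.0]
table = [(m, zero_one_loss(m), hinge_loss(m)) for m in margins]
surrogate_dominates = all(h >= z for _, z, h in table)

for m, z, h in table:
    print(f"margin={m:+.1f}  zero-one={z:.1f}  hinge={h:.1f}")
```

At margin 0.5 the prediction is correct, so the 0-1 loss is zero while the hinge loss is still 0.5. A bound on the surrogate risk therefore does not directly give a bound on the misclassification risk, which is the gap the abstract addresses.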


InVAErt networks for amortized inference and identifiability analysis of lumped parameter hemodynamic models

arXiv.org Artificial Intelligence

Estimation of cardiovascular model parameters from electronic health records (EHR) poses a significant challenge, primarily due to a lack of identifiability. Structural non-identifiability arises when a manifold in the space of parameters is mapped to a common output, while practical non-identifiability can result from limited data, model misspecification, or noise corruption. To address the resulting ill-posed inverse problem, optimization-based or Bayesian inference approaches typically use regularization, thereby limiting the possibility of discovering multiple solutions. In this study, we use inVAErt networks, a neural network-based, data-driven framework for enhanced digital twin analysis of stiff dynamical systems. We demonstrate the flexibility and effectiveness of inVAErt networks in the context of physiological inversion of a six-compartment lumped parameter hemodynamic model from synthetic data to real data with missing components.


Adaptation of uncertainty-penalized Bayesian information criterion for parametric partial differential equation discovery

arXiv.org Artificial Intelligence

Data-driven discovery of partial differential equations (PDEs) has emerged as a promising approach for deriving governing physics when domain knowledge about observed data is limited. Despite recent progress, the identification of governing equations and their parametric dependencies using conventional information criteria remains challenging in noisy situations, as the criteria tend to select overly complex PDEs. In this paper, we introduce an extension of the uncertainty-penalized Bayesian information criterion (UBIC), which is adapted to solve parametric PDE discovery problems efficiently without requiring computationally expensive PDE simulations. This extended UBIC uses quantified PDE uncertainty over different temporal or spatial points to prevent overfitting in model selection. The UBIC is computed with data transformation based on power spectral densities to discover the governing parametric PDE that truly captures qualitative features in frequency space with a few significant terms and their parametric dependencies (i.e., the varying PDE coefficients), evaluated with confidence intervals. Numerical experiments on canonical PDEs demonstrate that our extended UBIC can identify the true number of terms and their varying coefficients accurately, even in the presence of noise. The code is available at https://github.com/Pongpisit-Thanasutives/parametric-discovery.
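
For orientation, the conventional criterion that UBIC extends can be sketched in a few lines. This is the standard BIC for a least-squares fit with Gaussian residuals (constants dropped), not the uncertainty-penalized variant from the paper; the candidate models and their residual sums of squares are made-up numbers for illustration.

```python
import math


def bic_gaussian(rss, n_samples, n_terms):
    """Standard BIC for a least-squares fit, up to an additive constant."""
    return n_samples * math.log(rss / n_samples) + n_terms * math.log(n_samples)


# Hypothetical candidates: (number of terms, residual sum of squares).
n = 100
candidates = {"2-term PDE": (2, 1.00), "5-term PDE": (5, 0.98)}
scores = {name: bic_gaussian(rss, n, k) for name, (k, rss) in candidates.items()}
best = min(scores, key=scores.get)  # lower BIC is better
print(scores, best)
```

In this example the three extra terms reduce the residual only slightly, so the complexity penalty `n_terms * log(n_samples)` outweighs the gain and the 2-term model is selected. With a larger reduction in residual the plain BIC would accept the extra terms, which relates to the tendency to select overly complex PDEs that the abstract describes for noisy data.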


BINDy -- Bayesian identification of nonlinear dynamics with reversible-jump Markov-chain Monte-Carlo

arXiv.org Artificial Intelligence

Model parsimony is an important cognitive bias in data-driven modelling that aids interpretability and helps to prevent over-fitting. Sparse identification of nonlinear dynamics (SINDy) methods are able to learn sparse representations of complex dynamics directly from data, given a basis of library functions. In this work, a novel Bayesian treatment of dictionary learning system identification, as an alternative to SINDy, is envisaged. The proposed method -- Bayesian identification of nonlinear dynamics (BINDy) -- is distinct from previous approaches in that it targets the full joint posterior distribution over both the terms in the library and their parameterisation in the model. This formulation confers the advantage that an arbitrary prior may be placed over the model structure to produce models that are sparse in the model space rather than in parameter space. Because this posterior is defined over parameter vectors that can change in dimension, the inference cannot be performed by standard techniques. Instead, a Gibbs sampler based on reversible-jump Markov-chain Monte-Carlo is proposed. BINDy is shown to compare favourably to ensemble SINDy in three benchmark case-studies. In particular, it is seen that the proposed method is better able to assign high probability to correct model terms.
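
The joint posterior described above follows from Bayes' rule applied to the model structure and its parameters together. The notation is assumed for illustration and is not taken from the paper: $M$ indexes which library terms are active, $\theta_M$ are the coefficients of those terms, and $\mathcal{D}$ is the data.

```latex
p(M, \theta_M \mid \mathcal{D})
  \;\propto\;
  p(\mathcal{D} \mid \theta_M, M)\, p(\theta_M \mid M)\, p(M)
```

The factor $p(M)$ is where a prior over model structure enters, and the dimension of $\theta_M$ changes with $M$, which is why a trans-dimensional sampler such as reversible-jump MCMC is needed.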