 Bayesian Inference


Value of Information Analysis via Active Learning and Knowledge Sharing in Error-Controlled Adaptive Kriging

arXiv.org Machine Learning

Large uncertainties in many phenomena of interest have challenged the reliability of the decisions that depend on them. Collecting additional information to better characterize these uncertainties is one of the available decision alternatives. Value of information (VoI) analysis is a mathematical decision framework that quantifies the expected benefit of new data and assists with the optimal allocation of resources for information collection. However, a primary challenge facing VoI analysis is the very high computational cost of the underlying Bayesian inference, especially for equality-type information. This paper proposes the first surrogate-based framework for VoI analysis. Instead of modeling the limit state functions that describe the events of interest for decision making, as is common in surrogate model-based reliability methods, the proposed framework models system responses. This approach allows equality-type information from observations to be shared among surrogate models in order to update the likelihoods of multiple events of interest. Moreover, two knowledge sharing schemes, called model sharing and training points sharing, are proposed to take full advantage of the knowledge offered by costly model evaluations. Both schemes are integrated with an error rate-based adaptive training approach to efficiently generate accurate Kriging surrogate models. The proposed VoI analysis framework is applied to an optimal decision-making problem involving load testing of a truss bridge. While state-of-the-art methods based on importance sampling and adaptive Kriging Monte Carlo simulation are unable to solve this problem, the proposed method is shown to offer accurate and robust estimates of VoI with a limited number of model evaluations. Therefore, the proposed method facilitates the application of VoI to complex decision problems.
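As a rough illustration of the quantity that VoI analysis estimates, the sketch below computes VoI for a small discrete decision problem: the lowest expected cost under the prior minus the expected lowest cost after observing a test outcome. It is a toy example, not the surrogate-based framework of the paper; the function names, the two-state bridge example, and all numbers are assumptions made for illustration.

```python
def expected_cost(belief, costs):
    """Expected cost of each action under a belief over states."""
    return [sum(p * c for p, c in zip(belief, row)) for row in costs]


def value_of_information(prior, costs, likelihood):
    """VoI = best prior expected cost - expected best posterior cost.

    prior[s]         : P(state = s)
    costs[a][s]      : cost of action a when the true state is s
    likelihood[o][s] : P(observation = o | state = s)
    """
    prior_best = min(expected_cost(prior, costs))
    preposterior = 0.0
    for obs_lik in likelihood:
        # Bayes' rule: joint probability of (observation, state).
        joint = [l * p for l, p in zip(obs_lik, prior)]
        p_obs = sum(joint)
        if p_obs == 0.0:
            continue  # this observation can never occur
        posterior = [j / p_obs for j in joint]
        # Weight the best achievable cost by how likely the observation is.
        preposterior += p_obs * min(expected_cost(posterior, costs))
    return prior_best - preposterior


# Hypothetical numbers: states are (safe, unsafe); actions are
# (do nothing, repair). A perfect test reveals the state exactly.
prior = [0.9, 0.1]
costs = [[0.0, 100.0], [10.0, 10.0]]
perfect_test = [[1.0, 0.0], [0.0, 1.0]]
print(value_of_information(prior, costs, perfect_test))  # about 9 cost units
```

In this toy setting the posterior is available in closed form. The computational difficulty described in the abstract arises when each likelihood evaluation requires an expensive model run, which is where surrogate models come in.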


CausalNex: An open-source Python library that helps data scientists to infer causation rather than observing correlation

MarkTechPost

#artificialintelligence

CausalNex is a Python library that allows data scientists and domain experts to co-develop models that go beyond correlation and consider causal relationships. CausalNex provides a practical 'what if' library that is used to test scenarios using Bayesian Networks (BNs). It helps practitioners understand structural relationships from data and verify the accuracy of the relationships between different data sets. It also enables domain experts to fit conditional probability distributions and study the effect of potential interventions. CausalNex helps to simplify the following steps: CausalNex is a Python package.


Semiparametric Bayesian Forecasting of Spatial Earthquake Occurrences

arXiv.org Machine Learning

Self-exciting Hawkes processes are used to model events which cluster in time and space, and have been widely studied in seismology under the name of the Epidemic Type Aftershock Sequence (ETAS) model. In the ETAS framework, the occurrence of the mainshock earthquakes in a geographical region is assumed to follow an inhomogeneous spatial point process, and aftershock events are then modelled via a separate triggering kernel. Most previous studies of the ETAS model have relied on point estimates of the model parameters due to the complexity of the likelihood function, and the difficulty in estimating an appropriate mainshock distribution. In order to take estimation uncertainty into account, we instead propose a fully Bayesian formulation of the ETAS model which uses a nonparametric Dirichlet process mixture prior to capture the spatial mainshock process. Direct inference for the resulting model is problematic due to the strong correlation of the parameters for the mainshock and triggering processes, so we instead use an auxiliary latent variable routine to perform efficient inference.
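To make the self-exciting mechanism concrete, here is a minimal sketch of the conditional intensity of a purely temporal Hawkes process. It assumes a constant background rate and an exponential triggering kernel, which is a simplification chosen for illustration; the ETAS model in the abstract uses a spatial background process and its own triggering kernel, and the parameter names `mu`, `alpha`, and `beta` are not taken from the paper.

```python
import math


def hawkes_intensity(t, event_times, mu, alpha, beta):
    """Conditional intensity lambda(t) of a temporal Hawkes process.

    lambda(t) = mu + sum over past events t_i < t of
                alpha * beta * exp(-beta * (t - t_i))

    mu    : background rate (events arriving independently of history)
    alpha : expected number of events directly triggered by one event
    beta  : decay rate of the triggering effect
    """
    triggered = sum(
        alpha * beta * math.exp(-beta * (t - ti))
        for ti in event_times
        if ti < t  # only strictly earlier events contribute
    )
    return mu + triggered


# Hypothetical event history: the intensity is elevated shortly after
# a cluster of events and relaxes back toward mu as time passes.
history = [1.0, 1.2, 1.3]
for t in (1.4, 2.0, 5.0):
    print(t, hawkes_intensity(t, history, mu=0.5, alpha=0.8, beta=2.0))
```

With this kernel each event triggers `alpha` further events on average, so `alpha < 1` keeps the process from growing without bound.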


A Survey on Causal Inference

arXiv.org Artificial Intelligence

Causal inference has been a critical research topic across many domains, such as statistics, computer science, education, public policy and economics, for decades. Nowadays, estimating causal effects from observational data has become an appealing research direction owing to the large amount of available data and the low budget requirement compared with randomized controlled trials. Driven by rapid developments in machine learning, various causal effect estimation methods for observational data have sprung up. In this survey, we provide a comprehensive review of causal inference methods under the potential outcome framework, one of the well-known causal inference frameworks. The methods are divided into two categories depending on whether or not they require all three assumptions of the potential outcome framework. For each category, both the traditional statistical methods and the recent machine learning enhanced methods are discussed and compared. Plausible applications of these methods are also presented, including applications in advertising, recommendation, medicine and so on. Moreover, the commonly used benchmark datasets and the open-source code are summarized, which helps researchers and practitioners to explore, evaluate and apply the causal inference methods.


Blind Spot Detection for Safe Sim-to-Real Transfer

Journal of Artificial Intelligence Research

Agents trained in simulation may make errors when performing actions in the real world due to mismatches between training and execution environments. These mistakes can be dangerous and difficult for the agent to discover because the agent is unable to predict them a priori. In this work, we propose the use of oracle feedback to learn a predictive model of these blind spots in order to reduce costly errors in real-world applications. We focus on blind spots in reinforcement learning (RL) that occur due to incomplete state representation: when the agent lacks necessary features to represent the true state of the world, and thus cannot distinguish between numerous states. We formalize the problem of discovering blind spots in RL as a noisy supervised learning problem with class imbalance. Our system learns models for predicting blind spots within unseen regions of the state space by combining techniques for label aggregation, calibration, and supervised learning. These models take into consideration noise emerging from different forms of oracle feedback, including demonstrations and corrections. We evaluate our approach across two domains and demonstrate that it achieves higher predictive performance than baseline methods, and also that the learned model can be used to selectively query an oracle at execution time to prevent errors. We also empirically analyze the biases of various feedback types and how these biases influence the discovery of blind spots. Further, we include analyses of our approach that incorporate relaxed initial optimality assumptions. (Interestingly, relaxing the assumptions of an optimal oracle and an optimal simulator policy helped our models to perform better.) We also propose extensions to our method that are intended to improve performance when using corrections and demonstrations data.


Bayesian Networks in Healthcare: Distribution by Medical Condition

arXiv.org Artificial Intelligence

Bayesian networks (BNs) have received increasing research attention that is not matched by adoption in practice and yet have potential to significantly benefit healthcare. Hitherto, research works have not investigated the types of medical conditions being modelled with BNs, nor whether any differences exist in how and why they are applied to different conditions. This research seeks to identify and quantify the range of medical conditions for which healthcare-related BN models have been proposed, and the differences in approach between the most common medical conditions to which they have been applied. We found that almost two-thirds of all healthcare BNs are focused on four conditions: cardiac, cancer, psychological and lung disorders. We believe that a lack of understanding regarding how BNs work and what they are capable of exists, and that it is only with greater understanding and promotion that we may ever realise the full potential of BNs to effect positive change in daily healthcare practice.


Decoupling Learning Rates Using Empirical Bayes Priors

arXiv.org Machine Learning

In this work, we propose an Empirical Bayes approach to decouple the learning rates of first order and second order features (or any other feature grouping) in a Generalized Linear Model. Such needs arise in small-batch or low-traffic use-cases. As the first order features are likely to have a more pronounced effect on the outcome, focusing on learning first order weights first is likely to improve performance and convergence time. Our Empirical Bayes method clamps features in each group together and uses the observed data for the deployed model to empirically compute a hierarchical prior in hindsight. We apply our method to a standard classification setting, as well as a contextual bandit setting in an Amazon production system. Both during simulations and live experiments, our method shows marked improvements, especially in cases of small traffic. Our findings are promising, as optimizing over sparse data is often a challenge. Furthermore, our approach can be applied to any problem instance modeled as a Bayesian framework.


Multi-class Gaussian Process Classification with Noisy Inputs

arXiv.org Machine Learning

It is common practice in the supervised machine learning community to assume that the observed data are noise-free in the input attributes. Nevertheless, scenarios with input noise are common in real problems, as measurements are never perfectly accurate. If this input noise is not taken into account, a supervised machine learning method is expected to perform sub-optimally. In this paper, we focus on multi-class classification problems and use Gaussian processes (GPs) as the underlying classifier. Motivated by a dataset coming from the astrophysics domain, we hypothesize that the observed data may contain noise in the inputs. Therefore, we devise several multi-class GP classifiers that can account for input noise. Such classifiers can be efficiently trained using variational inference to approximate the posterior distribution of the latent variables of the model. Moreover, in some situations, the amount of noise can be known beforehand. If this is the case, it can be readily introduced in the proposed methods. This prior information is expected to lead to better performance. We have evaluated the proposed methods by carrying out several experiments involving synthetic and real data. These data include several datasets from the UCI repository, the MNIST dataset and a dataset coming from astrophysics. The results obtained show that, although the classification error is similar across methods, the predictive distribution of the proposed methods is better, in terms of the test log-likelihood, than the predictive distribution of a classifier based on GPs that ignores input noise.


Understanding the dynamics of message passing algorithms: a free probability heuristics

arXiv.org Machine Learning

A major task is to compute statistics of unobserved random variables using distributions of these variables conditioned on observed data. An exact computation of the corresponding expectations in the multivariate case is usually not possible except for simple cases. Hence, one has to resort to methods which approximate the necessary high-dimensional sums or integrals and which are often based on ideas of statistical physics [1]. A class of such approximation algorithms is often termed message passing. Prominent examples are belief propagation [2], which was developed for inference in probabilistic Bayesian networks with sparse couplings, and expectation propagation (EP), which is also applicable to networks with dense coupling matrices [3]. Both types of algorithms make assumptions on weak dependencies between random variables, which motivate the approximation of certain expectations by Gaussian random variables invoking central limit theorem arguments [4]. Using ideas of the statistical physics of disordered systems, such arguments can be justified for the fixed points of such algorithms for large network models where couplings are drawn from random, rotation invariant matrix distributions. This extra assumption of randomness allows for further simplifications of message passing approaches [5, 6], leading, e.g., to the approximate message passing (AMP) or VAMP algorithms; see [7, 8, 9].


Quantifying Hypothesis Space Misspecification in Learning from Human-Robot Demonstrations and Physical Corrections

arXiv.org Machine Learning

Human input has enabled autonomous systems to improve their capabilities and achieve complex behaviors that are otherwise challenging to generate automatically. Recent work focuses on how robots can use such input - like demonstrations or corrections - to learn intended objectives. These techniques assume that the human's desired objective already exists within the robot's hypothesis space. In reality, this assumption is often inaccurate: there will always be situations where the person might care about aspects of the task that the robot does not know about. Without this knowledge, the robot cannot infer the correct objective. Hence, when the robot's hypothesis space is misspecified, even methods that keep track of uncertainty over the objective fail because they reason about which hypothesis might be correct, and not whether any of the hypotheses are correct. In this paper, we posit that the robot should reason explicitly about how well it can explain human inputs given its hypothesis space and use that situational confidence to inform how it should incorporate human input. We demonstrate our method on a 7 degree-of-freedom robot manipulator in learning from two important types of human input: demonstrations of manipulation tasks, and physical corrections during the robot's task execution.