Causal Inference Using Machine Learning

#artificialintelligence

Finding causal relationships is among the most profound motivations behind research across disciplines. However, our toolbox of computational methods for causal inference has remained limited for a long time. Better computational methods for causal inference could change the way we conduct research and have the potential to revolutionize science by expediting discoveries and cutting costs.


Bayesian Nonparametric Causal Inference: Information Rates and Learning Algorithms

arXiv.org Machine Learning

We investigate the problem of estimating the causal effect of a treatment on individual subjects from observational data; this is a central problem in various application domains, including healthcare, social sciences, and online advertising. Within the Neyman–Rubin potential outcomes model, we use the Kullback–Leibler (KL) divergence between the estimated and true distributions as a measure of the accuracy of the estimate, and we define the information rate of the Bayesian causal inference procedure as the (asymptotic equivalence class of the) expected KL divergence as a function of the number of samples. Using Fano's method, we establish a fundamental limit on the information rate achievable by any Bayesian estimator, and show that this limit is independent of the selection bias in the observational data. We characterize the Bayesian priors on the potential (factual and counterfactual) outcomes that achieve the optimal information rate. As a consequence, we show that a particular class of priors widely used in the causal inference literature cannot achieve the optimal information rate, whereas a broader class of priors can. We go on to propose a prior adaptation procedure (which we call the information-based empirical Bayes procedure) that optimizes the Bayesian prior by maximizing an information-theoretic criterion on the recovered causal effects rather than the marginal likelihood of the observed (factual) data. Building on our analysis, we construct an information-optimal Bayesian causal inference algorithm.
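For concreteness, here is a minimal rendering of the quantities the abstract refers to, in assumed notation (the paper's own symbols may differ):

\[
  T(x) = \mathbb{E}\left[\, Y^{(1)} - Y^{(0)} \mid X = x \,\right]
  \quad \text{(individual-level effect in the Neyman--Rubin model)},
\]
\[
  \mathcal{I}_n = \mathbb{E}\left[\, D_{\mathrm{KL}}\big( P \,\|\, \widehat{P}_n \big) \,\right],
\]

where $P$ is the true distribution over causal effects, $\widehat{P}_n$ is the Bayesian estimate after $n$ observational samples, and the information rate is the asymptotic equivalence class of $\mathcal{I}_n$ as $n \to \infty$.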


Causal programming: inference with structural causal models as finding instances of a relation

arXiv.org Artificial Intelligence

This paper proposes a causal inference relation and causal programming as general frameworks for causal inference with structural causal models. A tuple, $\langle M, I, Q, F \rangle$, is an instance of the relation if a formula, $F$, computes a causal query, $Q$, as a function of known population probabilities, $I$, in every model entailed by a set of model assumptions, $M$. Many problems in causal inference can be viewed as enumerating instances of the relation that satisfy given criteria. This unifies a number of previously studied problems, including causal effect identification, causal discovery, and recovery from selection bias. In addition, the relation supports formalizing new problems in causal inference with structural causal models, such as the problem of research design. Causal programming is then proposed as a further generalization: the problem of finding instances of the relation that are optimal with respect to a cost function.
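To make the relation concrete, here is a sketch of one instance built from the classical back-door adjustment; the specific tuple is illustrative, not taken from the paper. Take $M$ to be the assumption that the data come from a structural causal model compatible with the DAG $Z \to X$, $Z \to Y$, $X \to Y$ with $Z$ observed; $I$ the observational distribution $P(x, y, z)$; $Q$ the interventional query $P(y \mid \mathrm{do}(x))$; and $F$ the adjustment formula

\[
  P\big(y \mid \mathrm{do}(x)\big) = \sum_{z} P(y \mid x, z)\, P(z).
\]

Since $F$ computes $Q$ from $I$ in every model entailed by $M$, the tuple $\langle M, I, Q, F \rangle$ is an instance of the causal inference relation.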


A Bayesian Approach for Inferring Local Causal Structure in Gene Regulatory Networks

arXiv.org Machine Learning

Gene regulatory networks play a crucial role in controlling an organism's biological processes, which is why there is significant interest in developing computational methods that can extract their structure from high-throughput genetic data. A typical approach consists of a series of conditional independence tests on the covariance structure, intended to progressively reduce the space of possible causal models. We propose a novel, efficient Bayesian method for discovering the local causal relationships among triplets of (normally distributed) variables. In our approach, we score the patterns in the covariance matrix in one go and incorporate the available background knowledge in the form of priors over causal structures. Our method is flexible in that it allows for different types of causal structures and assumptions. We apply the approach to inferring gene regulatory networks by learning regulatory relationships between gene expression levels. We show that our algorithm produces stable and conservative posterior probability estimates over local causal structures, which can be used to derive an honest ranking of the most meaningful regulatory relationships. We demonstrate the stability and efficacy of our method on both simulated data and real-world data from an experiment on yeast.
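As a rough illustration of Bayesian scoring of local causal structures among triplets, here is a minimal, self-contained sketch that compares a few candidate triplet DAGs under linear-Gaussian assumptions, using BIC-based approximate marginal likelihoods and a uniform structure prior. This is a standard model-selection recipe, not the paper's actual scoring rule, and every name in it is hypothetical.

import numpy as np

def node_loglik(y, X):
    # Maximized Gaussian log-likelihood of y regressed on X (plus intercept).
    n = y.shape[0]
    X1 = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    sigma2 = resid @ resid / n
    return -0.5 * n * (np.log(2 * np.pi * sigma2) + 1.0)

def bic_score(data, dag):
    # BIC of a DAG given as {node_index: tuple_of_parent_indices}.
    # The linear-Gaussian log-likelihood decomposes over nodes.
    n = data.shape[0]
    score = 0.0
    for node, parents in dag.items():
        k = len(parents) + 2  # regression weights + intercept + variance
        score += node_loglik(data[:, node], data[:, list(parents)])
        score -= 0.5 * k * np.log(n)
    return score

# Candidate local structures over a triplet (X=0, Y=1, Z=2). Note that the
# chain and the fork are Markov equivalent, so their scores should tie.
candidates = {
    "chain    X -> Y -> Z": {0: (), 1: (0,), 2: (1,)},
    "fork     X <- Y -> Z": {1: (), 0: (1,), 2: (1,)},
    "collider X -> Y <- Z": {0: (), 2: (), 1: (0, 2)},
    "empty    X    Y    Z": {0: (), 1: (), 2: ()},
}

# Simulate data from a chain X -> Y -> Z and score every candidate.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = 0.8 * x + rng.normal(size=500)
z = 0.8 * y + rng.normal(size=500)
data = np.column_stack([x, y, z])

logs = np.array([bic_score(data, dag) for dag in candidates.values()])
post = np.exp(logs - logs.max())
post /= post.sum()  # posterior over structures under a uniform prior
for name, p in zip(candidates, post):
    print(f"{name}: posterior ~ {p:.3f}")

On data generated from the chain, the chain and fork split most of the posterior mass roughly evenly (being Markov equivalent, they are indistinguishable from observational data), while the collider and empty structures score near zero. This is one reason methods in this space report rankings or posterior probabilities over local structures rather than committing to a single DAG.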