Pearl, Judea
Transportability from Multiple Environments with Limited Experiments
Bareinboim, Elias, Lee, Sanghack, Honavar, Vasant, Pearl, Judea
This paper considers the problem of transferring experimental findings learned from multiple heterogeneous domains to a target environment, in which only limited experiments can be performed. We reduce questions of transportability from multiple domains and with limited scope to symbolic derivations in the do-calculus, thus extending the treatment of transportability from full experiments introduced in Pearl and Bareinboim (2011). We further provide different graphical and algorithmic conditions for computing the transport formula for this setting, that is, a way of fusing the observational and experimental information scattered throughout different domains to synthesize a consistent estimate of the desired effects.
Graphical Models for Inference with Missing Data
Mohan, Karthika, Pearl, Judea, Tian, Jin
We address the problem of deciding whether there exists a consistent estimator of a given relation Q, when data are missing not at random. We employ a formal representation called `Missingness Graphs' to explicitly portray the causal mechanisms responsible for missingness and to encode dependencies between these mechanisms and the variables being measured. Using this representation, we define the notion of \textit{recoverability} which ensures that, for a given missingness-graph $G$ and a given query $Q$ an algorithm exists such that in the limit of large samples, it produces an estimate of $Q$ \textit{as if} no data were missing. We further present conditions that the graph should satisfy in order for recoverability to hold and devise algorithms to detect the presence of these conditions.
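As a small illustration of recoverability (a sketch under stated assumptions, not an example taken from the abstract): suppose $X$ is fully observed, $Y$ is partially observed, and the missingness indicator $R_y$ (with $R_y = 0$ meaning that $Y$ is observed) is independent of $Y$ given $X$. The proxy name $Y^*$ and the coding of $R_y$ are notational assumptions. The joint distribution is then recoverable from the available data:

```latex
\begin{align*}
P(X, Y) &= P(Y \mid X)\, P(X) \\
        &= P(Y \mid X, R_y = 0)\, P(X)
           && \text{since } Y \perp R_y \mid X \\
        &= P(Y^* \mid X, R_y = 0)\, P(X)
           && \text{since } Y^* = Y \text{ whenever } R_y = 0
\end{align*}
```

Every term on the last line is defined over observed quantities only, which is what makes the query estimable as if no data were missing.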
A General Algorithm for Deciding Transportability of Experimental Results
Bareinboim, Elias, Pearl, Judea
Generalizing empirical findings to new environments, settings, or populations is essential in most scientific explorations. This article treats a particular problem of generalizability, called "transportability", defined as a license to transfer information learned in experimental studies to a different population, on which only observational studies can be conducted. Given a set of assumptions concerning commonalities and differences between the two populations, Pearl and Bareinboim (2011) derived sufficient conditions that permit such transfer to take place. This article summarizes their findings and supplements them with an effective procedure for deciding when and how transportability is feasible. It establishes a necessary and sufficient condition for deciding when causal effects in the target population are estimable from both the statistical information available and the causal information transferred from the experiments. The article further provides a complete algorithm for computing the transport formula, that is, a way of combining observational and experimental information to synthesize a bias-free estimate of the desired causal relation. Finally, the article examines the differences between transportability and other variants of generalizability.
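To make the notion of a transport formula concrete, here is one simple instance, given as a sketch under stated assumptions rather than as the general result of the article. Let $P$ denote the source population, $P^*$ the target population, and suppose a set of covariates $Z$ accounts for every difference between the two populations that matters for the effect of $X$ on $Y$ (the symbols $P^*$ and $Z$ are notational assumptions here). The source experiment can then be re-weighted by the target's observational distribution of $Z$:

```latex
\[
P^*(y \mid do(x)) \;=\; \sum_{z} P(y \mid do(x), z)\, P^*(z)
\]
```

The first factor comes from the experimental study in the source population, the second from observational data in the target; the formula fuses the two.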
Causal Transportability with Limited Experiments
Bareinboim, Elias (University of California, Los Angeles) | Pearl, Judea (University of California, Los Angeles)
We address the problem of transferring causal knowledge learned in one environment to another, potentially different environment, when only limited experiments may be conducted at the source. This generalizes the treatment of transportability introduced in [Pearl and Bareinboim, 2011; Bareinboim and Pearl, 2012b], which deals with transferring causal information when any experiment can be conducted at the source. Given that it is not always feasible to conduct certain controlled experiments, we consider the decision problem of whether experiments on a selected subset Z of variables, together with qualitative assumptions encoded in a diagram, may render causal effects in the target environment computable from the available data. This problem, which we call z-transportability, reduces to ordinary transportability when Z is all-inclusive, and, like the latter, can be given a syntactic characterization using the do-calculus [Pearl, 1995; 2000]. This paper establishes a necessary and sufficient condition for causal effects in the target domain to be estimable from both the non-experimental information available and the limited experimental information transferred from the source. We further provide a complete algorithm for computing the transport formula, that is, a way of fusing experimental and observational information to synthesize an unbiased estimate of the desired causal relation.
The Recovery of Causal Poly-Trees from Statistical Data
Rebane, George, Pearl, Judea
Poly-trees are singly connected causal networks in which variables may arise from multiple causes. This paper develops a method of recovering poly-trees from empirically measured probability distributions of pairs of variables. The method guarantees that, if the measured distributions are generated by a causal process structured as a poly-tree, then the topological structure of such a tree can be recovered precisely and, in addition, the causal directionality of the branches can be determined up to the maximum extent possible. The method also pinpoints the minimum (if any) external semantics required to determine the causal relationships among the variables considered.
Learning Link-Probabilities in Causal Trees
Roizer, Igor, Pearl, Judea
A learning algorithm is presented which, given the structure of a causal tree, will estimate its link probabilities by sequential measurements on the leaves only. Internal nodes of the tree represent conceptual (hidden) variables inaccessible to observation. The method described is incremental, local, efficient, and remains robust to measurement imprecisions.
Do We Need Higher-Order Probabilities and, If So, What Do They Mean?
Pearl, Judea
The apparent failure of individual probabilistic expressions to distinguish uncertainty about truths from uncertainty about probabilistic assessments has prompted researchers to seek formalisms where the two types of uncertainties are given notational distinction. This paper demonstrates that the desired distinction is already a built-in feature of classical probabilistic models; thus, specialized notations are unnecessary.
On the Equivalence of Causal Models
Verma, Tom S., Pearl, Judea
Scientists often use directed acyclic graphs (dags) to model the qualitative structure of causal theories, allowing the parameters to be estimated from observational data. Two causal models are equivalent if there is no experiment which could distinguish one from the other. A canonical representation for causal models is presented which yields an efficient graphical criterion for deciding equivalence, and provides a theoretical basis for extracting causal structures from empirical data. This representation is then extended to the more general case of an embedded causal model, that is, a dag in which only a subset of the variables are observable. The canonical representation presented here yields an efficient algorithm for determining when two embedded causal models reflect the same dependency information. This algorithm leads to a model theoretic definition of causation in terms of statistical dependencies.
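For fully observed dags, the well-known graphical criterion for equivalence is that two dags share the same skeleton (undirected links) and the same v-structures (converging arrows a -> c <- b whose tails a and b are not adjacent). The following is a minimal sketch of that check, not the paper's algorithm for embedded models. The representation (a dict mapping each node to a list of its parents) and all function names are assumptions chosen for illustration, and both graphs are assumed to be acyclic and defined over the same variables.

```python
from itertools import combinations


def skeleton(parents):
    """Undirected edge set of a dag given as {child: [parents]}."""
    return {frozenset((p, c)) for c, ps in parents.items() for p in ps}


def v_structures(parents):
    """Triples (a, c, b) with a -> c <- b where a and b are not adjacent."""
    skel = skeleton(parents)
    found = set()
    for child, ps in parents.items():
        # sorted() gives each unordered parent pair one canonical form
        for a, b in combinations(sorted(ps), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found


def equivalent(g1, g2):
    """Same skeleton and same v-structures (hypothetical helper)."""
    return skeleton(g1) == skeleton(g2) and v_structures(g1) == v_structures(g2)


# A -> B -> C and its reversal encode the same independencies;
# the collider A -> B <- C does not.
chain = {"B": ["A"], "C": ["B"]}
reversed_chain = {"B": ["C"], "A": ["B"]}
collider = {"B": ["A", "C"]}
print(equivalent(chain, reversed_chain))
print(equivalent(chain, collider))
```

The two chains differ only in edge direction and contain no v-structure, so they pass the check; the collider has the same skeleton but introduces a v-structure at B, so it fails.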
Structuring Causal Tree Models with Continuous Variables
Xu, Lei, Pearl, Judea
This paper considers the problem of invoking auxiliary, unobservable variables to facilitate the structuring of causal tree models for a given set of continuous variables. Paralleling the treatment of bi-valued variables in [Pearl 1986], we show that if a collection of coupled variables are governed by a joint normal distribution and a tree-structured representation exists, then both the topology and all internal relationships of the tree can be uncovered by observing pairwise dependencies among the observed variables (i.e., the leaves of the tree). Furthermore, the conditions for normally distributed variables are less restrictive than those governing bi-valued variables. The result extends the applications of causal tree models which were found useful in evidential reasoning tasks.
d-Separation: From Theorems to Algorithms
Geiger, Dan, Verma, Tom S., Pearl, Judea
An efficient algorithm is developed that identifies all independencies implied by the topology of a Bayesian network. Its correctness and maximality stem from the soundness and completeness of d-separation with respect to probability theory. The algorithm runs in time O(|E|), where E is the set of edges in the network.
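The idea of finding every node d-separated from a source node by a linear-time graph traversal can be sketched as follows. This is a minimal illustration in the style of the standard active-trail reachability procedure, not a transcription of the paper's algorithm; the graph representation ({child: [parents]}) and the function name are assumptions, and the source node x is assumed not to belong to the conditioning set z.

```python
from collections import deque


def d_separated_nodes(parents, x, z):
    """Nodes d-separated from x given the set z in a dag {child: [parents]}."""
    nodes = set(parents)
    for ps in parents.values():
        nodes |= set(ps)
    children = {n: set() for n in nodes}
    for child, ps in parents.items():
        for p in ps:
            children[p].add(child)
    z = set(z)

    # Phase 1: z together with all ancestors of z. A collider is open
    # exactly when it belongs to this set.
    anc, stack = set(), list(z)
    while stack:
        n = stack.pop()
        if n not in anc:
            anc.add(n)
            stack.extend(parents.get(n, ()))

    # Phase 2: breadth-first search over (node, direction) pairs.
    # "up" means the trail reached the node from one of its children,
    # "down" means it reached the node from one of its parents.
    visited, reachable = set(), set()
    queue = deque([(x, "up")])
    while queue:
        node, direction = queue.popleft()
        if (node, direction) in visited:
            continue
        visited.add((node, direction))
        if node not in z:
            reachable.add(node)
        if direction == "up" and node not in z:
            # Unobserved node entered from a child: trail continues both ways.
            queue.extend((p, "up") for p in parents.get(node, ()))
            queue.extend((c, "down") for c in children[node])
        elif direction == "down":
            if node not in z:
                # Chain through an unobserved node.
                queue.extend((c, "down") for c in children[node])
            if node in anc:
                # Collider that is observed or has an observed descendant.
                queue.extend((p, "up") for p in parents.get(node, ()))
    return nodes - reachable - z


# Collider A -> C <- B with C -> D: A and B are separated marginally,
# but conditioning on D (a descendant of the collider) connects them.
g = {"C": ["A", "B"], "D": ["C"]}
print(d_separated_nodes(g, "A", []))
print(d_separated_nodes(g, "A", ["D"]))
```

Each (node, direction) pair is processed at most once and each edge is scanned a bounded number of times, so the traversal is linear in the size of the graph.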