The implication problem of probabilistic conditional independencies is investigated in the presence of missing data. Here, graph separation axioms fail to hold for saturated conditional independencies, unlike the known idealized case with no missing data. Several axiomatic, algorithmic, and logical characterizations of the implication problem for saturated conditional independencies are established. In particular, equivalences are shown to the implication problem of a propositional fragment under Levesque's situations, and that of Lien's class of multivalued database dependencies under null values.
The construction of causal graphs from non-experimental data rests on a set of constraints that the graph structure imposes on all probability distributions compatible with the graph. These constraints are of two types: conditional independencies and algebraic constraints, first noted by Verma. While conditional independencies are well studied and frequently used in causal induction algorithms, Verma constraints are still poorly understood, and rarely applied. In this paper we examine a special subset of Verma constraints which are easy to understand, easy to identify and easy to apply; they arise from "dormant independencies," namely, conditional independencies that hold in interventional distributions. We give a complete algorithm for determining if a dormant independence between two sets of variables is entailed by the causal graph, such that this independence is identifiable, in other words if it resides in an interventional distribution that can be predicted without resorting to interventions. We further show the usefulness of dormant independencies in model testing and induction by giving an algorithm that uses constraints entailed by dormant independencies to prune extraneous edges from a given causal graph.
Conditional independence structures describe independencies of one set of variables from another set of variables conditioned upon a third set of variables. These structures are invaluable means for compact representations of knowledge because independencies can be exploited for useful factorizations. Conditional independence structures appear in different disguise in various areas of knowledge representation, be it the conditional independence of sets of random variables in probabilistic graphical models such as Bayesian networks or as conditional functions related to belief revision, or as in- dependencies induced by (embedded) multivalued dependencies in data bases. This paper investigates conditional independencies for Boolean functions using Fourier analysis. We define three notions of independence based on the notion of influence of a variable on a function and draw connections to multivalued dependencies.
Bayesian networks, and especially their structures, are powerful tools for representing conditional independencies and dependencies between random variables. In applications where related variables form a priori known groups, chosen to represent different "views" to or aspects of the same entities, one may be more interested in modeling dependencies between groups of variables rather than between individual variables. Motivated by this, we study prospects of representing relationships between variable groups using Bayesian network structures. We show that for dependency structures between groups to be expressible exactly, the data have to satisfy the so-called groupwise faithfulness assumption. We also show that one cannot learn causal relations between groups using only groupwise conditional independencies, but also variable-wise relations are needed. Additionally, we present algorithms for finding the groupwise dependency structures.
Methods based on additive-noise (AN) models have been proposed to further discriminate between causal structures that are equivalent in terms of conditional independencies. These methods rely on a particular form of the generative functional equations, with an additive noise structure, which allows inferring the directionality of causation by testing the independence between the residuals of a nonlinear regression and the predictors (nrr-independencies). Full causal structure identifiability has been proven for systems that contain only additive-noise equations and have no hidden variables. We extend the AN framework in several ways. We introduce alternative regression-free tests of independence based on conditional variances (cv-independencies). We consider conditionally-additive-noise (CAN) models, in which the equations may have the AN form only after conditioning. We exploit asymmetries in nrr-independencies or cv-independencies resulting from the CAN form to derive a criterion that infers the causal relation between a pair of variables in a multivariate system without any assumption about the form of the equations or the presence of hidden variables.