 collider


Stable Blanket with Hidden Variables and Cycles

arXiv.org Machine Learning

Stabilized regression aims to identify a set of predictors whose conditional relationship with a response variable remains invariant across different environments. Existing graphical characterizations of the stable blanket are mainly developed for structural causal models (SCMs) without hidden variables or causal cycles. However, latent variables and feedback relationships naturally arise in many applications, and they can change both the Markov blanket and the set of predictors that remain stable under interventions. This paper studies stable blankets in graphical causal models with hidden variables, causal cycles, and both features simultaneously. For models with hidden variables, we use acyclic directed mixed graphs (ADMGs) and $m$-separation to characterize the Markov blanket and to construct intervention-stable predictor sets. We introduce the notion of an intervened sub-district and use it to describe how interventions may affect districts connected to the response. For models with cycles, we work with directed graphs (DGs) and directed mixed graphs (DMGs) together with $\sigma$-separation, treating strongly connected components (SCCs) as the basic graphical units. We then combine these ideas to analyze models with both hidden variables and cycles. The main results give graphical characterizations of Markov blankets, stable frontiers, and stable blankets in these generalized settings. In particular, we identify conditions under which the response is conditionally independent of intervention variables given a suitable predictor set, and we describe when such sets are minimal or unique. These results extend the graphical interpretation of stabilized regression beyond acyclic fully observed models.


Appendix A Removable Variables

Neural Information Processing Systems

In this section, we first prove the proposed graphical representation for a removable variable in a MAG $\mathcal{M}$ (Theorem 1). Then, we discuss how this representation reduces to Theorem 5 of [11] in the case of DAGs. Throughout our proofs, we say a path between $X$ and $Y$ is blocked by a set $\mathbf{W}$ if it is not $m$-connecting relative to $\mathbf{W}$. In this case, there exists a non-collider $W$ on the path which is a member of $\mathbf{W}$, or there exists a collider $W$ on the path such that $W \notin \mathrm{Anc}(\{X, Y\} \cup \mathbf{W})$. In both cases we say $W$ blocks this path with respect to $\mathbf{W}$, or $W$ blocks the path in short when $\mathbf{W}$ is clear from the context. We say $X$ is a descendant of $Y$ if $Y \in \mathrm{Anc}(X)$, and we denote by $\mathrm{De}_{\mathcal{M}}(X)$ the set of descendants of $X$ in the MAG $\mathcal{M}$, and $\mathrm{De}(X)$ whenever the graph is clear from the context.

A.1 Graphical representation

Theorem 1. Vertex $X$ is removable in a MAG $\mathcal{M}$ over the variables $\mathbf{V}$ if and only if 1. for any $Y \in \mathrm{Adj}(X)$ and $Z \in \mathrm{Ch}(X) \cup \mathrm{N}(X) \setminus \{Y\}$, $Y$ and $Z$ are adjacent, and 2. for any collider path $u = (X, V_1, \dots, V_m, Y)$ and $Z \in \mathbf{V} \setminus \{X, Y, V_1, \dots, V_m\}$ such that $\{X, V_1, \dots, V_m\} \subseteq \mathrm{Pa}(Z)$, $Y$ and $Z$ are adjacent.

Let $\mathcal{H}$ denote the induced subgraph of $\mathcal{M}$ over $\mathbf{V} \setminus \{X\}$. For any $\mathbf{W} \subseteq \mathbf{V} \setminus \{X, Y, Z\}$, $(Z, X, Y)$ is an $m$-connecting path relative to $\mathbf{W}$ in $\mathcal{M}$, as $X$ is a non-collider and $X \notin \mathbf{W}$. That is, no such $\mathbf{W}$ can $m$-separate $Y$ and $Z$. Since $X$ is removable in $\mathcal{M}$, by definition of removability, $(Y \perp Z \mid \mathbf{W})_{\mathcal{M}} \iff (Y \perp Z \mid \mathbf{W})_{\mathcal{H}}$. Again for any $\mathbf{W} \subseteq \mathbf{V} \setminus \{X, Y, Z\}$, $(Z, X, V_1, \dots, V_m, Y)$ is an $m$-connecting path relative to $\mathbf{W}$ in $\mathcal{M}$ since I) every collider on this path is a parent (and therefore an ancestor) of $Z$, and II) $X \notin \mathbf{W}$ and $X$ is the only non-collider on this path. That is, no such $\mathbf{W}$ can $m$-separate $Y$ and $Z$. Since $X$ is removable in $\mathcal{M}$, Equation 8 implies that $Y$ and $Z$ have no $m$-separating sets in $\mathcal{H}$. Hence, $Y$ is adjacent to $Z$ in $\mathcal{H}$, and therefore, in $\mathcal{M}$. If part: We need to prove that for any $Y, Z \in \mathbf{V} \setminus \{X\}$ and any $\mathbf{W} \subseteq \mathbf{V} \setminus \{X, Y, Z\}$, $(Y \perp Z \mid \mathbf{W})_{\mathcal{M}} \iff (Y \perp Z \mid \mathbf{W})_{\mathcal{H}}$.
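The blocking criterion stated in this excerpt can be illustrated with a small Python sketch. It is not taken from the paper: the graph encoding (a set of directed pairs plus a set of bidirected pairs, with undirected edges omitted) and the names `ancestors`, `has_arrowhead_at`, and `is_blocked` are assumptions made for illustration. The function does not check that the given vertex sequence is an actual path in the graph.

```python
from typing import FrozenSet, Sequence, Set, Tuple

Directed = Set[Tuple[str, str]]      # (a, b) encodes a -> b (hypothetical encoding)
Bidirected = Set[FrozenSet[str]]     # frozenset({a, b}) encodes a <-> b


def ancestors(directed: Directed, targets: Set[str]) -> Set[str]:
    """Vertices with a directed path to some target; targets are included."""
    result = set(targets)
    stack = list(targets)
    while stack:
        node = stack.pop()
        for a, b in directed:
            if b == node and a not in result:
                result.add(a)
                stack.append(a)
    return result


def has_arrowhead_at(directed: Directed, bidirected: Bidirected,
                     u: str, v: str) -> bool:
    """True if the edge between u and v has an arrowhead at v."""
    return (u, v) in directed or frozenset((u, v)) in bidirected


def is_blocked(path: Sequence[str], cond: Set[str],
               directed: Directed, bidirected: Bidirected) -> bool:
    """Apply the excerpt's criterion: the path is blocked by `cond` if some
    non-collider on it is in `cond`, or some collider on it is not an
    ancestor of {endpoints} union `cond`."""
    anc = ancestors(directed, {path[0], path[-1]} | set(cond))
    for i in range(1, len(path) - 1):
        prev, node, nxt = path[i - 1], path[i], path[i + 1]
        collider = (has_arrowhead_at(directed, bidirected, prev, node)
                    and has_arrowhead_at(directed, bidirected, nxt, node))
        if collider and node not in anc:
            return True
        if not collider and node in cond:
            return True
    return False


# Fork Z <- X -> Y: X is a non-collider, so the path is open unless X is conditioned on.
fork = {("X", "Z"), ("X", "Y")}
print(is_blocked(["Z", "X", "Y"], set(), fork, set()))
print(is_blocked(["Z", "X", "Y"], {"X"}, fork, set()))
```

The fork example mirrors the step in the proof where $(Z, X, Y)$ is $m$-connecting relative to any $\mathbf{W}$ that excludes $X$.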


Complete Causal Identification from Ancestral Graphs under Selection Bias

arXiv.org Machine Learning

Many causal discovery algorithms, including the celebrated FCI algorithm, output a Partial Ancestral Graph (PAG). PAGs serve as an abstract graphical representation of the underlying causal structure, modeled by directed acyclic graphs with latent and selection variables. This paper develops a characterization of the set of extended-type conditional independence relations that are invariant across all causal models represented by a PAG. This theory allows us to formulate a general measure-theoretic version of Pearl's causal calculus and a sound and complete identification algorithm for PAGs under selection bias. Our results also apply when PAGs are learned by certain algorithms that integrate observational data with experimental data and incorporate background knowledge.


On the Number of Conditional Independence Tests in Constraint-based Causal Discovery

arXiv.org Machine Learning

Learning causal relations from observational data is a fundamental problem with wide-ranging applications across many fields. Constraint-based methods infer the underlying causal structure by performing conditional independence tests. However, existing algorithms such as the prominent PC algorithm need to perform a large number of independence tests, which in the worst case is exponential in the maximum degree of the causal graph. Despite extensive research, it remains unclear if there exist algorithms with better complexity without additional assumptions. Here, we establish an algorithm that achieves a better complexity of $p^{\mathcal{O}(s)}$ tests, where $p$ is the number of nodes in the graph and $s$ denotes the maximum undirected clique size of the underlying essential graph. Complementing this result, we prove that any constraint-based algorithm must perform at least $2^{\Omega(s)}$ conditional independence tests, establishing that our proposed algorithm achieves exponent-optimality up to a logarithmic factor in terms of the number of conditional independence tests needed. Finally, we validate our theoretical findings through simulations, on semi-synthetic gene-expression data, and real-world data, demonstrating the efficiency of our algorithm compared to existing methods in terms of the number of conditional independence tests needed.
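The phrase "exponent-optimality up to a logarithmic factor" can be unpacked by writing both bounds with base 2. The short derivation below is a reading of the abstract's claim, not a reproduction of the paper's proof; the constants $c_1$ and $c_2$ are hypothetical placeholders for the constants hidden in the $\mathcal{O}$ and $\Omega$ notation.

```latex
% Upper bound (proposed algorithm), rewritten with base 2:
p^{\mathcal{O}(s)} \;\le\; p^{c_1 s} \;=\; 2^{c_1 s \log_2 p}
% Lower bound (any constraint-based algorithm):
2^{\Omega(s)} \;\ge\; 2^{c_2 s}
% Ratio of the two exponents:
\frac{c_1 s \log_2 p}{c_2 s} \;=\; \frac{c_1}{c_2}\,\log_2 p \;=\; \mathcal{O}(\log p)
```

Under this reading, the exponents of the upper and lower bounds differ only by a factor of order $\log p$.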



A Broader impact

Neural Information Processing Systems

It is essential to approach the interpretation of our algorithm's results with caution and subject them to critical evaluation. In this section, we provide the definition of partial ancestral graphs (PAGs). A PAG shares the same adjacencies as any MAG in the observational equivalence class of MAGs. Section 2. For any v W, let G In this section, we derive the causal effect for the SMCM in Figure 3(top), i.e., (6), as well as prove D.1 Proof of (6) First, using the law of total probability, we have P(y | do(t = t)) = null Rule 3a, (c) follows from Rule 1, and (g) follows from Rule 2. D.2 Proof of Theorem 3.1 Lemma 1. Suppose Assumptions 1 to 3 hold. Given this claim, Theorem 3.1 follows from Tian and Pearl [2002, Theorem 4].