Causal graphs, such as directed acyclic graphs (DAGs) and partial ancestral graphs (PAGs), represent causal relationships among variables in a model. Methods exist for learning DAGs and PAGs from data and for converting DAGs to PAGs. However, these methods only output a single causal graph consistent with the independencies/dependencies (the Markov equivalence class M) estimated from the data. However, many distinct graphs may be consistent with M, and a data modeler may wish to select among these using domain knowledge. In this paper, we present a method that makes this possible. We introduce PAG2ADMG, the first method for enumerating all causal graphs consistent with M, under certain assumptions. PAG2ADMG converts a given PAG into a set of acyclic directed mixed graphs (ADMGs). We prove the correctness of the approach and demonstrate its efficiency relative to brute-force enumeration.
This work addresses the following question: Under what assumptions on the data generating process can one infer the causal graph from the joint distribution? The approach taken by conditional independence-based causal discovery methods is based on two assumptions: the Markov condition and faithfulness. It has been shown that under these assumptions the causal graph can be identified up to Markov equivalence (some arrows remain undirected) using methods like the PC algorithm. In this work we propose an alternative by defining Identifiable Functional Model Classes (IFMOCs). As our main theorem we prove that if the data generating process belongs to an IFMOC, one can identify the complete causal graph. To the best of our knowledge this is the first identifiability result of this kind that is not limited to linear functional relationships. We discuss how the IFMOC assumption and the Markov and faithfulness assumptions relate to each other and explain why we believe that the IFMOC assumption can be tested more easily on given data. We further provide a practical algorithm that recovers the causal graph from finitely many data; experiments on simulated data support the theoretical findings.
Graphical Markov models use graphs, either undirected, directed, or mixed, to represent possible dependences among statistical variables. Applications of undirected graphs (UDGs) include models for spatial dependence and image analysis, while acyclic directed graphs (ADGs), which are especially convenient for statistical analysis, arise in such fields as genetics and psychometrics and as models for expert systems and Bayesian belief networks. Lauritzen, Wermuth and Frydenberg (LWF) introduced a Markov property for chain graphs, which are mixed graphs that can be used to represent simultaneously both causal and associative dependencies and which include both UDGs and ADGs as special cases. In this paper an alternative Markov property (AMP) for chain graphs is introduced, which in some ways is a more direct extension of the ADG Markov property than is the LWF property for chain graph.
We offer a complete characterization of the set of distributions that could be induced by local interventions on variables governed by a causal Bayesian network. We show that such distributions must adhere to three norms of coherence, and we demonstrate the use of these norms as inferential tools in tasks of learning and identification. Testable coherence norms are subsequently derived for networks containing unmeasured variables.
In this paper, we address the problem of learning the structure of Gaussian chain graph models in a high-dimensional space. Chain graph models are generalizations of undirected and directed graphical models that contain a mixed set of directed and undirected edges. While the problem of sparse structure learning has been studied extensively for Gaussian graphical models and more recently for conditional Gaussian graphical models (CGGMs), there has been little previous work on the structure recovery of Gaussian chain graph models. We consider linear regression models and a re-parameterization of the linear regression models using CGGMs as building blocks of chain graph models. We argue that when the goal is to recover model structures, there are many advantages of using CGGMs as chain component models over linear regression models, including convexity of the optimization problem, computational efficiency, recovery of structured sparsity, and ability to leverage the model structure for semi-supervised learning. We demonstrate our approach on simulated and genomic datasets.