Goto

Collaborating Authors

 pc-algorithm


Metrics on Markov Equivalence Classes for Evaluating Causal Discovery Algorithms

arXiv.org Machine Learning

Many state-of-the-art causal discovery methods aim to generate an output graph that encodes the graphical separation and connection statements of the causal graph that underlies the data-generating process. In this work, we argue that an evaluation of a causal discovery method against synthetic data should include an analysis of how well this explicit goal is achieved by measuring how closely the separations/connections of the method's output align with those of the ground truth. We show that established evaluation measures do not accurately capture the difference in separations/connections of two causal graphs, and we introduce three new measures of distance called s/c-distance, Markov distance and Faithfulness distance that address this shortcoming. We complement our theoretical analysis with toy examples, empirical experiments and pseudocode.


Shapley-PC: Constraint-based Causal Structure Learning with Shapley Values

arXiv.org Artificial Intelligence

Causal Structure Learning (CSL), amounting to extracting causal relations among the variables in a dataset, is widely perceived as an important step towards robust and transparent models. Constraint-based CSL leverages conditional independence tests to perform causal discovery. We propose Shapley-PC, a novel method to improve constraint-based CSL algorithms by using Shapley values over the possible conditioning sets to decide which variables are responsible for the observed conditional (in)dependences. We prove soundness and asymptotic consistency and demonstrate that it can outperform state-of-the-art constraint-based, search-based and functional causal model-based methods, according to standard metrics in CSL.


The Reduced PC-Algorithm: Improved Causal Structure Learning in Large Random Networks

arXiv.org Machine Learning

Directed acyclic graphs, or DAGs, are commonly used to repre sent causal relationships in complex biological systems. For example, in gene regulatory ne tworks, directed edges represent regulatory interactions among genes, which are represente d as nodes of the graph. While causal effects in biological networks can be accurately inferred fro m perturbation experiments [33]-- including single or double gene knockouts [30, 42]--these ar e costly to run. Estimating DAGs from observational data is thus an important exploratory ta sk for generating causal hypotheses [10, 15], and designing more efficient experiments. Since the number of possible directed graphs grows super-ex ponentially in the number of nodes, estimation of DAGs is an NPhard problem [6]. Methods of estimating DAGs from observational data can be broadly categorized into three cl asses. The first class, score-based methods, search over the space of all possible graphs, and at tempt to maximize a goodness-of-fit score, generally using a greedy algorithm.


Penalized Likelihood Methods for Estimation of Sparse High Dimensional Directed Acyclic Graphs

arXiv.org Machine Learning

Directed acyclic graphs (DAGs) are commonly used to represent causal relationships among random variables in graphical models. Applications of these models arise in the study of physical, as well as biological systems, where directed edges between nodes represent the influence of components of the system on each other. The general problem of estimating DAGs from observed data is computationally NP-hard, Moreover two directed graphs may be observationally equivalent. When the nodes exhibit a natural ordering, the problem of estimating directed graphs reduces to the problem of estimating the structure of the network. In this paper, we propose a penalized likelihood approach that directly estimates the adjacency matrix of DAGs. Both lasso and adaptive lasso penalties are considered and an efficient algorithm is proposed for estimation of high dimensional DAGs. We study variable selection consistency of the two penalties when the number of variables grows to infinity with the sample size. We show that although lasso can only consistently estimate the true network under stringent assumptions, adaptive lasso achieves this task under mild regularity conditions. The performance of the proposed methods is compared to alternative methods in simulated, as well as real, data examples.