We focus on the discovery and identification of direct causes and effects of a target variable in a causal network. State-of-the-art causal learning algorithms generally need to find the global causal structure, in the form of complete partial directed acyclic graphs (CPDAG), in order to identify the direct causes and effects of a target variable. While these algorithms are effective, it is often unnecessary and wasteful to find the global structure when we are only interested in the local structure of one target variable (such as a class label). We propose a new local causal discovery algorithm, called Causal Markov Blanket (CMB), to identify the direct causes and effects of a target variable based on Markov blanket discovery. CMB is designed to conduct causal discovery among multiple variables, but focuses only on finding causal relationships between a specific target variable and other variables. Under standard assumptions, we show both theoretically and experimentally that the proposed local causal discovery algorithm achieves identification accuracy comparable to that of global methods while significantly improving efficiency, often by more than one order of magnitude.

Claassen, Tom (Radboud University Nijmegen) | Heskes, Tom (Radboud University Nijmegen)

We target the problem of accuracy and robustness in causal inference from finite data sets. Our aim is to combine the inherent robustness of the Bayesian approach with the theoretical strength and clarity of constraint-based methods. We use a Bayesian score to obtain probability estimates on the input statements used in a constraint-based procedure. These are subsequently processed in decreasing order of reliability, letting more reliable decisions take precedence in case of conflicts, until a single output model is obtained. Tests show that a basic implementation of the resulting Bayesian Constraint-based Causal Discovery (BCCD) algorithm already outperforms established procedures such as FCI and Conservative PC. It indicates which causal decisions in the output have high reliability and which do not. The approach is easily adapted to other application areas such as complex independence tests.
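
The ordering step described above (process statements in decreasing order of reliability and let the more reliable one win a conflict) can be illustrated with a minimal sketch. This is not the BCCD implementation. The statement encoding, the `conflicts` rule, and the function names are hypothetical choices made only for illustration, and the probabilities are made-up inputs rather than Bayesian scores.

```python
from typing import List, Tuple

# Hypothetical encoding: (cause, effect, holds).
# holds=True means "cause -> effect"; holds=False means "cause does not cause effect".
Statement = Tuple[str, str, bool]


def conflicts(new: Statement, accepted: List[Statement]) -> bool:
    """Return True if `new` contradicts a statement that was already accepted."""
    x, y, holds = new
    for a, b, h in accepted:
        # Same ordered pair with the opposite truth value is a direct contradiction.
        if (a, b) == (x, y) and h != holds:
            return True
        # X -> Y and Y -> X cannot both hold in an acyclic model.
        if holds and h and (a, b) == (y, x):
            return True
    return False


def process_by_reliability(scored: List[Tuple[float, Statement]]) -> List[Statement]:
    """Accept statements from most to least reliable, skipping any that conflict."""
    accepted: List[Statement] = []
    for _prob, stmt in sorted(scored, key=lambda pair: pair[0], reverse=True):
        if not conflicts(stmt, accepted):
            accepted.append(stmt)
    return accepted


# Made-up reliability estimates, for illustration only.
scored = [
    (0.95, ("A", "B", True)),
    (0.60, ("B", "A", True)),   # loses to the more reliable A -> B
    (0.80, ("B", "C", True)),
    (0.30, ("B", "C", False)),  # loses to the more reliable B -> C
]
result = process_by_reliability(scored)
```

In this toy input the two low-reliability statements are dropped because each contradicts a statement accepted earlier, so only `A -> B` and `B -> C` remain.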

A long-standing open research problem is how to use information from different experiments, including background knowledge, to infer causal relations. Recent developments have shown ways to use multiple data sets, provided they originate from identical experiments. We present the MCI-algorithm as the first method that can infer provably valid causal relations in the large sample limit from different experiments. It is fast, reliable and produces very clear and easily interpretable output. It is based on a result that shows that constraint-based causal discovery is decomposable into a candidate pair identification and subsequent elimination step that can be applied separately from different models. We test the algorithm on a variety of synthetic input model sets to assess its behavior and the quality of the output. The method shows promising signs that it can be adapted to suit causal discovery in real-world application areas as well, including large databases.

Wong, Raymond K. (University of New South Wales) | Chu, Victor (University of New South Wales) | Ghanavati, Mojgan (University of New South Wales) | Hamzehei, Asso (University of New South Wales)

Causal structure discovery methods have been investigated recently, but none of them has taken possible time-varying structure into consideration. This paper uses a notion of causal time-varying dynamic Bayesian network (CTV-DBN) and defines a causal boundary to govern cross-time information sharing. CTV-DBN is constructed using asymmetric kernels to address sample scarcity and to adhere to causal principles, while maintaining a good bias-variance trade-off. Provided the causal Markov assumption is satisfied, causal inference can be made based on the manipulation rule. We explore trajectory data collected from taxis in Beijing, which exhibit heterogeneous patterns, data sparseness and distribution skewness. Experiments show that by using causal structures and trajectory clustering, we can analyse the spatio-temporal behavior of the trajectory data.
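
One way to picture an asymmetric kernel with a causal boundary is a one-sided weighting scheme: samples after the current time get zero weight (no information flows backward from the future), and samples before the boundary are excluded. The abstract does not specify the kernel form, so the sketch below assumes a truncated Gaussian. The function name, the `bandwidth` parameter, and `boundary_start` are hypothetical.

```python
import math
from typing import List, Sequence


def asymmetric_kernel_weights(
    times: Sequence[float],
    t: float,
    bandwidth: float,
    boundary_start: float,
) -> List[float]:
    """Normalized one-sided weights for estimating the model at time t.

    Assumption: a Gaussian kernel truncated to [boundary_start, t].
    Samples outside that window receive zero weight.
    """
    raw = []
    for s in times:
        if boundary_start <= s <= t:
            raw.append(math.exp(-((t - s) ** 2) / (2 * bandwidth ** 2)))
        else:
            raw.append(0.0)
    total = sum(raw)
    if total == 0:
        raise ValueError("no samples fall inside the causal boundary")
    return [w / total for w in raw]


# Six time points; estimate at t=3 using only samples in [1, 3].
weights = asymmetric_kernel_weights([0, 1, 2, 3, 4, 5], t=3, bandwidth=1.0, boundary_start=1)
```

Here the samples at times 4 and 5 (the future) and at time 0 (outside the boundary) contribute nothing, while more recent past samples weigh more than older ones.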