This paper presents a new open-source Python framework for causal discovery from observational data and domain background knowledge, aimed at causal graph and causal mechanism modeling. The Cdt package implements an end-to-end approach, recovering the direct dependencies (the skeleton of the causal graph) and the causal relationships between variables. It includes algorithms from the 'Bnlearn' (Scutari, 2018) and 'Pcalg' (Kalisch et al., 2018) packages, together with algorithms for pairwise causal discovery such as ANM (Hoyer et al., 2009).
Explaining why events occur is key to making decisions, assigning blame, and enacting policies. Despite this need, few methods can compute explanations in an automated way. Existing solutions start with a type-level model (e.g. factors affecting risk of disease) and use it to explain token-level events (e.g. the cause of an individual's illness). This is limiting, since an individual's illness may be due to a previously unknown drug interaction. We propose a hybrid method for token explanation that uses known type-level models while also discovering potentially novel explanations. On simulated data with ground truth, the approach finds accurate explanations when observations match what is known, and correctly finds novel relationships when they do not. On real-world data, our approach finds explanations consistent with intuition.
Despite the major advances made in causal modeling, causality is still an unfamiliar topic for many statisticians. This paper demonstrates, from start to finish, how causal effects can be estimated from observational data, assuming that the causal structure is known. To make the problem more challenging, the causal effects are highly nonlinear and the data are missing at random. The tools used in the estimation include causal models with design, causal calculus, multiple imputation and generalized additive models. The main message is that a trained statistician can estimate causal effects by judiciously combining existing tools.
The discovery of causal relationships between a set of observed variables is a fundamental problem in science. For continuous-valued data, linear acyclic causal models are often used because these models are well understood and there are well-known methods to fit them to data. In reality, of course, many causal relationships are more or less nonlinear, raising some doubts as to the applicability and usefulness of purely linear methods. In this contribution we show that the basic linear framework can be generalized to nonlinear models with additive noise. In this extended framework, nonlinearities in the data-generating process are in fact a blessing rather than a curse, as they typically provide information on the underlying causal system and allow more aspects of the true data-generating mechanisms to be identified.
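The additive-noise idea can be illustrated with a minimal sketch for a single pair of variables. It is not the authors' implementation: polynomial regression and a biased HSIC statistic are assumptions that stand in for whatever regressor and independence test one would use in practice, and the names `simulate`, `hsic`, `anm_score` and `infer_direction` are hypothetical. The sketch fits a nonlinear regression in each direction and prefers the direction whose residuals look less dependent on the putative cause.

```python
import numpy as np


def simulate(n, seed=0):
    """Toy data (an assumption for illustration): y = x**3 + additive noise."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-2.0, 2.0, size=n)
    noise = rng.uniform(-1.0, 1.0, size=n)
    return x, x ** 3 + noise


def _gaussian_gram(v):
    """Gaussian kernel matrix; bandwidth is the median squared distance."""
    d2 = (v[:, None] - v[None, :]) ** 2
    bandwidth = np.median(d2[d2 > 0])
    return np.exp(-d2 / bandwidth)


def hsic(a, b):
    """Biased HSIC estimate: larger values suggest stronger dependence."""
    n = len(a)
    K = _gaussian_gram(a)
    L = _gaussian_gram(b)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / n ** 2


def anm_score(cause, effect, degree=5):
    """Dependence between the putative cause and the regression residuals."""
    cause = (cause - cause.mean()) / cause.std()
    effect = (effect - effect.mean()) / effect.std()
    coeffs = np.polyfit(cause, effect, degree)  # nonlinear fit effect = f(cause)
    residuals = effect - np.polyval(coeffs, cause)
    return hsic(cause, residuals)


def infer_direction(x, y):
    """Prefer the direction whose residuals are less dependent on the input."""
    return "x -> y" if anm_score(x, y) < anm_score(y, x) else "y -> x"


x, y = simulate(500, seed=0)
print("forward score :", anm_score(x, y))
print("backward score:", anm_score(y, x))
print("inferred      :", infer_direction(x, y))
```

Comparing two raw scores is a simplification: a more careful version would run a proper independence test in each direction and decline to decide when both directions are accepted or both are rejected.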