We introduce Invariant Risk Minimization (IRM), a learning paradigm that estimates invariant correlations across multiple training distributions. To achieve this goal, IRM learns a data representation such that the optimal classifier, on top of that representation, is the same for all training distributions. Through theory and experiments, we show how the invariances learned by IRM relate to the causal structures governing the data and enable out-of-distribution generalization.
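The requirement that the optimal classifier match across training distributions is often operationalized (e.g., in the IRMv1 variant of the objective) as a gradient penalty on a dummy scalar classifier. A minimal numpy sketch under simplifying assumptions (scalar representation, squared loss, synthetic environments purely for illustration):

```python
import numpy as np

def irm_objective(envs, lam=1.0):
    """Sum of per-environment risks plus an IRMv1-style invariance penalty.

    Each environment is a (phi, y) pair: a 1-D learned representation and
    targets. The penalty is the squared gradient of the environment risk
    with respect to a dummy scalar classifier w, evaluated at w = 1.
    """
    total_risk, total_penalty = 0.0, 0.0
    for phi, y in envs:
        residual = phi - y                      # prediction error at w = 1
        risk = np.mean(residual ** 2)
        grad_w = np.mean(2.0 * residual * phi)  # d/dw E[(w*phi - y)^2] at w = 1
        total_risk += risk
        total_penalty += grad_w ** 2
    return total_risk + lam * total_penalty

# Two toy settings: a representation that predicts y identically in every
# environment incurs zero penalty; an environment-dependent rescaling does not.
rng = np.random.default_rng(0)
y1, y2 = rng.normal(size=100), rng.normal(size=100)
invariant = [(y1, y1), (y2, y2)]
spurious = [(2.0 * y1, y1), (0.5 * y2, y2)]
print(irm_objective(invariant))  # 0.0: the optimal classifier matches everywhere
print(irm_objective(spurious) > irm_objective(invariant))  # True
```

The penalty vanishes exactly when the same classifier (here, w = 1) is simultaneously optimal in every environment, which is the matching condition the abstract describes.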
Studies show that the representations learned by deep neural networks can be transferred to similar prediction tasks in other domains for which we do not have enough labeled data. However, as we transition to higher layers in the model, the representations become more task-specific and less generalizable. Recent research on deep domain adaptation has proposed to mitigate this problem by forcing the deep model to learn more transferable feature representations across domains. This is achieved by incorporating domain adaptation methods into the deep learning pipeline. The majority of existing models learn transferable feature representations that are highly correlated with the outcome. However, correlations are not always transferable. In this paper, we propose a novel deep causal representation learning framework for unsupervised domain adaptation, in which we learn domain-invariant causal representations of the input from the source domain. We simulate a virtual target domain using reweighted samples from the source domain and estimate the causal effect of features on the outcomes. An extensive comparative study demonstrates the strengths of the proposed model for unsupervised domain adaptation via causal representations.
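Sample reweighting of the kind described above is commonly done by estimating density ratios between the two domains. A minimal sketch, assuming a logistic discriminator between source and (unlabeled) target features; the function, training loop, and two-Gaussian data are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def density_ratio_weights(src, tgt, lr=0.1, steps=500):
    """Estimate w(x) = p_tgt(x) / p_src(x) for each source sample by training
    a logistic discriminator between source (label 0) and target (label 1)."""
    X = np.vstack([src, tgt])
    X = np.hstack([X, np.ones((len(X), 1))])          # bias term
    y = np.concatenate([np.zeros(len(src)), np.ones(len(tgt))])
    theta = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ theta))          # P(target | x)
        theta -= lr * X.T @ (p - y) / len(y)          # logistic gradient step
    Xs = np.hstack([src, np.ones((len(src), 1))])
    p_src = 1.0 / (1.0 + np.exp(-Xs @ theta))
    return p_src / (1.0 - p_src)                      # odds = density ratio

rng = np.random.default_rng(0)
src = rng.normal(0.0, 1.0, size=(500, 1))   # source domain features
tgt = rng.normal(1.0, 1.0, size=(500, 1))   # covariate-shifted target domain
w = density_ratio_weights(src, tgt)
# Reweighted source statistics move toward the target's: a "virtual target".
print(np.average(src[:, 0], weights=w))
```

Under this weighting, expectations over the reweighted source approximate expectations over the target, which is what makes a "virtual target domain" built from source samples alone meaningful.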
The article "Cognitive Hub: The Future of Work" and the supporting infographic (see Figure 1) provide an interesting perspective on some "technology combinations" that could transform the workplace of the future, all enabled by Artificial Intelligence (AI). The infographic is very cool and depicts a very interesting proposition. However, my concern with the proposition is that while these technology combinations could be quite powerful, the Internet of Things, human-machine interfaces, cyber-physical systems, and Artificial Intelligence are only enabling technologies; that is, they only give someone or something the means to do something. You still need someone or something to actually do something: to decide what to do, when to do it, where to do it, with whom to do it, how to do it, the required items to do it, and so on. There is a H-U-G-E difference between enabling and doing. For example, I can enable you with an individualized diet and fitness plan that will improve your life, but the subsequent improvement in your life won't happen if you are not doing it.
For decades, researchers in fields such as the natural and social sciences have been verifying causal relationships and investigating hypotheses that are now well established or understood as truth. These causal mechanisms are properties of the natural world, and thus hold invariantly regardless of the collection domain or environment. We show in this paper how prior knowledge in the form of a causal graph can be utilized to guide model selection, i.e., to identify from a set of trained networks the models that are the most robust and invariant to unseen domains. Our method incorporates prior knowledge (which can be incomplete) as a Structural Causal Model (SCM) and calculates a score based on the likelihood of the SCM given the target predictions of a candidate model and the provided input variables. We show on both publicly available and synthetic datasets that our method identifies more robust models in terms of generalizability to unseen out-of-distribution test examples and to domains where covariates have shifted.
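One way such a likelihood score could be realized, under the simplifying assumption of a linear-Gaussian SCM, is to regress each variable on its graph parents, treat a candidate model's predictions as one node, and sum the Gaussian log-likelihoods of the residuals. The graph, data, and function below are an illustrative sketch, not the paper's actual procedure:

```python
import numpy as np

def scm_score(data, parents):
    """Log-likelihood (up to constants) of a linear-Gaussian SCM.

    `data` maps variable name -> 1-D array of observations; `parents` maps
    variable name -> list of parent names from the (possibly incomplete) graph.
    """
    score = 0.0
    for var, pa in parents.items():
        y = data[var]
        if pa:
            X = np.column_stack([data[p] for p in pa] + [np.ones(len(y))])
            coef, *_ = np.linalg.lstsq(X, y, rcond=None)
            resid = y - X @ coef
        else:
            resid = y - y.mean()
        var_hat = resid.var() + 1e-12
        # Gaussian log-likelihood of the residuals, dropping additive constants
        score += -0.5 * len(y) * np.log(var_hat) - 0.5 * len(y)
    return score

# Toy SCM with a single edge x -> yhat: a candidate model whose predictions
# respect the graph scores higher than one whose predictions ignore x.
rng = np.random.default_rng(0)
x = rng.normal(size=1000)
good_pred = 2.0 * x + 0.1 * rng.normal(size=1000)  # consistent with the SCM
bad_pred = rng.normal(size=1000)                   # independent of x
graph = {"yhat": ["x"]}
s_good = scm_score({"x": x, "yhat": good_pred}, graph)
s_bad = scm_score({"x": x, "yhat": bad_pred}, graph)
print(s_good > s_bad)  # True: the graph-consistent model is preferred
```

Ranking candidate models by such a score prefers predictors whose outputs remain consistent with the assumed causal structure, which is the intuition behind selecting for robustness to unseen domains.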
This paper describes a statistical discovery procedure for finding causal structure in correlational data, called path analysis [Asher, 83; Li, 75], and an algorithm that builds path-analytic models automatically, given data; the algorithms themselves are discussed in a later section. This work has the same goals as research in function finding and other discovery techniques, that is, to find rules, laws, and mechanisms that underlie nonexperimental data [Falkenhainer & Michalski, 86; Langley et al., 87; Schaffer, 90; Zytkow et al., 90]. Whereas function-finding algorithms produce functional abstractions of (presumably) causal mechanisms, our algorithm produces explicitly causal models. Our work is most similar to that of Glymour et al., who built the TETRAD system.
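Classical path analysis estimates each variable's path coefficients as the standardized regression coefficients of that variable on its assumed direct causes. A minimal numpy sketch for a three-variable chain, with synthetic data standing in for the correlational data the paper works from:

```python
import numpy as np

def path_coefficients(data, parents):
    """Standardized regression coefficients (path coefficients) of each
    variable on its assumed direct causes in a path-analytic model."""
    z = {k: (v - v.mean()) / v.std() for k, v in data.items()}  # standardize
    coeffs = {}
    for var, pa in parents.items():
        X = np.column_stack([z[p] for p in pa])
        beta, *_ = np.linalg.lstsq(X, z[var], rcond=None)
        coeffs[var] = dict(zip(pa, beta))
    return coeffs

# Synthetic causal chain a -> b -> c with known path strengths.
rng = np.random.default_rng(1)
a = rng.normal(size=5000)
b = 0.8 * a + 0.6 * rng.normal(size=5000)  # path a -> b has strength 0.8
c = 0.5 * b + rng.normal(size=5000)
paths = path_coefficients({"a": a, "b": b, "c": c}, {"b": ["a"], "c": ["b"]})
print(round(paths["b"]["a"], 1))  # 0.8: recovered standardized effect of a on b
```

Because the variables are standardized, the coefficients are directly comparable path strengths; an automatic model builder of the kind described would search over parent sets rather than take them as given, as this sketch does.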