Collaborating Authors

Guo, Ruocheng

Evaluation Methods and Measures for Causal Learning Algorithms Artificial Intelligence

The convenient access to copious multi-faceted data has encouraged machine learning researchers to reconsider correlation-based learning and embrace the opportunity of causality-based learning, i.e., causal machine learning (causal learning). Recent years have therefore witnessed great effort in developing causal learning algorithms aiming to help AI achieve human-level intelligence. Due to the lack-of ground-truth data, one of the biggest challenges in current causal learning research is algorithm evaluations. This largely impedes the cross-pollination of AI and causal inference, and hinders the two fields to benefit from the advances of the other. To bridge from conventional causal inference (i.e., based on statistical methods) to causal learning with big data (i.e., the intersection of causal inference and machine learning), in this survey, we review commonly-used datasets, evaluation methods, and measures for causal learning using an evaluation pipeline similar to conventional machine learning. We focus on the two fundamental causal-inference tasks and causality-aware machine learning tasks. Limitations of current evaluation procedures are also discussed. We then examine popular causal inference tools/packages and conclude with primary challenges and opportunities for benchmarking causal learning algorithms in the era of big data. The survey seeks to bring to the forefront the urgency of developing publicly available benchmarks and consensus-building standards for causal learning evaluation with observational data. In doing so, we hope to broaden the discussions and facilitate collaboration to advance the innovation and application of causal learning.

Learning Fair Node Representations with Graph Counterfactual Fairness Artificial Intelligence

Fair machine learning aims to mitigate the biases of model predictions against certain subpopulations regarding sensitive attributes such as race and gender. Among the many existing fairness notions, counterfactual fairness measures the model fairness from a causal perspective by comparing the predictions of each individual from the original data and the counterfactuals. In counterfactuals, the sensitive attribute values of this individual had been modified. Recently, a few works extend counterfactual fairness to graph data, but most of them neglect the following facts that can lead to biases: 1) the sensitive attributes of each node's neighbors may causally affect the prediction w.r.t. this node; 2) the sensitive attributes may causally affect other features and the graph structure. To tackle these issues, in this paper, we propose a novel fairness notion - graph counterfactual fairness, which considers the biases led by the above facts. To learn node representations towards graph counterfactual fairness, we propose a novel framework based on counterfactual data augmentation. In this framework, we generate counterfactuals corresponding to perturbations on each node's and their neighbors' sensitive attributes. Then we enforce fairness by minimizing the discrepancy between the representations learned from the original graph and the counterfactuals for each node. Experiments on both synthetic and real-world graphs show that our framework outperforms the state-of-the-art baselines in graph counterfactual fairness, and also achieves comparable prediction performance.

Estimating Causal Effects of Multi-Aspect Online Reviews with Multi-Modal Proxies Artificial Intelligence

Online reviews enable consumers to engage with companies and provide important feedback. Due to the complexity of the high-dimensional text, these reviews are often simplified as a single numerical score, e.g., ratings or sentiment scores. This work empirically examines the causal effects of user-generated online reviews on a granular level: we consider multiple aspects, e.g., the Food and Service of a restaurant. Understanding consumers' opinions toward different aspects can help evaluate business performance in detail and strategize business operations effectively. Specifically, we aim to answer interventional questions such as What will the restaurant popularity be if the quality w.r.t. its aspect Service is increased by 10%? The defining challenge of causal inference with observational data is the presence of "confounder", which might not be observed or measured, e.g., consumers' preference to food type, rendering the estimated effects biased and high-variance. To address this challenge, we have recourse to the multi-modal proxies such as the consumer profile information and interactions between consumers and businesses. We show how to effectively leverage the rich information to identify and estimate causal effects of multiple aspects embedded in online reviews. Empirical evaluations on synthetic and real-world data corroborate the efficacy and shed light on the actionable insight of the proposed approach.

Causal Mediation Analysis with Hidden Confounders Artificial Intelligence

An important problem in causal inference is to break down the total effect of treatment into different causal pathways and quantify the causal effect in each pathway. Causal mediation analysis (CMA) is a formal statistical approach for identifying and estimating these causal effects. Central to CMA is the sequential ignorability assumption that implies all pre-treatment confounders are measured and they can capture different types of confounding, e.g., post-treatment confounders and hidden confounders. Typically unverifiable in observational studies, this assumption restrains both the coverage and practicality of conventional methods. This work, therefore, aims to circumvent the stringent assumption by following a causal graph with a unified confounder and its proxy variables. Our core contribution is an algorithm that combines deep latent-variable models and proxy strategy to jointly infer a unified surrogate confounder and estimate different causal effects in CMA from observed variables. Empirical evaluations using both synthetic and semi-synthetic datasets validate the effectiveness of the proposed method.

Out-of-distribution Prediction with Invariant Risk Minimization: The Limitation and An Effective Fix Artificial Intelligence

This work considers the out-of-distribution (OOD) prediction problem where (1)~the training data are from multiple domains and (2)~the test domain is unseen in the training. DNNs fail in OOD prediction because they are prone to pick up spurious correlations. Recently, Invariant Risk Minimization (IRM) is proposed to address this issue. Its effectiveness has been demonstrated in the colored MNIST experiment. Nevertheless, we find that the performance of IRM can be dramatically degraded under \emph{strong $\Lambda$ spuriousness} -- when the spurious correlation between the spurious features and the class label is strong due to the strong causal influence of their common cause, the domain label, on both of them (see Fig. 1). In this work, we try to answer the questions: why does IRM fail in the aforementioned setting? Why does IRM work for the original colored MNIST dataset? How can we fix this problem of IRM? Then, we propose a simple and effective approach to fix the problem of IRM. We combine IRM with conditional distribution matching to avoid a specific type of spurious correlation under strong $\Lambda$ spuriousness. Empirically, we design a series of semi synthetic datasets -- the colored MNIST plus, which exposes the problems of IRM and demonstrates the efficacy of the proposed method.

Long-Term Effect Estimation with Surrogate Representation Artificial Intelligence

There are many scenarios where short- and long-term causal effects of an intervention are different. For example, low-quality ads may increase short-term ad clicks but decrease the long-term revenue via reduced clicks; search engines measured by inappropriate performance metrics may increase search query shares in a short-term but not long-term. This work therefore studies the long-term effect where the outcome of primary interest, or primary outcome, takes months or even years to accumulate. The observational study of long-term effect presents unique challenges. First, the confounding bias causes large estimation error and variance, which can further accumulate towards the prediction of primary outcomes. Second, short-term outcomes are often directly used as the proxy of the primary outcome, i.e., the surrogate. Notwithstanding its simplicity, this method entails the strong surrogacy assumption that is often impractical. To tackle these challenges, we propose to build connections between long-term causal inference and sequential models in machine learning. This enables us to learn surrogate representations that account for the temporal unconfoundedness and circumvent the stringent surrogacy assumption by conditioning on time-varying confounders in the latent space. Experimental results show that the proposed framework outperforms the state-of-the-art.

Causal Interpretability for Machine Learning -- Problems, Methods and Evaluation Machine Learning

Machine learning models have had discernible achievements in a myriad of applications. However, most of these models are black-boxes, and it is obscure how the decisions are made by them. This makes the models unreliable and untrustworthy. To provide insights into the decision making processes of these models, a variety of traditional interpretable models have been proposed. Moreover, to generate more human-friendly explanations, recent work on interpretability tries to answer questions related to causality such as "Why does this model makes such decisions?" or "Was it a specific feature that caused the decision made by the model?". In this work, models that aim to answer causal questions are referred to as causal interpretable models. The existing surveys have covered concepts and methodologies of traditional interpretability. In this work, we present a comprehensive survey on causal interpretable models from the aspects of the problems and methods. In addition, this survey provides in-depth insights into the existing evaluation metrics for measuring interpretability, which can help practitioners understand for what scenarios each evaluation metric is suitable.

Counterfactual Evaluation of Treatment Assignment Functions with Networked Observational Data Machine Learning

Counterfactual evaluation of novel treatment assignment functions (e.g., advertising algorithms and recommender systems) is one of the most crucial causal inference problems for practitioners. Traditionally, randomized controlled trials (A/B tests) are performed to evaluate treatment assignment functions. However, such trials can be time-consuming, expensive, and even unethical in some cases. Therefore, offline counterfactual evaluation of treatment assignment functions becomes a pressing issue because a massive amount of observational data is available in today's big data era. Counterfactual evaluation requires handling the hidden confounders -- the unmeasured features which causally influence both the treatment assignment and the outcome. To deal with the hidden confounders, most of the existing methods rely on the assumption of no hidden confounders. However, this assumption can be untenable in the context of massive observational data. When such data comes with network information, the later can be potentially useful to correct hidden confounding bias. As such, we first formulate a novel problem, counterfactual evaluation of treatment assignment functions with networked observational data. Then, we investigate the following research questions: How can we utilize network information in counterfactual evaluation? Can network information improve the estimates in counterfactual evaluation? Toward answering these questions, first, we propose a novel framework, \emph{Counterfactual Network Evaluator} (CONE), which (1) learns partial representations of latent confounders under the supervision of observed treatments and outcomes; and (2) combines them for counterfactual evaluation. Then through extensive experiments, we corroborate the effectiveness of CONE. The results imply that incorporating network information mitigates hidden confounding bias in counterfactual evaluation.

A Survey of Learning Causality with Data: Problems and Methods Artificial Intelligence

The era of big data provides researchers with convenient access to copious data. However, people often have little knowledge about it. The increasing prevalence of big data is challenging the traditional methods of learning causality because they are developed for the cases with limited amount of data and solid prior causal knowledge. This survey aims to close the gap between big data and learning causality with a comprehensive and structured review of traditional and frontier methods and a discussion about some open problems of learning causality. We begin with preliminaries of learning causality. Then we categorize and revisit methods of learning causality for the typical problems and data types. After that, we discuss the connections between learning causality and machine learning. At the end, some open problems are presented to show the great potential of learning causality with data.

Linked Causal Variational Autoencoder for Inferring Paired Spillover Effects Machine Learning

Modeling spillover effects from observational data is an important problem in economics, business, and other fields of research. % It helps us infer the causality between two seemingly unrelated set of events. For example, if consumer spending in the United States declines, it has spillover effects on economies that depend on the U.S. as their largest export market. In this paper, we aim to infer the causation that results in spillover effects between pairs of entities (or units), we call this effect as \textit{paired spillover}. To achieve this, we leverage the recent developments in variational inference and deep learning techniques to propose a generative model called Linked Causal Variational Autoencoder (LCVA). Similar to variational autoencoders (VAE), LCVA incorporates an encoder neural network to learn the latent attributes and a decoder network to reconstruct the inputs. However, unlike VAE, LCVA treats the \textit{latent attributes as confounders that are assumed to affect both the treatment and the outcome of units}. Specifically, given a pair of units $u$ and $\bar{u}$, their individual treatment and outcomes, the encoder network of LCVA samples the confounders by conditioning on the observed covariates of $u$, the treatments of both $u$ and $\bar{u}$ and the outcome of $u$. Once inferred, the latent attributes (or confounders) of $u$ captures the spillover effect of $\bar{u}$ on $u$. Using a network of users from job training dataset (LaLonde (1986)) and co-purchase dataset from Amazon e-commerce domain, we show that LCVA is significantly more robust than existing methods in capturing spillover effects.