Informative Subspace Learning for Counterfactual Inference

AAAI Conferences

Inferring causal relations from observational data is widely used for knowledge discovery in healthcare and economics. To investigate whether a treatment can affect an outcome of interest, we focus on answering counterfactual questions of this type: what would a patient’s blood pressure be had he/she received a different treatment? Nearest neighbor matching (NNM) sets the counterfactual outcome of any treatment (control) sample to be equal to the factual outcome of its nearest neighbor in the control (treatment) group. Although simple, flexible, and interpretable, most NNM approaches can be easily misled by variables that do not affect the outcome. In this paper, we address this challenge by learning subspaces that are predictive of the outcome variable for both the treatment group and the control group. Applying NNM in the learned subspaces leads to more accurate estimation of the counterfactual outcomes and therefore of the treatment effects. We introduce an informative subspace learning algorithm that maximizes the nonlinear dependence between the candidate subspace and the outcome variable, as measured by the Hilbert-Schmidt Independence Criterion (HSIC). We propose a scalable estimator of HSIC, called HSIC-RFF, that reduces the quadratic computational and storage complexities (with respect to the sample size) of the naive HSIC implementation to linear by constructing random Fourier features. We also prove an upper bound on the approximation error of the HSIC-RFF estimator. Experimental results on simulated and real-world datasets demonstrate that our proposed approach outperforms existing NNM approaches and other commonly used regression-based methods for counterfactual inference.
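The NNM rule described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the function name `nnm_counterfactuals`, the Euclidean distance, and the toy data are assumptions made for illustration, and the "subspace" here is picked by hand (the first covariate) rather than learned with HSIC.

```python
from math import dist

def nnm_counterfactuals(X, t, y):
    """For each sample, return the factual outcome of its nearest
    neighbor (Euclidean distance) in the opposite treatment group."""
    counterfactuals = []
    for xi, ti in zip(X, t):
        # Candidates are all samples that received the other treatment.
        opposite = [j for j, tj in enumerate(t) if tj != ti]
        nearest = min(opposite, key=lambda j: dist(xi, X[j]))
        counterfactuals.append(y[nearest])
    return counterfactuals

# Toy data (hypothetical): the first covariate relates to the outcome,
# the second is an irrelevant variable with a large scale.
X = [(0.0, 5.0), (1.0, 0.0), (0.1, 0.0), (1.1, 5.0)]
t = [1, 1, 0, 0]            # 1 = treatment group, 0 = control group
y = [2.0, 4.0, 1.0, 3.0]    # factual outcomes

# Matching on all covariates: the irrelevant variable dominates the distance.
cf_full = nnm_counterfactuals(X, t, y)

# Matching after projecting onto the first covariate only.
X_sub = [(x[0],) for x in X]
cf_sub = nnm_counterfactuals(X_sub, t, y)

print(cf_full)
print(cf_sub)
```

In this toy example the two runs pick different neighbors for every sample, which is the failure mode the abstract attributes to variables that do not affect the outcome.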

Propensity Score Matching in R


The concept of propensity score matching (PSM) was first introduced by Rosenbaum and Rubin (1983) in a paper entitled "The Central Role of the Propensity Score in Observational Studies for Causal Effects." Propensity scores are an alternative method to estimate the effect of receiving treatment when random assignment of treatments to subjects is not feasible. PSM refers to the pairing of treatment and control units with similar values on the propensity score, and possibly on other covariates (the characteristics of participants), and the discarding of all unmatched units. In simple terms, PSM is applied to observational studies to remove the selection bias between the treatment and control groups.
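The pairing-and-discarding step can be sketched as follows, in Python rather than R. The sketch assumes the propensity scores have already been estimated (for example with a logistic regression of treatment on covariates). The function name `match_on_propensity`, the greedy 1:1 strategy, and the caliper of 0.05 are illustrative choices, not something prescribed by the text above.

```python
def match_on_propensity(treated, control, caliper=0.05):
    """Greedy 1:1 matching without replacement.

    treated, control: dicts mapping unit id -> estimated propensity score.
    Returns (treated_id, control_id) pairs; units left unpaired are discarded.
    """
    available = dict(control)  # control units not yet used in a pair
    pairs = []
    # Process treated units in order of increasing score.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control unit on the propensity score.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        # Accept the pair only if the scores are similar enough (assumed caliper).
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs

# Hypothetical propensity scores.
treated = {"t1": 0.30, "t2": 0.62, "t3": 0.95}
control = {"c1": 0.28, "c2": 0.60, "c3": 0.10, "c4": 0.33}

pairs = match_on_propensity(treated, control)
print(pairs)
```

Here `t3` has no control unit with a similar score, so it is dropped along with the unused control units; the treatment effect would then be estimated on the matched pairs only.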

Generalized Causal Tree for Uplift Modeling

Machine Learning

Uplift modeling is crucial in various applications ranging from marketing and policy-making to personalized recommendations. The main objective is to learn optimal treatment allocations for a heterogeneous population. A primary line of existing work modifies the loss function of the decision tree algorithm to identify cohorts with heterogeneous treatment effects. Another line of work estimates the individual treatment effects separately for the treatment group and the control group using off-the-shelf supervised learning algorithms. The former approach that directly models the heterogeneous treatment effect is known to outperform the latter in practice. However, the existing tree-based methods are mostly limited to a single treatment and a single control use case, except for a handful of extensions to multiple discrete treatments. In this paper, we fill this gap in the literature by proposing a generalization to the tree-based approaches to tackle multiple discrete and continuous-valued treatments. We focus on a generalization of the well-known causal tree algorithm due to its desirable statistical properties, but our generalization technique can be applied to other tree-based approaches as well. We perform extensive experiments to showcase the efficacy of our method when compared to other methods.

Uplift Modeling for Multiple Treatments with Cost Optimization

Machine Learning

Uplift modeling is an emerging machine learning approach for estimating the treatment effect at an individual or subgroup level. It can be used for optimizing the performance of interventions such as marketing campaigns and product designs. Uplift modeling can be used to estimate which users are likely to benefit from a treatment and then prioritize delivering or promoting the preferred experience to those users. An important but so far neglected use case for uplift modeling is an experiment with multiple treatment groups that have different costs, for example when different communication channels and promotion types are tested simultaneously. In this paper, we extend standard uplift models to support multiple treatment groups with different costs. We evaluate the performance of the proposed models using both synthetic and real data. We also describe a production implementation of the approach.

Uplift modeling [1]-[8] is a technique to estimate and predict the individual-level or subgroup-level causal effects of different treatments in an experiment. This type of information is useful for designing and offering a personalized experience to improve user experience, satisfaction, and engagement. Uplift modeling is therefore commonly used in areas such as marketing, customer service, and product offering. It is helpful to think about uplift modeling in the context of randomized experiments (also known as A/B testing [9]-[11]). In a typical experiment, users are randomly assigned to each treatment group and causal effects are then estimated for the population.
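As a rough illustration of subgroup-level uplift in a randomized experiment with several treatments of different costs, the Python sketch below estimates uplift as a difference in mean outcomes against the control group and then subtracts a per-user cost. It is not the model proposed in the paper: the function names, the segment labels, the value per conversion, and the cost figures are all hypothetical.

```python
from collections import defaultdict

def subgroup_uplift(records, control="control"):
    """records: iterable of (segment, group, outcome) from a randomized experiment.
    Returns {segment: {treatment: mean outcome - mean outcome under control}}."""
    sums = defaultdict(lambda: [0.0, 0])  # (segment, group) -> [outcome sum, count]
    for segment, group, outcome in records:
        cell = sums[(segment, group)]
        cell[0] += outcome
        cell[1] += 1
    means = {key: total / n for key, (total, n) in sums.items()}
    uplift = defaultdict(dict)
    for (segment, group), mean in means.items():
        if group != control and (segment, control) in means:
            uplift[segment][group] = mean - means[(segment, control)]
    return dict(uplift)

def best_treatment(uplift, value_per_conversion, cost):
    """Per segment, pick the treatment with the highest net value,
    or None when no treatment has a positive net value."""
    choice = {}
    for segment, effects in uplift.items():
        net = {g: u * value_per_conversion - cost[g] for g, u in effects.items()}
        best = max(net, key=net.get)
        choice[segment] = best if net[best] > 0 else None
    return choice

# Hypothetical experiment log: binary conversion outcomes per segment and group.
records = (
    [("new", "control", o) for o in (0, 0, 1, 0)]
    + [("new", "email", o) for o in (1, 0, 1, 0)]
    + [("new", "coupon", o) for o in (1, 1, 1, 0)]
    + [("loyal", "control", o) for o in (1, 1, 0, 1)]
    + [("loyal", "email", o) for o in (1, 1, 1, 0)]
    + [("loyal", "coupon", o) for o in (1, 1, 1, 1)]
)

uplift = subgroup_uplift(records)
choice = best_treatment(uplift, value_per_conversion=10.0,
                        cost={"email": 0.5, "coupon": 4.0})
print(uplift)
print(choice)
```

With these made-up numbers the coupon has the larger raw uplift in the "new" segment, but the cheaper email wins once cost is subtracted, and no treatment pays off for the "loyal" segment.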

The Difference Between Relative Risk and Odds Ratios


Odds ratios and relative risks are often confused despite being distinct concepts. Unfortunately, the names are sometimes used interchangeably. They shouldn't be, because they are interpreted differently. So it's important to keep them separate and to be precise in the language you use. The basic difference is that the odds ratio is a ratio of two odds (yep, it's that obvious) whereas the relative risk is a ratio of two probabilities.
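A small worked example makes the difference concrete. The Python sketch below computes both measures from a 2x2 table; the counts are made up for illustration, and the argument names `a`, `b`, `c`, `d` are just a labeling convention assumed here.

```python
def relative_risk(a, b, c, d):
    """Ratio of two probabilities.
    a, b: events and non-events in the exposed group.
    c, d: events and non-events in the unexposed group."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed

def odds_ratio(a, b, c, d):
    """Ratio of two odds, using the same 2x2 table."""
    odds_exposed = a / b
    odds_unexposed = c / d
    return odds_exposed / odds_unexposed

# Hypothetical counts: 20 of 100 exposed and 10 of 100 unexposed have the event.
rr = relative_risk(20, 80, 10, 90)  # risks 0.20 vs 0.10
or_ = odds_ratio(20, 80, 10, 90)    # odds 0.25 vs about 0.11
print(rr, or_)
```

For these counts the relative risk is 2 (the risk doubles), while the odds ratio is about 2.25, so reporting one under the other's name would misstate the effect.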