Doubly robust identification of treatment effects from multiple environments

Piersilvio De Bartolomeis, Julia Kostin, Javier Abad, Yixin Wang, Fanny Yang

arXiv.org Machine Learning 

Treatment effects are key quantities of interest in applied domains such as medicine and the social sciences, as they quantify the impact of interventions like novel treatments or policies on outcomes of interest. To estimate them, researchers often rely on randomized trials, since randomizing the treatment assignment guarantees unbiased treatment effect estimates under mild assumptions. However, randomized trials face several limitations, such as small sample sizes, study populations that do not reflect those seen in the real world, and ethical or financial constraints. As a result, there is growing interest in using observational data to estimate treatment effects. A fundamental challenge in using observational data is the selection of a valid adjustment set, i.e., a set of covariates that can be used to identify and estimate the treatment effect. Although criteria for identifying valid adjustment sets are well established, they rely on knowledge of the underlying causal graph. When the graph is not known, practitioners often adjust for all available covariates [5]. Yet this approach runs the risk of including bad controls, that is, covariates that open backdoor paths between the treatment (T) and the outcome (Y), thereby introducing bias into the treatment effect estimate.
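
To make the risk of bad controls concrete, the following is a minimal simulation sketch, not the method of this paper. It assumes a hypothetical linear Gaussian "M-structure" with a continuous treatment for simplicity: two unobserved variables `u1` and `u2` are independent, `u1` affects the treatment, `u2` affects the outcome, and the observed covariate `x` is a collider caused by both. All variable names, the unit variances, and the true effect `tau = 1.0` are illustrative assumptions. Without adjustment, the regression of Y on T recovers `tau`; adjusting for `x` opens the backdoor path T <- u1 -> x <- u2 -> Y and biases the estimate.

```python
import numpy as np


def simulate_m_structure(n=200_000, tau=1.0, seed=0):
    """Draw data from an assumed linear M-structure (illustrative only)."""
    rng = np.random.default_rng(seed)
    u1 = rng.normal(size=n)                 # unobserved cause of T and X
    u2 = rng.normal(size=n)                 # unobserved cause of X and Y
    t = u1 + rng.normal(size=n)             # treatment (continuous here)
    x = u1 + u2 + rng.normal(size=n)        # collider: a "bad control"
    y = tau * t + u2 + rng.normal(size=n)   # outcome with true effect tau
    return t, x, y


def ols_coef_on_first(y, *regressors):
    """Return the OLS coefficient on the first regressor (with intercept)."""
    design = np.column_stack([np.ones_like(y), *regressors])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[1]


t, x, y = simulate_m_structure()

# No adjustment: u1 and u2 are independent, so T and Y share no open
# backdoor path and the coefficient is close to the true effect tau = 1.0.
unadjusted = ols_coef_on_first(y, t)

# Adjusting for the collider x opens T <- u1 -> x <- u2 -> Y.
# Under the unit-variance assumptions above, the population coefficient
# on T becomes tau - 0.2, i.e. the estimate is biased downward.
adjusted = ols_coef_on_first(y, t, x)

print(f"unadjusted estimate:      {unadjusted:.3f}")
print(f"adjusted for bad control: {adjusted:.3f}")
```

In this sketch the empty adjustment set is valid while the "adjust for everything" strategy is not, which is the failure mode described above. The size of the bias depends on the assumed coefficients and variances.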