Distributionally Robust Instrumental Variables Estimation

Qu, Zhaonan, Kwon, Yongchan

arXiv.org Machine Learning 

Instrumental variables (IV) estimation, also known as IV regression, is a fundamental method in econometrics and statistics for inferring causal relationships from observational data with unobserved confounding. It leverages access to additional variables (instruments) that affect the outcome exogenously and exclusively through the endogenous regressor to yield consistent causal estimates, even when the standard ordinary least squares (OLS) estimator is biased by unobserved confounding (Imbens and Angrist, 1994; Angrist et al., 1996; Imbens and Rubin, 2015). Over the years, IV estimation has become an indispensable tool for causal inference in empirical work in economics (Card and Krueger, 1994), as well as in the study of genetic and epidemiological data (Davey Smith and Ebrahim, 2003). Despite its widespread use in empirical and applied work, IV estimation faces important limitations and challenges, such as invalid instruments (Sargan, 1958; Murray, 2006), weak instruments (Staiger and Stock, 1997), non-compliance (Imbens and Angrist, 1994), and heteroskedasticity, especially in settings with weak instruments or highly leveraged datasets (Andrews et al., 2019; Young, 2022). These issues can significantly affect the validity and quality of estimation and inference using instrumental variables (Jiang, 2017). Much work has since been devoted to assessing and addressing these issues, including statistical tests (Hansen, 1982; Stock and Yogo, 2002), sensitivity analysis (Rosenbaum and Rubin, 1983; Bonhomme and Weidner, 2022), and additional assumptions or structures on the data generating process (Kolesár et al., 2015; Kang et al., 2016; Guo et al., 2018b). Recently, an emerging line of work has highlighted interesting connections between causality and the concepts of invariance and robustness (Peters et al., 2016; Meinshausen, 2018; Rothenhäusler et al., 2021; Bühlmann, 2020; Jakobsen and Peters, 2022; Fan et al., 2024).
Their guiding philosophy is that causal properties can be viewed as robustness against changes across heterogeneous environments, represented by a set P of data distributions.
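
To make the contrast between OLS and IV described above concrete, the following is a minimal simulation sketch, not the estimator studied in this paper. It assumes a hypothetical linear data generating process with a single instrument z, an unobserved confounder u, and a true causal effect beta = 1; the coefficients, the sample size, and the function names are illustrative choices only. Under these assumptions the OLS slope converges to roughly 1 + 2/3, while the simple IV (ratio-of-covariances) slope converges to beta.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (len(a) - 1)


def ols_slope(x, y):
    """OLS slope of y on x (with intercept): cov(x, y) / var(x)."""
    return cov(x, y) / cov(x, x)


def iv_slope(z, x, y):
    """Just-identified IV slope with one instrument: cov(z, y) / cov(z, x)."""
    return cov(z, y) / cov(z, x)


def simulate(n=50_000, beta=1.0, seed=0):
    """Draw (z, x, y) from an assumed, purely illustrative linear model."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        u = rng.gauss(0, 1)                          # unobserved confounder
        zi = rng.gauss(0, 1)                         # instrument, independent of u
        xi = zi + u + rng.gauss(0, 1)                # endogenous regressor
        yi = beta * xi + 2.0 * u + rng.gauss(0, 1)   # z affects y only through x
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y


z, x, y = simulate()
beta_ols = ols_slope(x, y)    # biased upward by the confounder u
beta_iv = iv_slope(z, x, y)   # consistent for beta when the instrument is valid
print(f"OLS estimate: {beta_ols:.3f}")
print(f"IV estimate:  {beta_iv:.3f}")
```

In this sketch the instrument is valid and strong by construction; the limitations listed above (invalid or weak instruments, heteroskedasticity) correspond to departures from exactly these assumptions.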