On the instrumental variable estimation with many weak and invalid instruments

Yiqi Lin, Frank Windmeijer, Xinyuan Song, Qingliang Fan

arXiv.org Machine Learning 

Recently, estimation of causal effects with high-dimensional observational data has drawn much attention in many research fields such as economics, epidemiology and genomics. The instrumental variable (IV) method is widely used when the treatment variable of interest is endogenous. As shown in Figure 1, the ideal IV needs to be correlated with the endogenous treatment variable (C1), it should not have a direct effect on the outcome (C2) and should not be related to unobserved confounders that affect both outcome and treatment (C3).

Figure 1: Relevance and Validity of IVs

Our research is motivated by the difficulty of finding IVs that satisfy all the above conditions. In applications, invalid IVs (violation of C2 or C3) (Davey Smith and Ebrahim, 2003; Kang et al., 2016; Windmeijer et al., 2019) and weak IVs (concerning the weak correlation in C1) (Bound et al., 1995; Staiger and Stock, 1997) are prevalent. A strand of literature studies the "many weak IVs" problem (Stock et al., 2002; Chao and Swanson, 2005). With the increasing availability of large datasets, IV models are often high-dimensional (Belloni et al., 2012; Lin et al., 2015; Fan and Zhong, 2018), and have potentially weak IVs (Andrews et al., 2018), and invalid IVs (Guo et al., 2018; Windmeijer et al., 2021). Among those problems, we mainly focus on the invalid IV problem, while allowing for potential high-dimensionality and weak signals.
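To make conditions C1 and C2 concrete, the following is a minimal simulation sketch. It is not the estimator studied in the paper. It assumes a single instrument and a linear data-generating process of our own choosing, and every name in it (simulate, iv_estimate, ols_estimate, alpha, gamma, beta) is hypothetical. Here gamma is the instrument's effect on the treatment (C1), alpha is its direct effect on the outcome (a violation of C2 when nonzero), and beta is the causal effect of interest. Under these assumptions the simple IV ratio targets beta + alpha / gamma, so an invalid instrument shifts the estimate away from beta.

```python
import numpy as np


def simulate(n, alpha, gamma, beta=1.0, seed=0):
    """Draw (z, d, y) from an assumed linear model with one instrument.

    alpha: direct effect of z on y (nonzero violates C2).
    gamma: effect of z on the treatment d (strength of C1).
    beta:  causal effect of d on y.
    """
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)   # instrument
    u = rng.normal(size=n)   # unobserved confounder, independent of z (C3 holds)
    d = gamma * z + u + rng.normal(size=n)
    y = beta * d + alpha * z + u + rng.normal(size=n)
    return z, d, y


def iv_estimate(z, d, y):
    """Just-identified IV (Wald) ratio: cov(z, y) / cov(z, d)."""
    zc = z - z.mean()
    return float(zc @ (y - y.mean()) / (zc @ (d - d.mean())))


def ols_estimate(d, y):
    """OLS slope of y on d, biased here because u drives both d and y."""
    dc = d - d.mean()
    return float(dc @ (y - y.mean()) / (dc @ dc))


n = 100_000

# Valid, relevant instrument: alpha = 0, gamma = 1.
z, d, y = simulate(n, alpha=0.0, gamma=1.0)
iv_valid = iv_estimate(z, d, y)
ols_valid = ols_estimate(d, y)

# Invalid instrument: direct effect alpha = 0.5 on the outcome.
z, d, y = simulate(n, alpha=0.5, gamma=1.0)
iv_invalid = iv_estimate(z, d, y)

print(f"OLS (confounded):      {ols_valid:.3f}")
print(f"IV, valid instrument:  {iv_valid:.3f}")
print(f"IV, invalid instrument: {iv_invalid:.3f}")
```

With beta = 1 and gamma = 1, the valid-instrument estimate should sit near 1 and the invalid-instrument estimate near 1.5, while OLS is pulled upward by the confounder. Setting gamma close to zero in the same sketch makes the ratio unstable, which is the weak-instrument problem referred to under C1.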
