naipw
Multiply Robust Estimator Circumvents Hyperparameter Tuning of Neural Network Models in Causal Inference
Estimation of the Average Treatment Effect (ATE) is often carried out in two steps: in the first step, the treatment and outcome are modeled, and in the second step, the predictions are inserted into the ATE estimator. In the first step, numerous models can be fit to the treatment and outcome, including machine learning algorithms. However, it is difficult to choose the hyperparameter set that will result in the best causal effect estimation and inference. The Multiply Robust (MR) estimator allows us to leverage all the first-step models in a single estimator. We show that the MR estimator is $n^r$-consistent if one of the first-step treatment or outcome models is $n^r$-consistent. We also show that MR is the solution to a broad class of estimating equations, and is asymptotically normal if one of the treatment models is $\sqrt{n}$-consistent. We also derive a standard error for MR that does not require knowledge of the true first-step models. Our simulation study supports the theoretical findings.
Doubly Robust Estimation with Machine Learning Predictions
Rostami, Mehdi, Saarela, Olli, Escobar, Michael
The estimation of the Average Treatment Effect (ATE) as a causal parameter is carried out in two steps: in the first step, the treatment and outcome are modeled to incorporate the potential confounders, and in the second step, the predictions are inserted into an ATE estimator such as the Augmented Inverse Probability Weighting (AIPW) estimator. Because of concerns about nonlinear or unknown relationships between the confounders and the treatment and outcome, there has been interest in applying non-parametric methods such as Machine Learning (ML) algorithms instead. \cite{farrell2018deep} proposed using two separate Neural Networks (NNs) with no regularization on the network parameters other than the Stochastic Gradient Descent (SGD) used in the NN optimization. Our simulations indicate that the AIPW estimator suffers extensively if no regularization is utilized. We propose a normalization of AIPW (referred to as nAIPW), which can be helpful in some scenarios. nAIPW provably has the same properties as AIPW, namely double robustness and orthogonality \citep{chernozhukov2018double}. Further, if the first-step algorithms converge fast enough, then under regularity conditions \citep{chernozhukov2018double}, nAIPW will be asymptotically normal.
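The second step described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the abstract does not state the exact nAIPW formula, so the normalization below (dividing each inverse-probability-weighted residual sum by the sum of its weights, in the style of a Hajek estimator) is an assumption. The function names `aipw` and `naipw` and the arguments `g` (predicted propensity scores) and `q1`, `q0` (predicted outcomes under treatment and control) are hypothetical.

```python
import numpy as np


def aipw(y, a, g, q1, q0):
    """Standard AIPW estimate of the ATE from first-step predictions.

    y: observed outcomes, a: binary treatment indicators (0/1),
    g: predicted P(A=1 | X), q1/q0: predicted E[Y | A=1, X] and E[Y | A=0, X].
    """
    y, a, g, q1, q0 = (np.asarray(v, dtype=float) for v in (y, a, g, q1, q0))
    w1 = a / g                # inverse-probability weights, treated units
    w0 = (1 - a) / (1 - g)    # inverse-probability weights, control units
    return np.mean(q1 - q0) + np.mean(w1 * (y - q1)) - np.mean(w0 * (y - q0))


def naipw(y, a, g, q1, q0):
    """Normalized AIPW (assumed Hajek-style form, see the lead-in).

    The weighted residual sums are divided by the sum of the weights
    instead of by the sample size n.
    """
    y, a, g, q1, q0 = (np.asarray(v, dtype=float) for v in (y, a, g, q1, q0))
    w1 = a / g
    w0 = (1 - a) / (1 - g)
    return (
        np.mean(q1 - q0)
        + np.sum(w1 * (y - q1)) / np.sum(w1)
        - np.sum(w0 * (y - q0)) / np.sum(w0)
    )


def simulate(n=20000, seed=0):
    """Toy data with one confounder x and a true ATE of 2.0 (illustrative only)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    g = 1.0 / (1.0 + np.exp(-x))          # true propensity score
    a = rng.binomial(1, g)
    y = 2.0 * a + x + rng.normal(size=n)  # true outcome model
    return y, a, x, g


if __name__ == "__main__":
    y, a, x, g = simulate()
    # Here the true nuisance functions stand in for first-step model predictions.
    print("AIPW :", aipw(y, a, g, 2.0 + x, x))
    print("nAIPW:", naipw(y, a, g, 2.0 + x, x))
```

With this normalization, each correction term is a weighted average of residuals, so it cannot leave the range of the observed residuals even when some predicted propensities are close to 0 or 1. The unnormalized AIPW terms have no such bound, which is one plausible reading of why normalization can help when the first-step networks are unregularized.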