Treatment Effect Estimation with Data-Driven Variable Decomposition
Kuang, Kun (Tsinghua University) | Cui, Peng (Tsinghua University) | Li, Bo (Tsinghua University) | Jiang, Meng (University of Illinois Urbana-Champaign) | Yang, Shiqiang (Tsinghua University) | Wang, Fei (Cornell University)
One fundamental problem in causal inference is estimating treatment effects in observational studies when variables are confounded. Confounding is generally controlled for with the propensity score, but this approach treats all observed variables as confounders and ignores the adjustment variables, which have no influence on the treatment but are predictive of the outcome. Recently, it has been demonstrated that adjustment variables are effective in reducing the variance of the estimated treatment effect. However, how to automatically separate confounders from adjustment variables in observational studies is still an open problem, especially with high-dimensional variables, which are common in the big data era. In this paper, we propose a Data-Driven Variable Decomposition (D$^2$VD) algorithm, which can 1) automatically separate confounders and adjustment variables with a data-driven approach, and 2) simultaneously estimate the treatment effect in observational studies with high-dimensional variables. Under standard assumptions, we show experimentally that the proposed D$^2$VD algorithm separates the variables precisely, and estimates the treatment effect more accurately and with tighter confidence intervals than state-of-the-art methods on both synthetic data and a real online advertising dataset.
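The role the abstract assigns to adjustment variables can be illustrated with a small simulation. The sketch below is not the D$^2$VD algorithm: it assumes the split between confounders (`x`) and adjustment variables (`z`) is already known, whereas the paper learns that split from data. All function names, coefficients, and sample sizes are hypothetical choices made for illustration. It compares a plain inverse-propensity-weighted (IPW) estimate of the average treatment effect with one in which the part of the outcome explained by `z` is subtracted before weighting.

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def simulate(n, rng, true_ate=2.0):
    """Synthetic data (assumed model): x drives treatment and outcome, z only the outcome."""
    x = rng.normal(size=(n, 2))                      # confounders
    z = rng.normal(size=(n, 2))                      # adjustment variables
    t = rng.binomial(1, sigmoid(0.8 * x[:, 0] - 0.5 * x[:, 1]))
    y = (true_ate * t
         + x @ np.array([1.0, 1.0])
         + z @ np.array([2.0, -2.0])
         + rng.normal(size=n))
    return x, z, t, y

def fit_propensity(x, t, iters=25):
    """Logistic regression of t on x via Newton's method; returns fitted propensities."""
    design = np.column_stack([np.ones(len(x)), x])
    beta = np.zeros(design.shape[1])
    for _ in range(iters):
        p = sigmoid(design @ beta)
        grad = design.T @ (t - p)
        hess = design.T @ (design * (p * (1.0 - p))[:, None])
        beta += np.linalg.solve(hess, grad)
    return sigmoid(design @ beta)

def ipw_ate(t, y, e):
    """Plain IPW estimate of the average treatment effect."""
    return np.mean(t * y / e - (1 - t) * y / (1 - e))

def adjusted_ipw_ate(z, t, y, e):
    """IPW after removing the linear contribution of the adjustment variables z."""
    design = np.column_stack([np.ones(len(z)), z])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residual = y - z @ coef[1:]                      # intercept is left in the outcome
    return ipw_ate(t, residual, e)

def compare(n_reps=200, n=2000, seed=0):
    """Repeat the simulation and collect both estimates for each replicate."""
    rng = np.random.default_rng(seed)
    plain, adjusted = [], []
    for _ in range(n_reps):
        x, z, t, y = simulate(n, rng)
        e = fit_propensity(x, t)
        plain.append(ipw_ate(t, y, e))
        adjusted.append(adjusted_ipw_ate(z, t, y, e))
    return np.array(plain), np.array(adjusted)

if __name__ == "__main__":
    plain, adjusted = compare()
    print("plain IPW:    mean %.3f, std %.3f" % (plain.mean(), plain.std()))
    print("adjusted IPW: mean %.3f, std %.3f" % (adjusted.mean(), adjusted.std()))
```

Under this assumed data-generating model both estimators center on the simulated effect, and the adjusted one varies less across replicates, which is the variance-reduction effect the abstract attributes to adjustment variables.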
Feb-14-2017
- Country:
- Europe (0.14)
- North America > United States
- Illinois (0.14)
- Genre:
- Research Report (1.00)
- Industry:
- Health & Medicine > Therapeutic Area
- Immunology (0.46)
- Information Technology > Services (0.49)
- Marketing (0.89)
- Technology:
- Information Technology
- Artificial Intelligence > Machine Learning
- Performance Analysis > Accuracy (0.46)
- Statistical Learning > Regression (0.46)
- Communications (0.69)
- Data Science > Data Mining (0.66)