Combining Observational and Randomized Data for Estimating Heterogeneous Treatment Effects
Tobias Hatt, Jeroen Berrevoets, Alicia Curth, Stefan Feuerriegel, Mihaela van der Schaar
Estimating heterogeneous treatment effects is an important problem across many domains. To estimate such treatment effects accurately, one typically relies on data from observational studies or randomized experiments. Most existing works rely exclusively on observational data, which is often confounded and hence yields biased estimates. Randomized data, by contrast, is unconfounded, but its sample size is usually too small to learn heterogeneous treatment effects. In this paper, we propose to estimate heterogeneous treatment effects by combining large amounts of observational data and small amounts of randomized data via representation learning. In particular, we introduce a two-step framework: first, we use observational data to learn a shared structure (in the form of a representation); then, we use randomized data to learn the data-specific structures. We analyze the finite sample properties of our framework and compare them to several natural baselines. From this analysis, we derive conditions for when combining observational and randomized data is beneficial, and for when it is not. Based on this, we introduce a sample-efficient algorithm, called CorNet. We use extensive simulation studies to verify the theoretical properties of CorNet and multiple real-world datasets to demonstrate our method's superiority compared to existing methods.
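The two-step idea in the abstract can be illustrated with a small toy sketch. This is not CorNet and not the authors' implementation: it assumes linear outcomes, a two-dimensional linear representation, and ridge regression throughout, and every name in it (`ridge_fit`, `learn_representation`, `fit_heads`, `predict_cate`, `simulate`) is hypothetical. The simulated observational data contains an unobserved confounder that shifts outcome levels but not covariate slopes, so the slopes learned in step 1 remain useful while the levels are re-estimated from the small randomized sample in step 2.

```python
import numpy as np


def ridge_fit(X, y, lam=1e-2):
    """Closed-form ridge regression; the last returned weight is the intercept."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    A = Xb.T @ Xb + lam * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)


def learn_representation(X_obs, t_obs, y_obs, lam=1e-2):
    """Step 1 (toy version): learn a linear representation from observational data.

    One ridge model is fit per treatment arm; the two slope vectors are
    orthonormalised and used as a d x 2 projection matrix.
    """
    w0 = ridge_fit(X_obs[t_obs == 0], y_obs[t_obs == 0], lam)[:-1]
    w1 = ridge_fit(X_obs[t_obs == 1], y_obs[t_obs == 1], lam)[:-1]
    Q, _ = np.linalg.qr(np.column_stack([w0, w1]))
    return Q


def fit_heads(Q, X_rct, t_rct, y_rct, lam=1e-2):
    """Step 2 (toy version): fit one small outcome head per arm on randomized data."""
    Z = X_rct @ Q
    h0 = ridge_fit(Z[t_rct == 0], y_rct[t_rct == 0], lam)
    h1 = ridge_fit(Z[t_rct == 1], y_rct[t_rct == 1], lam)
    return h0, h1


def predict_cate(Q, h0, h1, X):
    """Estimated treatment effect: difference of the two heads on the representation."""
    Zb = np.hstack([X @ Q, np.ones((X.shape[0], 1))])
    return Zb @ h1 - Zb @ h0


# --- Synthetic data (all settings below are assumptions made for illustration) ---
rng = np.random.default_rng(0)
d, n_obs, n_rct, n_test = 20, 5000, 200, 1000
a, b = rng.normal(size=d), rng.normal(size=d)


def simulate(n, randomized):
    X = rng.normal(size=(n, d))
    u = rng.normal(size=n)  # unobserved confounder
    # Randomized: coin flip. Observational: treatment probability depends on u.
    p = np.full(n, 0.5) if randomized else 1.0 / (1.0 + np.exp(-2.0 * u))
    t = rng.binomial(1, p)
    y = X @ a + t * (X @ b) + 2.0 * u + 0.5 * rng.normal(size=n)
    return X, t, y


X_obs, t_obs, y_obs = simulate(n_obs, randomized=False)
X_rct, t_rct, y_rct = simulate(n_rct, randomized=True)
X_test = rng.normal(size=(n_test, d))
tau_true = X_test @ b  # true effect in this simulation

# Combined estimator: representation from observational data, heads from randomized data.
Q = learn_representation(X_obs, t_obs, y_obs)
h0, h1 = fit_heads(Q, X_rct, t_rct, y_rct)
tau_combined = predict_cate(Q, h0, h1, X_test)

# Baseline: per-arm ridge models fit on observational data only.
w0 = ridge_fit(X_obs[t_obs == 0], y_obs[t_obs == 0])
w1 = ridge_fit(X_obs[t_obs == 1], y_obs[t_obs == 1])
tau_obs = np.hstack([X_test, np.ones((n_test, 1))]) @ (w1 - w0)


def rmse(estimate):
    return float(np.sqrt(np.mean((estimate - tau_true) ** 2)))


rmse_obs, rmse_combined = rmse(tau_obs), rmse(tau_combined)
print(f"observational only: {rmse_obs:.3f}   combined: {rmse_combined:.3f}")
```

In this particular simulation the confounder only biases the level of each arm's outcome, which is a favorable case for combining the two data sources; the abstract notes that the paper also derives conditions under which combining is not beneficial, and this sketch does not explore those.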
Feb-25-2022
- Country:
- Asia > Middle East
- Jordan (0.04)
- Europe
- Germany > Bavaria
- Upper Bavaria > Munich (0.04)
- Switzerland > Zürich
- Zürich (0.04)
- United Kingdom > England
- Cambridgeshire > Cambridge (0.14)
- North America > United States
- Tennessee (0.04)
- Genre:
- Research Report
- Experimental Study (1.00)
- New Finding (1.00)
- Strength High (1.00)
- Industry:
- Education (1.00)
- Health & Medicine
- Epidemiology (0.93)
- Pharmaceuticals & Biotechnology (1.00)
- Therapeutic Area
- Immunology (1.00)
- Infections and Infectious Diseases (1.00)
- Technology:
- Information Technology
- Artificial Intelligence
- Machine Learning
- Neural Networks > Deep Learning (0.45)
- Statistical Learning > Regression (0.45)
- Natural Language (1.00)
- Representation & Reasoning (0.67)
- Data Science > Data Mining (1.00)
- Information Management (0.92)
- Modeling & Simulation (0.87)