Propensity score models are better when post-calibrated
Rom Gutman, Ehud Karavani, Yishai Shimoni
The propensity score is defined as the conditional probability of being assigned to a treatment (exposure) given one's observed confounding variables. It is widely used in methods for estimating causal effects from observational data, such as inverse probability weighting [1], propensity matching [2, 3], and propensity stratification [4], as well as in many doubly robust methods [5, 6, 7, 8]. Rosenbaum and Rubin [2] established theoretical guarantees ensuring that adjusting for the propensity score, rather than for the covariates themselves, is sufficient to achieve the conditional exchangeability needed to estimate a causal effect. However, these guarantees require the true conditional probabilities, and in practice not every model that takes data as input and outputs a number between zero and one correctly estimates them; the scores may not reliably represent true probabilities. A prediction model that accurately outputs probabilities is referred to as calibrated (note that this is unrelated to the earlier notion of "propensity score calibration" from [9]). Calibration can be empirically evaluated with calibration curves (reliability curves), which compare the predicted scores with the corresponding rates of observed labels [10].
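The calibration check described above can be sketched as follows; this is a minimal illustration, not the authors' code, assuming scikit-learn's `calibration_curve` and a simulated dataset with hypothetical names (`covariates`, `treatment`):

```python
# Sketch: assessing calibration of a propensity model via a reliability curve.
# Data and variable names here are illustrative, not from the paper.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.calibration import calibration_curve

rng = np.random.default_rng(0)
covariates = rng.normal(size=(2000, 3))              # observed confounders
logits = covariates @ np.array([1.0, -0.5, 0.25])
treatment = rng.binomial(1, 1 / (1 + np.exp(-logits)))  # treatment assignment

# Fit a propensity model and obtain predicted treatment probabilities.
model = LogisticRegression().fit(covariates, treatment)
scores = model.predict_proba(covariates)[:, 1]

# Bin the predicted scores and compare each bin's mean prediction with the
# empirical treatment rate; a calibrated model lies near the diagonal.
prob_true, prob_pred = calibration_curve(treatment, scores, n_bins=10)
for pt, pp in zip(prob_true, prob_pred):
    print(f"predicted {pp:.2f} -> observed {pt:.2f}")
```

Plotting `prob_true` against `prob_pred` yields the reliability curve; systematic deviation from the identity line indicates miscalibrated propensity scores.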
Nov-2-2022