Propensity score models are better when post-calibrated

Rom Gutman, Ehud Karavani, Yishai Shimoni

arXiv.org Machine Learning 

The propensity score is defined as the conditional probability of being assigned to a treatment (exposure) given one's observed confounding variables. It is very commonly used in methods for estimating causal effects from observational data, such as inverse probability weighting [1], propensity matching [2, 3], propensity stratification [4], and many doubly robust methods [5, 6, 7, 8]. Rosenbaum and Rubin [2] established theoretical guarantees ensuring that adjusting for the propensity score, rather than for the covariates themselves, is sufficient to achieve the conditional exchangeability needed to estimate a causal effect. However, these guarantees require the true conditional probabilities, and in practice not every model that takes data as input and outputs a number between zero and one correctly estimates them: the scores might not reliably represent true probabilities. A prediction model whose output scores accurately reflect probabilities is referred to as calibrated (note this is unrelated to the earlier notion of "propensity score calibration" from [9]). Calibration can be empirically evaluated with calibration curves (also called reliability curves), which compare the predicted scores against the corresponding observed rates of positive labels [10].
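To make this concrete, the following is a minimal sketch (not taken from the paper) of how one might post-calibrate a propensity model and check calibration with scikit-learn. The simulated confounders, coefficients, and model choices are illustrative assumptions; the paper's own experimental setup may differ.

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV, calibration_curve
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Simulated observational data (assumed for illustration):
# X are confounders, a is the binary treatment assignment.
n = 5000
X = rng.normal(size=(n, 5))
true_logit = X @ np.array([0.8, -0.5, 0.3, 0.0, 0.0])
a = rng.binomial(1, 1.0 / (1.0 + np.exp(-true_logit)))

X_train, X_test, a_train, a_test = train_test_split(X, a, random_state=0)

# A raw propensity model; boosted trees often output miscalibrated scores.
raw_model = GradientBoostingClassifier().fit(X_train, a_train)

# Post-calibration via isotonic regression on cross-validation folds.
calibrated_model = CalibratedClassifierCV(
    GradientBoostingClassifier(), method="isotonic", cv=5
).fit(X_train, a_train)

# Calibration curve: observed treatment rate vs. mean predicted score per bin.
for name, model in [("raw", raw_model), ("calibrated", calibrated_model)]:
    scores = model.predict_proba(X_test)[:, 1]
    frac_treated, mean_pred = calibration_curve(a_test, scores, n_bins=10)
    # A well-calibrated model has frac_treated close to mean_pred in each bin.
    print(name, "mean |observed - predicted| per bin:",
          np.round(np.abs(frac_treated - mean_pred).mean(), 4))
```

Isotonic regression is only one post-calibration option; Platt scaling (`method="sigmoid"`) is the other choice built into `CalibratedClassifierCV`, and the appropriate method depends on the sample size and the shape of the miscalibration.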
