What's a good imputation to predict with missing values?
Marine Le Morvan, Julie Josse, Gaël Varoquaux
Neural Information Processing Systems
How to learn a good predictor on data with missing values? Most efforts focus on first imputing as well as possible and second learning on the completed data to predict the outcome. Yet, this widespread practice has no theoretical grounding. Here we show that for almost all imputation functions, an impute-then-regress procedure with a powerful learner is Bayes optimal. This result holds for all missing-values mechanisms, in contrast with the classic statistical results that require missing-at-random settings to use imputation in probabilistic modeling.
May-29-2025, 00:28:00 GMT