What's a good imputation to predict with missing values?

Marine Le Morvan, Julie Josse, Gaël Varoquaux

May-29-2025, 00:28:00 GMT–Neural Information Processing Systems

How to learn a good predictor on data with missing values? Most efforts focus on first imputing as well as possible and second learning on the completed data to predict the outcome. Yet, this widespread practice has no theoretical grounding. Here we show that for almost all imputation functions, an impute-then-regress procedure with a powerful learner is Bayes optimal. This result holds for all missing-values mechanisms, in contrast with the classic statistical results that require missing-at-random settings to use imputation in probabilistic modeling.
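
As a rough illustration of the impute-then-regress procedure described above, the following sketch imputes a single feature with a constant and then fits a flexible learner on the completed data. Everything in it is an assumption made for illustration and is not taken from the paper: the toy data, the self-masking missingness rule (values above 0.5 are hidden, so the data are not missing at random), the choice of the observed mean as the imputation constant, the k-nearest-neighbors regressor standing in for a "powerful learner", and the helper names `impute_constant` and `knn_predict`.

```python
import random
import statistics


def impute_constant(xs, constant):
    """Replace missing entries (None) with a fixed constant."""
    return [constant if x is None else x for x in xs]


def knn_predict(train_x, train_y, query, k=50):
    """Predict by averaging the targets of the k nearest training points."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: abs(p[0] - query))[:k]
    return statistics.fmean(y for _, y in nearest)


random.seed(0)

# Hypothetical toy data: y = x + noise, with x uniform on [0, 1].
full_x = [random.random() for _ in range(2000)]
y = [x + random.gauss(0, 0.05) for x in full_x]

# Assumed missingness rule: x is hidden whenever it exceeds 0.5,
# so missingness depends on the unobserved value itself.
observed_x = [x if x <= 0.5 else None for x in full_x]

# Step 1 (impute): fill every missing entry with the mean of the observed values.
constant = statistics.fmean(x for x in observed_x if x is not None)
imputed_x = impute_constant(observed_x, constant)

# Step 2 (regress): a non-parametric learner is fit on the completed data.
# A query equal to the imputation constant represents a sample whose x is missing.
pred_missing = knn_predict(imputed_x, y, constant)
# A query away from the constant represents a sample whose x was observed.
pred_observed = knn_predict(imputed_x, y, 0.1)

print(f"prediction when x is missing:  {pred_missing:.3f}")
print(f"prediction when x is observed: {pred_observed:.3f}")
```

In this toy setting the learner can treat the imputed value as a flag for missingness and predict a different outcome there than at nearby observed values, even though the imputation itself is a poor guess of the hidden value. The sketch only illustrates the mechanics on one example; it does not establish the general result stated in the abstract.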

artificial intelligence, imputation, machine learning, (17 more...)

Country:
- Europe > France (0.14)

Technology:
- Information Technology
  - Artificial Intelligence > Machine Learning
    - Neural Networks (1.00)
    - Statistical Learning > Regression (0.31)
  - Data Science (1.00)
  - Modeling & Simulation (0.93)