Distributionally Robust Transfer Learning with Structurally Missing Covariates, with Application to Cross-National Cardiac Arrest Prediction
Li, Siqi, Hong, Chuan, Tian, Ziye, Leong, Benjamin Sieu-Hon, Nakagawa, Koshi, Tanaka, Hideharu, Shin, Sang Do, Dai, Khuong Quoc, Son, Do Ngoc, Ong, Marcus Eng Hock, Liu, Nan, Liu, Molei
Deploying clinical prediction models across healthcare systems often fails when key training covariates are unavailable at deployment and labeled outcomes are limited in the target domain. For example, high-performing models for out-of-hospital cardiac arrest (OHCA) rely on detailed prehospital measurements routinely collected in high-resource settings but unavailable in many international registries. Existing methods either discard missing covariates, sacrificing predictive information, or rely on untestable assumptions about their target distribution. We propose DRUM (\underline{D}istributionally \underline{R}obust \underline{U}nsupervised transfer learning with structurally \underline{M}issing covariates), a framework that transfers prediction models to target populations where certain covariates are structurally absent and outcome labels are unavailable. DRUM partitions covariates into shared components ($X$), observed across all settings, and missing components ($A$), observed only in the source. Rather than imputing missing covariates, DRUM optimizes worst-case predictive performance over the unknown target distribution of $A \mid X$ using a neural network generator, with a robustness parameter controlling allowable deviation from the source conditional. We further develop a bias correction procedure that reduces sensitivity to nuisance estimation error. Simulations show substantial improvements in both mean and worst-case prediction error under distribution shift. Applied to cross-national OHCA prediction, transferring models from a US registry to multiple Asian registries where prehospital variables are unrecorded, DRUM yields better-calibrated predictions and improved clinical classification performance across sites.
May-26-2026
- Genre:
- Research Report
- Experimental Study (1.00)
- New Finding (0.68)
- Research Report
- Industry:
- Technology:
- Information Technology
- Modeling & Simulation (1.00)
- Data Science (1.00)
- Artificial Intelligence > Machine Learning
- Statistical Learning (0.92)
- Neural Networks (0.66)
- Transfer Learning (0.61)
- Information Technology