Contrastive Representation Learning Helps Cross-institutional Knowledge Transfer: A Study in Pediatric Ventilation Management
Yuxuan Liu, Jinpei Han, Padmanabhan Ramnarayan, A. Aldo Faisal
Machine learning has shown promising results in clinical decision support, particularly in complex intensive care settings [Gottesman et al., 2019]. However, developing robust models faces significant challenges: limited data availability, variations in clinical practice across institutions, and restricted data sharing. These constraints often yield models that perform well locally but fail to generalize across clinical settings [McDermott et al., 2021]. This cross-site generalization problem is a fundamental challenge for the real-world application of clinical ML, particularly when dealing with longitudinal patient data in Electronic Healthcare Records (EHR). Recent advances in generative AI and large foundation models have demonstrated the power of self-supervised representation learning in capturing transferable features from unlabeled data [Bommasani et al., 2021, Brown et al., 2020]. This capacity is particularly valuable for EHR applications, where obtaining high-quality labeled data is both costly and resource-intensive. Despite growing interest and successful applications of self-supervised learning to EHR time series data [Rasmy et al., 2021, Tu et al., 2024, Wornow et al., 2023], downstream evaluations have largely been restricted to single-institution settings, where test data, though held out, still originates from the same underlying population as the training data.
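As context for the contrastive representation learning named in the title, below is a minimal sketch of an InfoNCE-style contrastive objective over time-series embeddings. The abstract does not specify the paper's exact loss, encoder, or augmentations; the GRU encoder, Gaussian-jitter "views", and temperature here are illustrative assumptions only.

```python
# Minimal sketch of a contrastive (InfoNCE-style) objective for a time-series
# encoder, as commonly used in self-supervised representation learning.
# Encoder, augmentation, and temperature are illustrative assumptions,
# not the method described in the paper.
import torch
import torch.nn.functional as F


def info_nce_loss(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """z1, z2: (batch, dim) embeddings of two augmented views of the same stays."""
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature                      # pairwise cosine similarities
    targets = torch.arange(z1.size(0), device=z1.device)    # positives lie on the diagonal
    # Symmetrized cross-entropy: each view must identify its counterpart among the batch.
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))


if __name__ == "__main__":
    # Toy GRU encoder over EHR-like time series (hypothetical shapes).
    encoder = torch.nn.GRU(input_size=16, hidden_size=64, batch_first=True)
    x = torch.randn(32, 48, 16)                              # 32 stays, 48 time steps, 16 features
    # Two stochastic "views" per stay; Gaussian jitter stands in for a real augmentation.
    _, h1 = encoder(x + 0.01 * torch.randn_like(x))
    _, h2 = encoder(x + 0.01 * torch.randn_like(x))
    loss = info_nce_loss(h1.squeeze(0), h2.squeeze(0))
    loss.backward()
    print(float(loss))
```

Learning such representations without labels is what makes the features candidates for transfer across institutions where labeled outcomes are scarce.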
arXiv.org Artificial Intelligence
Jan-27-2025