Patchwork Learning: A Paradigm Towards Integrative Analysis across Diverse Biomedical Data Sources
Rajendran, Suraj, Pan, Weishen, Sabuncu, Mert R., Chen, Yong, Zhou, Jiayu, Wang, Fei
arXiv.org Artificial Intelligence
Machine learning (ML) in healthcare presents numerous opportunities for enhancing patient care, population health, and healthcare providers' workflows. However, the real-world clinical and cost benefits remain limited due to challenges in data privacy, heterogeneous data sources, and the inability to fully leverage multiple data modalities. In this perspective paper, we introduce "patchwork learning" (PL), a novel paradigm that addresses these limitations by integrating information from disparate datasets composed of different data modalities (e.g., clinical free-text, medical images, omics) and distributed across separate and secure sites. PL allows the simultaneous utilization of complementary data sources while preserving data privacy, enabling the development of more holistic and generalizable ML models. We present the concept of patchwork learning and its current implementations in healthcare, exploring the potential opportunities and applicable data sources for addressing various healthcare challenges. PL leverages bridging modalities or overlapping feature spaces across sites to facilitate information sharing and impute missing data, thereby addressing related prediction tasks. We discuss the challenges associated with PL, many of which are shared by federated and multimodal learning, and provide recommendations for future research in this field. By offering a more comprehensive approach to healthcare data integration, patchwork learning has the potential to revolutionize the clinical applicability of ML models. This paradigm promises to strike a balance between personalization and generalizability, ultimately enhancing patient experiences, improving population health, and optimizing healthcare providers' workflows.
Introduction
Machine learning (ML) in healthcare is a rapidly evolving field, presenting numerous opportunities for progress. Active and passive patient data collection, both during and outside medical care, can be utilized to address health challenges. As a result, ML has become an essential tool for processing and analyzing these data in various domains, including natural language processing and computer vision. ML systems have demonstrated their potential to enhance patient experiences, improve population health, reduce per capita healthcare costs, and optimize healthcare providers' workflows.
Data privacy is a major challenge facing the use of ML in healthcare, as it restricts the potential for pooling electronic health record (EHR) data from multiple sites. While single-modality models exist (e.g., for clinical notes, lab tests, omics, or medical images), systems that simultaneously leverage multiple modalities are relatively scarce. Multimodal machine learning (MML) combines disparate data sources to capitalize on complementary information, thereby improving performance.
May-13-2023
- Country:
- North America > United States
- Michigan (0.28)
- New York (0.28)
- Pennsylvania (0.28)
- Genre:
- Research Report
- Experimental Study (1.00)
- Promising Solution (0.92)
- Strength High (0.67)
- Industry:
- Health & Medicine
- Diagnostic Medicine > Imaging (1.00)
- Health Care Technology > Medical Record (1.00)
- Therapeutic Area
- Cardiology/Vascular Diseases (0.93)
- Neurology (0.93)
- Oncology (1.00)
- Information Technology > Security & Privacy (1.00)
- Technology: