A Methodology for a Scalable, Collaborative, and Resource-Efficient Platform to Facilitate Healthcare AI Research

Cohen, Raphael Y., Kovacheva, Vesela P.

arXiv.org Artificial Intelligence 

Recent advances in artificial intelligence (AI) in healthcare hold the potential to increase patient safety, augment efficiency, and improve patient outcomes. In clinical care, AI technologies can aid physicians in diagnosis and treatment selection, in risk prediction and stratification, and in improving patient and clinician efficiency [1]. The past decade has seen a vast expansion of available AI technologies, creating considerable interest in healthcare data science. Yet most sophisticated AI models exist only in high-profile publications, and only a few are implemented in clinical practice [2, 3]. The barriers to translating data science research into patient care are inadequate data quality, scarce resources, and stringent patient confidentiality requirements. Following the Health Information Technology for Economic and Clinical Health (HITECH) Act of 2009, many institutions have transitioned to electronic medical records, which provide a rich source of medical data. Initially developed for administrative purposes, most electronic health record (EHR) systems store patient data in heterogeneous formats, sometimes combined with legacy systems. In addition to structured data on medications, laboratory results, and imaging, there are large amounts of unstructured data such as physician notes, discharge summaries, and reports. EHR data also have a significant degree of missingness, misclassification, and error [4].