RT-Surv: Improving Mortality Prediction After Radiotherapy with Large Language Model Structuring of Large-Scale Unstructured Electronic Health Records
Park, Sangjoon; Wee, Chan Woo; Choi, Seo Hee; Kim, Kyung Hwan; Chang, Jee Suk; Yoon, Hong In; Lee, Ik Jae; Kim, Yong Bae; Cho, Jaeho; Keum, Ki Chang; Lee, Chang Geol; Byun, Hwa Kyung; Koom, Woong Sub
arXiv.org Artificial Intelligence
Research in context

Evidence before this study
We performed a comprehensive PubMed search for articles published in English up to August 1, 2024, using the search terms "radiotherapy" or "radiation therapy" in combination with "survival prediction" or "mortality prediction." This search yielded 345 studies. Most of them focused on survival prediction for specific cancer types, and relatively few addressed survival prediction following radiotherapy more broadly. Most of the identified studies employed statistical models requiring manually structured variables that are not easily extractable from electronic health records (EHRs). Only four studies used variables that could be easily extracted from EHRs for survival prediction, but these studies lacked critical information about disease status and overall patient condition, which is typically captured in unstructured EHR data. Instead, they relied on traditional structured data such as blood test results or national registry information, or on small datasets that were manually structured. Notably, no studies employed advanced flexible models, such as large language models (LLMs), to automate the structuring of unstructured data and incorporate it into survival prediction.

Added value of this study
Our findings suggest the potential of LLMs to process extensive unstructured data that would be impractical to structure manually. LLMs demonstrated high accuracy in structuring unstructured data, even without extensive tuning, using a single-shot example approach. Our study is the first to demonstrate that the appropriate application of LLMs can improve the prognosis of patients and the quality of healthcare delivery.

Implications of all the available evidence
The RT-Surv framework developed in this study has broad applications beyond radiation oncology. As unstructured clinical records form the basis of EHR data across all medical specialties, this framework can be adapted to reduce overall hospital mortality rates, predict length of stay, and assess complication risks. Its ability to automatically structure large volumes of unstructured data enables more accurate and efficient use of clinical data across various domains.
Sep-13-2024
- Genre:
  - Research Report
    - Experimental Study (1.00)
    - New Finding (1.00)
- Industry:
  - Health & Medicine
    - Health Care Technology > Medical Record (1.00)
    - Nuclear Medicine (1.00)
    - Therapeutic Area > Oncology (1.00)