Goto

Collaborating Authors

 Lee, Hyung-Chul


EchoQA: A Large Collection of Instruction Tuning Data for Echocardiogram Reports

arXiv.org Artificial Intelligence

We introduce a novel question-answering (QA) dataset using echocardiogram reports sourced from the Medical Information Mart for Intensive Care database. This dataset is specifically designed to enhance QA systems in cardiology, consisting of 771,244 QA pairs addressing a wide array of cardiac abnormalities and their severity. We compare large language models (LLMs), including open-source and biomedical-specific models for zero-shot evaluation, and closed-source models for zero-shot and three-shot evaluation. Our results show that fine-tuning LLMs improves performance across various QA metrics, validating the value of our dataset. Clinicians also qualitatively evaluate the best-performing model to assess the LLM responses for correctness. Further, we conduct fine-grained fairness audits to assess the bias-performance trade-off of LLMs across various social determinants of health. Our objective is to propel the field forward by establishing a benchmark for LLM AI agents aimed at supporting clinicians with cardiac differential diagnoses, thereby reducing the documentation burden that contributes to clinician burnout and enabling healthcare professionals to focus more on patient care.


Unmasking Societal Biases in Respiratory Support for ICU Patients through Social Determinants of Health

arXiv.org Artificial Intelligence

Unmasking Societal Biases in Respiratory Support for ICU Patients through Social Determinants of Health Mira Moukheiber 1, Lama Moukheiber 1, Dana Moukheiber 1 and Hyung-Chul Lee 2, 1 Massachusetts Institute of Technology 2 Seoul National University College of Medicine, Seoul National University Hospital, Department of Anesthesiology and Pain Medicine vital@snu.ac.kr Abstract In critical care settings, where precise and timely interventions are crucial for health outcomes, evaluating disparities in patient outcomes is important. Current approaches often fall short in comprehensively understanding and evaluating the impact of respiratory support interventions on individuals affected by social determinants of health. Attributes such as gender, race, and age are commonly assessed and essential, but provide only a partial view of the complexities faced by diverse populations. In this study, we focus on two clinically motivated tasks: prolonged mechanical ventilation and successful weaning. We also perform fairness audits on the models' predictions across demographic groups and social determinants of health to better understand the health inequities in respiratory interventions in the intensive care unit. We also release a temporal benchmark dataset, verified by clinical experts, to enable benchmarking of clinical respiratory intervention tasks. 1 Introduction Critically-ill patients often find themselves in the intensive care unit (ICU) seeking specialized support for respiratory distress [ Doyle et al., 1995; Ware and Matthay, 2000 ] . Despite advances in supportive treatments, the in-hospital mortality rate remains 40% for conditions such as acute lung injury and acute respiratory distress syndrome [ Rubenfeld et al., 2005; Sweatt and Levitt, 2014 ] .