Youn, Jiwon
Brain-Aware Readout Layers in GNNs: Advancing Alzheimer's early Detection and Neuroimaging
Youn, Jiwon, Kang, Dong Woo, Lim, Hyun Kook, Kim, Mansu
Alzheimer's disease (AD) is a neurodegenerative disorder characterized by progressive memory and cognitive decline, affecting millions worldwide. Diagnosing AD is challenging due to its heterogeneous nature and variable progression. This study introduces a novel brain-aware readout layer (BA readout layer) for Graph Neural Networks (GNNs), designed to improve interpretability and predictive accuracy in neuroimaging for early AD diagnosis. By clustering brain regions based on functional connectivity and node embedding, this layer improves the GNN's capability to capture complex brain network characteristics. We analyzed neuroimaging data from 383 participants, including both cognitively normal and preclinical AD individuals, using T1-weighted MRI, resting-state fMRI, and FBB-PET to construct brain graphs. Our results show that GNNs with the BA readout layer significantly outperform traditional models in predicting the Preclinical Alzheimer's Cognitive Composite (PACC) score, demonstrating higher robustness and stability. The adaptive BA readout layer also offers enhanced interpretability by highlighting task-specific brain regions critical to cognitive functions impacted by AD. These findings suggest that our approach provides a valuable tool for the early diagnosis and analysis of Alzheimer's disease.
CXR-LLAVA: a multimodal large language model for interpreting chest X-ray images
Lee, Seowoo, Youn, Jiwon, Kim, Hyungjin, Kim, Mansu, Yoon, Soon Ho
Purpose: This study aimed to develop an open-source multimodal large language model (CXR-LLAVA) for interpreting chest X-ray images (CXRs), leveraging recent advances in large language models (LLMs) to potentially replicate the image interpretation skills of human radiologists Materials and Methods: For training, we collected 592,580 publicly available CXRs, of which 374,881 had labels for certain radiographic abnormalities (Dataset 1) and 217,699 provided free-text radiology reports (Dataset 2). After pre-training a vision transformer with Dataset 1, we integrated it with an LLM influenced by the LLAVA network. Then, the model was fine-tuned, primarily using Dataset 2. The model's diagnostic performance for major pathological findings was evaluated, along with the acceptability of radiologic reports by human radiologists, to gauge its potential for autonomous reporting. Results: The model demonstrated impressive performance in test sets, achieving an average F1 score of 0.81 for six major pathological findings in the MIMIC internal test set and 0.62 for seven major pathological findings in the external test set. The model's F1 scores surpassed those of GPT-4-vision and Gemini-Pro-Vision in both test sets. In human radiologist evaluations of the external test set, the model achieved a 72.7% success rate in autonomous reporting, slightly below the 84.0% rate of ground truth reports. Conclusion: This study highlights the significant potential of multimodal LLMs for CXR interpretation, while also acknowledging the performance limitations. Despite these challenges, we believe that making our model open-source will catalyze further research, expanding its effectiveness and applicability in various clinical contexts. CXR-LLAVA is available at https://github.com/ECOFRI/CXR_LLAVA.