Ding, Jun-En
MedPlan: A Two-Stage RAG-Based System for Personalized Medical Plan Generation
Hsu, Hsin-Ling, Dao, Cong-Tinh, Wang, Luning, Shuai, Zitao, Phan, Thao Nguyen Minh, Ding, Jun-En, Liao, Chun-Chieh, Hu, Pengfei, Han, Xiaoxue, Hsu, Chih-Ho, Luo, Dongsheng, Peng, Wen-Chih, Liu, Feng, Hung, Fang-Ming, Wu, Chenwei
Despite recent success in applying large language models (LLMs) to electronic health records (EHR), most systems focus primarily on assessment rather than treatment planning. We identify three critical limitations in current approaches: they generate treatment plans in a single pass rather than following the sequential reasoning process used by clinicians; they rarely incorporate patient-specific historical context; and they fail to effectively distinguish between subjective and objective clinical information. Motivated by the SOAP methodology (Subjective, Objective, Assessment, Plan), we introduce MedPlan, a novel framework that structures LLM reasoning to align with real-life clinician workflows. Our approach employs a two-stage architecture that first generates a clinical assessment based on patient symptoms and objective data, then formulates a structured treatment plan informed by this assessment and enriched with patient-specific information through retrieval-augmented generation. Comprehensive evaluation demonstrates that our method significantly outperforms baseline approaches in both assessment accuracy and treatment plan quality.
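The two-stage flow described above can be sketched as follows. This is a minimal illustration only, not the paper's implementation: `generate` is a placeholder standing in for an LLM call, and the keyword-overlap retriever stands in for a real embedding-based RAG component.

```python
def retrieve_similar_cases(subjective, objective, case_db):
    # Placeholder retrieval: rank stored cases by shared keywords.
    # A real system would use dense embeddings and a vector index.
    query = set((subjective + " " + objective).lower().split())
    scored = sorted(case_db, key=lambda c: -len(query & set(c.lower().split())))
    return scored[:1]

def generate(prompt):
    # Stand-in for an LLM call; tags the prompt so the two-stage
    # flow can be traced end to end.
    return "DRAFT[" + prompt[:40] + "...]"

def medplan_two_stage(subjective, objective, case_db):
    # Stage 1: assessment from subjective and objective data (SOAP ordering).
    assessment = generate(f"Assess: S={subjective} O={objective}")
    # Stage 2: plan conditioned on the assessment plus retrieved patient history.
    context = retrieve_similar_cases(subjective, objective, case_db)
    plan = generate(f"Plan given assessment={assessment} context={context}")
    return assessment, plan
```

The key design point is that the plan prompt consumes the stage-1 assessment rather than the raw inputs, mirroring the clinician's sequential S/O → A → P reasoning.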
NeuroTree: Hierarchical Functional Brain Pathway Decoding for Mental Health Disorders
Ding, Jun-En, Luo, Dongsheng, Zilverstand, Anna, Liu, Feng
Analyzing functional brain networks using functional magnetic resonance imaging (fMRI) is crucial for understanding psychiatric disorders and addictive behaviors. While existing fMRI-based graph convolutional networks (GCNs) show considerable promise for feature extraction, they often fall short in characterizing complex relationships between brain regions and demographic factors and accounting for interpretable variables linked to psychiatric conditions. We propose NeuroTree to overcome these limitations, integrating a k-hop AGE-GCN with neural ordinary differential equations (ODEs). This framework leverages an attention mechanism to optimize functional connectivity (FC), thereby enhancing dynamic FC feature learning for brain disease classification. Furthermore, NeuroTree effectively decodes fMRI network features into tree structures, which improves the capture of high-order brain regional pathway features and enables the identification of hierarchical neural behavioral patterns essential for understanding disease-related brain subnetworks. Our empirical evaluations demonstrate that NeuroTree achieves state-of-the-art performance across two distinct mental disorder datasets and provides valuable insights into age-related deterioration patterns. These findings underscore the model's efficacy in predicting psychiatric disorders and elucidating their underlying neural mechanisms.
Multi-OphthaLingua: A Multilingual Benchmark for Assessing and Debiasing LLM Ophthalmological QA in LMICs
Restrepo, David, Wu, Chenwei, Tang, Zhengxu, Shuai, Zitao, Phan, Thao Nguyen Minh, Ding, Jun-En, Dao, Cong-Tinh, Gallifant, Jack, Dychiao, Robyn Gayle, Artiaga, Jose Carlo, Bando, André Hiroshi, Gracitelli, Carolina Pelegrini Barbosa, Ferrer, Vincenz, Celi, Leo Anthony, Bitterman, Danielle, Morley, Michael G, Nakayama, Luis Filipe
Current ophthalmology clinical workflows are plagued by over-referrals, long waits, and complex and heterogeneous medical records. Large language models (LLMs) present a promising solution to automate various procedures such as triaging, preliminary tests like visual acuity assessment, and report summaries. However, LLMs have demonstrated significantly varied performance across different languages in natural language question-answering tasks, potentially exacerbating healthcare disparities in Low and Middle-Income Countries (LMICs). This study introduces the first multilingual ophthalmological question-answering benchmark with manually curated questions parallel across languages, allowing for direct cross-lingual comparisons. Our evaluation of 6 popular LLMs across 7 different languages reveals substantial bias across different languages, highlighting risks for clinical deployment of LLMs in LMICs. Existing debiasing methods such as Translation Chain-of-Thought or retrieval-augmented generation (RAG) by themselves fall short of closing this performance gap, often failing to improve performance across all languages and lacking specificity for the medical domain. To address this issue, we propose CLARA (Cross-Lingual Reflective Agentic system), a novel inference-time debiasing method leveraging retrieval-augmented generation and self-verification. Our approach not only improves performance across all languages but also significantly reduces the multilingual bias gap, facilitating equitable LLM application across the globe.
MEDFuse: Multimodal EHR Data Fusion with Masked Lab-Test Modeling and Large Language Models
Phan, Thao Minh Nguyen, Dao, Cong-Tinh, Wu, Chenwei, Wang, Jian-Zhe, Liu, Shun, Ding, Jun-En, Restrepo, David, Liu, Feng, Hung, Fang-Ming, Peng, Wen-Chih
Electronic health records (EHRs) are multimodal by nature, consisting of structured tabular features like lab tests and unstructured clinical notes. In real-life clinical practice, doctors use complementary multimodal EHR data sources to get a clearer picture of patients' health and support clinical decision-making. However, most EHR predictive models do not reflect these procedures, as they either focus on a single modality or overlook the inter-modality interactions/redundancy. In this work, we propose MEDFuse, a Multimodal EHR Data Fusion framework that incorporates masked lab-test modeling and large language models (LLMs) to effectively integrate structured and unstructured medical data. MEDFuse leverages multimodal embeddings extracted from two sources: LLMs fine-tuned on free clinical text and masked tabular transformers trained on structured lab test results. We design a disentangled transformer module, optimized by a mutual information loss to 1) decouple modality-specific and modality-shared information and 2) extract useful joint representation from the noise and redundancy present in clinical notes. Through comprehensive validation on the public MIMIC-III dataset and the in-house FEMH dataset, MEDFuse demonstrates great potential in advancing clinical predictions, achieving over 90% F1 score in the 10-disease multi-label classification task.
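The masked lab-test modeling mentioned above can be illustrated with a short sketch. This is a generic BERT-style masked-modeling objective applied to tabular data, under assumed names; it is not MEDFuse's actual training code.

```python
import numpy as np

def mask_lab_tests(x, mask_ratio=0.15, mask_value=0.0, rng=None):
    # Randomly hide a fraction of lab-test values; the model is trained to
    # reconstruct the hidden entries from the visible ones.
    rng = rng or np.random.default_rng(0)
    mask = rng.random(x.shape) < mask_ratio
    x_masked = np.where(mask, mask_value, x)
    return x_masked, mask

def reconstruction_loss(pred, target, mask):
    # Mean squared error computed only on masked positions,
    # as in masked-modeling objectives.
    if mask.sum() == 0:
        return 0.0
    return float(((pred - target) ** 2 * mask).sum() / mask.sum())
```

Training a tabular transformer against this objective yields lab-test embeddings that can then be fused with the LLM's clinical-text embeddings.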
Large Language Multimodal Models for 5-Year Chronic Disease Cohort Prediction Using EHR Data
Ding, Jun-En, Thao, Phan Nguyen Minh, Peng, Wen-Chih, Wang, Jian-Zhe, Chug, Chun-Cheng, Hsieh, Min-Chen, Tseng, Yun-Chien, Chen, Ling, Luo, Dongsheng, Wang, Chi-Te, Chen, Pei-fu, Liu, Feng, Hung, Fang-Ming
Chronic diseases such as diabetes are the leading causes of morbidity and mortality worldwide. Numerous studies have applied various deep learning models to their diagnosis. However, most previous studies had certain limitations, including reliance on publicly available datasets (e.g., MIMIC) and imbalanced data. In this study, we collected five-year electronic health records (EHRs) from a Taiwan hospital database, including 1,420,596 clinical notes, 387,392 laboratory test results, and more than 1,505 laboratory test items, with a focus on pre-training large language models. We proposed a novel Large Language Multimodal Models (LLMMs) framework incorporating multimodal data from clinical notes and laboratory test results for the prediction of chronic disease risk. Our method combined a text embedding encoder and a multi-head attention layer to learn laboratory test values, utilizing a deep neural network (DNN) module to merge blood features with chronic disease semantics into a latent space. In our experiments, we observed that ClinicalBERT and PubMedBERT, when combined with attention fusion, can achieve an accuracy of 73% in multiclass chronic disease and diabetes prediction. By transforming laboratory test values into textual descriptions and employing the Flan-T5 model, we achieved a 76% Area Under the ROC Curve (AUROC), demonstrating the effectiveness of leveraging numerical text data for training and inference in language models. This approach significantly improves the accuracy of early-stage diabetes prediction.
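The attention-fusion step described above, in which text embeddings attend over laboratory features, can be sketched as single-head scaled dot-product attention. Shapes and names here are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax along the last axis.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention_fusion(note_emb, lab_emb):
    # note_emb: (1, d) clinical-note embedding acting as the query.
    # lab_emb:  (n_labs, d) per-lab-test embeddings acting as keys/values.
    d = note_emb.shape[-1]
    scores = note_emb @ lab_emb.T / np.sqrt(d)   # (1, n_labs)
    weights = softmax(scores)                    # attention over lab tests
    fused = weights @ lab_emb                    # (1, d) weighted lab summary
    # Concatenate text and attended lab features into one latent vector.
    return np.concatenate([note_emb, fused], axis=-1)
```

A downstream DNN classifier would then map this concatenated vector to disease-risk logits.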
Sequential Graph Attention Learning for Predicting Dynamic Stock Trends (Student Abstract)
Lai, Tzu-Ya, Cheng, Wen Jung, Ding, Jun-En
The stock market is characterized by a complex relationship between companies and the market. This study combines a sequential graph structure with attention mechanisms to learn global and local information over time. Specifically, our proposed "GAT-AGNN" module compares model performance across multiple industries as well as within single industries. The results show that the proposed framework outperforms the state-of-the-art methods in predicting stock trends across multiple industries on Taiwan Stock datasets.
Hospital transfer risk prediction for COVID-19 patients from a medicalized hotel based on Diffusion GraphSAGE
Ding, Jun-En, Hsu, Chih-Ho, Ling, Kuan-Chia, Chen, Ling, Hung, Fang-Ming
The global COVID-19 pandemic has caused more than six million deaths worldwide. Medicalized hotels were established in Taiwan as quarantine facilities for COVID-19 patients with no or mild symptoms. Due to the limited medical care available at these hotels, it is of paramount importance to identify patients at risk of clinical deterioration. This study aimed to develop and evaluate a graph-based deep learning approach for progressive hospital transfer risk prediction in a medicalized hotel setting. Vital sign measurements were obtained for 632 patients, and daily patient similarity graphs were constructed. Inductive graph convolutional network models were trained on top of the temporally integrated graphs to predict hospital transfer risk. The proposed models achieved AUC scores above 0.83 for hospital transfer risk prediction based on measurements from the past 1, 2, and 3 days, outperforming baseline machine learning methods. A post-hoc analysis of the constructed diffusion-based graph using the Local Clustering Coefficient discovered a high-risk cluster with significantly older mean age, higher body temperature, lower SpO2, and shorter length of stay. Further time-to-hospital-transfer survival analysis also revealed a significant decrease in survival probability in the discovered high-risk cluster. The obtained results demonstrated promising predictability and interpretability of the proposed graph-based approach. This technique may help preemptively detect high-risk patients at community-based medical facilities similar to a medicalized hotel.
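The daily patient similarity graphs described above can be illustrated with a simple k-nearest-neighbor construction over vital-sign vectors. This is a generic sketch under assumed parameters (Euclidean distance, symmetrized kNN), not the paper's diffusion-based construction.

```python
import numpy as np

def build_similarity_graph(vitals, k=2):
    # vitals: (n_patients, n_features) matrix of one day's vital signs.
    # Connect each patient to its k nearest neighbors in Euclidean distance,
    # yielding the adjacency matrix an inductive GCN (e.g. GraphSAGE) consumes.
    n = vitals.shape[0]
    dist = np.linalg.norm(vitals[:, None, :] - vitals[None, :, :], axis=-1)
    adj = np.zeros((n, n), dtype=int)
    for i in range(n):
        neighbors = np.argsort(dist[i])[1:k + 1]  # skip self (distance 0)
        adj[i, neighbors] = 1
    return np.maximum(adj, adj.T)  # symmetrize the directed kNN edges
```

Stacking such graphs over consecutive days gives the temporally integrated structure on which the transfer-risk models are trained.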