MedSlice: Fine-Tuned Large Language Models for Secure Clinical Note Sectioning
Davis, Joshua, Sounack, Thomas, Sciacca, Kate, Brain, Jessie M, Durieux, Brigitte N, Agaronnik, Nicole D, Lindvall, Charlotta
arXiv.org Artificial Intelligence
Extracting sections from clinical notes is crucial for downstream analysis but challenging due to variability in formatting and the labor-intensive nature of manual sectioning. While proprietary large language models (LLMs) have shown promise, privacy concerns limit their accessibility. This study develops a pipeline for automated note sectioning using open-source LLMs, focusing on three sections: History of Present Illness, Interval History, and Assessment and Plan. We fine-tuned three open-source LLMs to extract sections from a curated dataset of 487 progress notes, comparing results against proprietary models (GPT-4o, GPT-4o mini). Internal and external validity were assessed via precision, recall, and F1 score. Fine-tuned Llama 3.1 8B outperformed GPT-4o (F1 = 0.92). On the external validity test set, performance remained high (F1 = 0.85). Fine-tuned open-source LLMs can surpass proprietary models in clinical note sectioning, offering advantages in cost, performance, and accessibility.
Jan-23-2025
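The abstract reports section-extraction quality as precision, recall, and F1. A common way to score an extracted section span against a gold annotation is token-level overlap; the sketch below is a minimal illustration of that scoring scheme, not the paper's exact evaluation code, and the example note text is invented.

```python
from collections import Counter


def token_prf(pred: str, gold: str) -> tuple[float, float, float]:
    """Token-overlap precision, recall, and F1 between a predicted
    section span and a gold span (multiset overlap of lowercased
    whitespace tokens)."""
    pred_toks = Counter(pred.lower().split())
    gold_toks = Counter(gold.lower().split())
    overlap = sum((pred_toks & gold_toks).values())  # shared tokens
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / sum(pred_toks.values())
    recall = overlap / sum(gold_toks.values())
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: a model extracts a truncated "Assessment and
# Plan" span, so precision is perfect but recall is penalized.
gold = "continue antibiotics and follow up in two weeks"
pred = "continue antibiotics and follow up"
p, r, f = token_prf(pred, gold)  # p = 1.0, r = 0.625
```

A span-level exact-match metric would be stricter; token overlap gives partial credit when a model truncates or slightly over-extends a section boundary, which matches how sectioning errors typically manifest.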