LLM-Match: An Open-Sourced Patient Matching Model Based on Large Language Models and Retrieval-Augmented Generation
Li, Xiaodi, Chowdhury, Shaika, Wi, Chung Il, Vassilaki, Maria, Liu, Ken, Sio, Terence T, Garrick, Owen, Juhn, Young J, Cerhan, James R, Tao, Cui, Zong, Nansu
arXiv.org Artificial Intelligence
Patient matching is the process of linking patients to appropriate clinical trials by accurately identifying and matching their medical records with trial eligibility criteria. We propose LLM-Match, a novel framework for patient matching that leverages fine-tuned open-source large language models. Our approach consists of four key components. First, a retrieval-augmented generation (RAG) module extracts relevant patient context from a vast pool of electronic health records (EHRs). Second, a prompt generation module constructs input prompts by integrating trial eligibility criteria (both inclusion and exclusion criteria), patient context, and system instructions. Third, a fine-tuning module with a classification head optimizes the model parameters using structured prompts and ground-truth labels. Fourth, an evaluation module assesses the fine-tuned model's performance on the test datasets. We evaluated LLM-Match on four open datasets (n2c2, SIGIR, TREC 2021, and TREC 2022) using open-source models, comparing it against TrialGPT, Zero-Shot, and GPT-4-based closed models. LLM-Match outperformed all baselines.
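The first two components described in the abstract (retrieving patient context, then assembling a prompt) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `retrieve_context` and `build_prompt`, the prompt layout, and the bag-of-words cosine similarity used for retrieval are all assumptions chosen so the example runs with the standard library alone. A real RAG module would more plausibly use dense embeddings, and the fine-tuning and evaluation components are not shown.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve_context(notes, criteria, k=2):
    """Return the k notes most similar to the criteria text.

    Assumption: token-overlap similarity stands in for whatever
    retriever the paper actually uses.
    """
    query = Counter(tokenize(criteria))
    ranked = sorted(
        notes,
        key=lambda note: cosine(Counter(tokenize(note)), query),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(inclusion, exclusion, context, instructions):
    """Join instructions, criteria, and patient context into one prompt.

    Assumption: this section layout is hypothetical, not the paper's.
    """
    sections = [
        instructions,
        "Inclusion criteria:\n" + inclusion,
        "Exclusion criteria:\n" + exclusion,
        "Patient context:\n" + "\n".join(context),
    ]
    return "\n\n".join(sections)


# Hypothetical patient notes and criteria, for illustration only.
notes = [
    "Patient has type 2 diabetes treated with metformin.",
    "Family history of glaucoma.",
    "No known drug allergies.",
]
inclusion = "Adults with type 2 diabetes on metformin"
exclusion = "Pregnancy"
context = retrieve_context(notes, inclusion, k=1)
prompt = build_prompt(
    inclusion, exclusion, context, "Decide whether the patient is eligible."
)
```

In this sketch the resulting `prompt` string would be paired with a ground-truth eligibility label and passed to a classifier during fine-tuning; keeping retrieval and prompt assembly as separate functions mirrors the modular split the abstract describes.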
Mar-18-2025
- Country:
- Asia > Taiwan
- North America > United States
- Minnesota > Olmsted County
- Rochester (0.05)
- Wisconsin > La Crosse County
- La Crosse (0.04)
- Genre:
- Research Report
- Experimental Study (1.00)
- New Finding (1.00)