Surpassing GPT-4 Medical Coding with a Two-Stage Approach
Yang, Zhichao, Batra, Sanjit Singh, Stremmel, Joel, Halperin, Eran
arXiv.org Artificial Intelligence
Recent advances in large language models (LLMs) show potential for clinical applications, such as clinical decision support and trial recommendations. However, the GPT-4 LLM predicts an excessive number of ICD codes for medical coding tasks, leading to high recall but low precision. To tackle this challenge, we introduce LLM-codex, a two-stage approach to predict ICD codes that first generates evidence proposals using an LLM and then employs an LSTM-based verification stage. The LSTM learns from both the LLM's high recall and human experts' high precision, using a custom loss function. According to experiments on the MIMIC dataset, our model is the only approach that simultaneously achieves state-of-the-art results in medical coding accuracy, accuracy on rare codes, and sentence-level evidence identification to support coding decisions, without training on human-annotated evidence.
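The propose-then-verify idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: every name here (`Proposal`, `two_stage_predict`, `toy_propose`, `toy_verify`) is hypothetical, the keyword proposer and the negation-cue scorer are toy stand-ins for the LLM and the LSTM verifier, the ICD codes are illustrative, and the custom loss function is not shown.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Proposal:
    code: str      # candidate ICD code
    evidence: str  # sentence offered as support for that code


def two_stage_predict(
    sentences: List[str],
    propose: Callable[[List[str]], List[Proposal]],
    verify: Callable[[Proposal], float],
    threshold: float = 0.5,
) -> Dict[str, Tuple[float, str]]:
    """Stage 1 over-generates proposals; stage 2 keeps those the verifier accepts.

    Returns, for each accepted code, its best score and supporting sentence.
    """
    best: Dict[str, Tuple[float, str]] = {}
    for proposal in propose(sentences):
        score = verify(proposal)
        if score < threshold:
            continue  # rejected by the verifier: this is where precision is recovered
        previous = best.get(proposal.code)
        if previous is None or score > previous[0]:
            best[proposal.code] = (score, proposal.evidence)
    return best


# Hypothetical stand-in for the LLM stage: a deliberately loose keyword lookup.
KEYWORD_TO_CODE = {"hypertension": "I10", "pressure": "I10", "diabetes": "E11.9"}


def toy_propose(sentences: List[str]) -> List[Proposal]:
    return [
        Proposal(code, sentence)
        for sentence in sentences
        for keyword, code in KEYWORD_TO_CODE.items()
        if keyword in sentence.lower()
    ]


# Hypothetical stand-in for the LSTM verifier: down-weight negated evidence.
def toy_verify(proposal: Proposal) -> float:
    text = proposal.evidence.lower()
    negated = any(cue in text for cue in ("denies", "negative for"))
    return 0.1 if negated else 0.9


note = [
    "Patient has a history of hypertension.",
    "Denies diabetes.",
    "Blood pressure was elevated on admission.",
]
predictions = two_stage_predict(note, toy_propose, toy_verify)
```

In this toy run the proposer emits three candidates, including a diabetes code from a negated sentence, and the verifier discards that one, so only the hypertension code survives together with the sentence that supports it. Lowering `threshold` to 0.0 keeps every proposal, which mimics the high-recall, low-precision behaviour of using the first stage alone.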
Nov-22-2023
- Country:
  - Asia
    - China > Hong Kong (0.04)
    - Middle East > UAE
      - Abu Dhabi Emirate > Abu Dhabi (0.04)
  - Europe > Ireland
    - Leinster > County Dublin > Dublin (0.04)
  - North America
    - Canada > Ontario
      - Toronto (0.04)
    - Dominican Republic (0.04)
    - United States
      - Louisiana > Orleans Parish
        - New Orleans (0.04)
      - Massachusetts > Hampshire County
        - Amherst (0.14)
      - Minnesota > Hennepin County
        - Minneapolis (0.14)
        - Minnetonka (0.04)
      - New York (0.04)
      - Washington > King County
        - Seattle (0.14)
  - Oceania > Australia
- Genre:
  - Research Report > New Finding (1.00)
- Industry:
  - Health & Medicine
    - Health Care Providers & Services (1.00)
    - Health Care Technology > Medical Record (0.69)
    - Therapeutic Area
      - Cardiology/Vascular Diseases (0.68)
      - Hematology (0.47)
      - Oncology (0.46)
- Technology: