Improving Drug Identification in Overdose Death Surveillance using Large Language Models
Funnell, Arthur J., Petousis, Panayiotis, Harel-Canada, Fabrice, Romero, Ruby, Bui, Alex A. T., Koncsol, Adam, Chaturvedi, Hritika, Shover, Chelsea, Goodman-Meza, David
arXiv.org Artificial Intelligence
The rising rate of drug-related deaths in the United States, largely driven by fentanyl, demands timely and accurate surveillance. However, critical overdose data are often buried in free-text coroner reports, and coding them into International Classification of Diseases, 10th Revision (ICD-10) categories causes delays and information loss. Natural language processing (NLP) models may automate and enhance overdose surveillance, but prior applications have been limited. A dataset of 35,433 death records from multiple U.S. jurisdictions in 2020 was used for model training and internal testing. External validation was conducted using a separate, novel dataset of 3,335 records from 2023-2024. Multiple NLP approaches were evaluated for classifying specific drug involvement from unstructured death certificate text: traditional single- and multi-label classifiers; fine-tuned encoder-only language models such as Bidirectional Encoder Representations from Transformers (BERT) and BioClinicalBERT; and contemporary decoder-only large language models such as Qwen 3 and Llama 3. Model performance was assessed using macro-averaged F1 scores, with 95% confidence intervals calculated to quantify uncertainty. Fine-tuned BioClinicalBERT models achieved near-perfect performance, with macro F1 scores ≥0.998 on the internal test set. External validation confirmed their robustness (macro F1 = 0.966); they outperformed conventional machine learning, general-domain BERT models, and various decoder-only large language models. NLP models, particularly fine-tuned clinical variants such as BioClinicalBERT, offer a highly accurate and scalable solution for classifying overdose deaths from free-text reports. These methods can significantly accelerate surveillance workflows, overcoming the limitations of manual ICD-10 coding and supporting near real-time detection of emerging substance use trends.
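The evaluation metric named in the abstract can be illustrated with a small sketch. The abstract does not say how the 95% confidence intervals were obtained, so the percentile bootstrap below is an assumption, not the authors' procedure. The drug labels, the toy records, and the function names (`f1_for_label`, `macro_f1`, `bootstrap_ci`) are hypothetical and chosen only for illustration.

```python
import random

# Hypothetical label order for illustration: fentanyl, methamphetamine, cocaine.
# Each row is one death record; 1 means the drug is involved, 0 means it is not.
TRUE_ROWS = [[1, 0, 0], [1, 1, 0], [0, 1, 0], [0, 0, 1], [1, 0, 1], [0, 1, 1]]
PRED_ROWS = [[1, 0, 0], [1, 1, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0], [0, 1, 1]]


def f1_for_label(y_true, y_pred):
    """F1 for one binary label: 2*TP / (2*TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    # Convention assumed here: a label with no positives at all scores 0.0.
    return 2 * tp / denom if denom else 0.0


def macro_f1(true_rows, pred_rows):
    """Unweighted mean of the per-label F1 scores."""
    n_labels = len(true_rows[0])
    scores = [
        f1_for_label([r[j] for r in true_rows], [r[j] for r in pred_rows])
        for j in range(n_labels)
    ]
    return sum(scores) / n_labels


def bootstrap_ci(true_rows, pred_rows, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for macro F1 (assumed method)."""
    rng = random.Random(seed)
    n = len(true_rows)
    stats = []
    for _ in range(n_boot):
        # Resample records with replacement, keeping true/pred pairs aligned.
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(
            macro_f1([true_rows[i] for i in idx], [pred_rows[i] for i in idx])
        )
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


if __name__ == "__main__":
    print("macro F1:", macro_f1(TRUE_ROWS, PRED_ROWS))
    print("95% CI:", bootstrap_ci(TRUE_ROWS, PRED_ROWS))
```

On the toy data, two labels are predicted perfectly and the third misses one positive (F1 = 0.8), so the macro average is 2.8/3 ≈ 0.933. Because macro averaging weights every label equally, a rarely mentioned drug affects the score as much as a common one.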
Jul-18-2025
- Country:
  - North America
    - Canada (0.05)
    - United States
      - Alabama > Jefferson County (0.04)
      - California
        - Los Angeles County > Los Angeles (0.29)
        - San Diego County > San Diego (0.04)
      - Connecticut
        - Fairfield County > Fairfield (0.04)
        - Hartford County > Hartford (0.14)
        - New Haven County > New Haven (0.14)
        - New London County > New London (0.04)
      - Illinois > Cook County (0.04)
      - New Jersey (0.04)
      - North Carolina (0.04)
      - Texas
        - Denton County (0.04)
        - Duval County > San Diego (0.04)
        - Johnson County (0.04)
        - Marion County > Jefferson (0.04)
        - Parker County (0.04)
        - Tarrant County (0.04)
      - Wisconsin > Milwaukee County
        - Milwaukee (0.04)
  - Oceania > Australia
    - New South Wales (0.04)
- Genre:
  - Research Report
    - Experimental Study (1.00)
    - New Finding (1.00)
- Industry:
- Technology: