MIMIC-SR-ICD11: A Dataset for Narrative-Based Diagnosis
Yuexin Wu, Shiqi Wang, Vasile Rus
arXiv.org Artificial Intelligence
Disease diagnosis is a central pillar of modern healthcare, enabling early detection and timely intervention for acute conditions while guiding lifestyle adjustments and medication regimens to prevent or slow chronic disease. Self-reports preserve clinically salient signals that templated electronic health record (EHR) documentation often attenuates or omits, especially subtle but consequential details. To operationalize a shift toward narrative-based diagnosis, we introduce MIMIC-SR-ICD11, a large English diagnostic dataset built from EHR discharge notes and natively aligned to WHO ICD-11 terminology. We further present LL-Rank, a likelihood-based re-ranking framework that computes a length-normalized joint likelihood of each label given the clinical report context and subtracts the corresponding report-free prior likelihood for that label. Across seven model backbones, LL-Rank consistently outperforms a strong generation-plus-mapping baseline (GenMap). Ablation experiments show that LL-Rank's gains stem primarily from its PMI-based scoring, which isolates semantic compatibility from label frequency bias.
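The scoring rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that "length-normalized" means the mean per-token log-probability, and that the per-token log-probabilities of each label (with and without the report as context) have already been obtained from a language model. The function names (`length_normalized`, `pmi_score`, `rank_labels`), the label names, and all numbers are hypothetical.

```python
from typing import Dict, List, Tuple


def length_normalized(token_logprobs: List[float]) -> float:
    """Mean per-token log-likelihood of a label (assumed normalization)."""
    if not token_logprobs:
        raise ValueError("a label must contain at least one token")
    return sum(token_logprobs) / len(token_logprobs)


def pmi_score(cond_logprobs: List[float], prior_logprobs: List[float]) -> float:
    """PMI-style score: report-conditioned likelihood minus report-free prior."""
    return length_normalized(cond_logprobs) - length_normalized(prior_logprobs)


def rank_labels(
    candidates: Dict[str, Tuple[List[float], List[float]]]
) -> List[Tuple[str, float]]:
    """Sort candidate labels by PMI-style score, best first."""
    scored = [(label, pmi_score(cond, prior)) for label, (cond, prior) in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy, made-up token log-probabilities: (given the report, without the report).
toy = {
    # Frequent label: likely with or without the report, so the report adds little.
    "common_label": ([-0.5, -0.7], [-0.4, -0.6]),
    # Rare label: unlikely a priori, but much more likely once the report is seen.
    "rare_label": ([-1.0, -1.2], [-3.0, -3.4]),
}

ranked = rank_labels(toy)
for label, score in ranked:
    print(f"{label}: {score:.2f}")
```

In this toy example the rare label ranks first even though its report-conditioned likelihood is lower than the common label's, because subtracting the prior removes the advantage that frequent labels have regardless of the report.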
Nov-10-2025
- Country:
- Asia
- China > Guangdong Province
- Guangzhou (0.04)
- Middle East
- Israel (0.04)
- UAE > Abu Dhabi Emirate
- Abu Dhabi (0.14)
- North America > United States
- New York > New York County > New York City (0.04)
- Oceania > Australia
- Genre:
- Research Report
- Experimental Study (0.46)
- New Finding (0.46)
- Industry:
- Health & Medicine
- Diagnostic Medicine (1.00)
- Health Care Providers & Services (1.00)
- Health Care Technology > Medical Record (1.00)
- Therapeutic Area
- Psychiatry/Psychology (0.93)
- Endocrinology (0.68)
- Pulmonary/Respiratory Diseases (1.00)
- Nephrology (1.00)
- Neurology (0.93)
- Immunology (1.00)
- Cardiology/Vascular Diseases (1.00)
- Hematology (0.68)
- Gastroenterology (1.00)
- Infections and Infectious Diseases (1.00)
- Technology: