Secondary Use of Clinical Problem List Entries for Neural Network-Based Disease Code Assignment

Kreuzthaler, Markus, Pfeifer, Bastian, Kramer, Diether, Schulz, Stefan

May-19-2023–arXiv.org Artificial Intelligence

Clinical information systems have become large repositories of semi-structured, partly annotated electronic health record data. These repositories have reached a critical mass that makes them attractive for supervised, data-driven neural network approaches. We explored automated coding of 50-character-long clinical problem list entries using the International Classification of Diseases (ICD-10) and evaluated three types of network architecture on the top 100 ICD-10 three-digit codes. A fastText baseline reached a macro-averaged F1-score of 0.83, followed by a character-level LSTM with a macro-averaged F1-score of 0.84. The top-performing approach, a RoBERTa model with a custom language model fine-tuned for the downstream coding task, yielded a macro-averaged F1-score of 0.88. A neural network activation analysis, together with an investigation of the false positives and false negatives, revealed inconsistent manual coding as a main limiting factor.
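For readers unfamiliar with the metric, the macro-averaged F1-score reported above is the unweighted mean of the per-class F1-scores, so rare codes count as much as frequent ones. The following is a minimal sketch of that computation, not the authors' evaluation code; the function names `per_class_f1` and `macro_f1` and the toy labels are assumptions made for illustration only.

```python
from collections import Counter


def per_class_f1(gold, pred):
    """Return {label: F1} for every label seen in gold or pred."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1          # correct assignment of code g
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[g] += 1          # g was the true code but was missed
    scores = {}
    for label in sorted(set(gold) | set(pred)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(gold, pred):
    """Unweighted mean of the per-class F1-scores."""
    scores = per_class_f1(gold, pred)
    return sum(scores.values()) / len(scores)


# Toy data (hypothetical, not from the paper): three ICD-10 three-character codes.
gold = ["E11", "E11", "I10", "I10", "J18"]
pred = ["E11", "I10", "I10", "I10", "J18"]
print(per_class_f1(gold, pred))
print(round(macro_f1(gold, pred), 3))
```

Because every class contributes equally, a single poorly coded class lowers the macro average more than it would lower a micro-averaged or accuracy-style score.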

machine learning, macro-averaged f1-score, natural language

Country:
- Europe > Austria (0.16)

Genre:
- Research Report (0.83)

Industry:
- Health & Medicine
  - Health Care Providers & Services (0.50)
  - Health Care Technology > Medical Record (0.55)
  - Therapeutic Area > Endocrinology
    - Diabetes (0.34)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
  - Natural Language (1.00)
