Knowledge-enhanced Multimodal ECG Representation Learning with Arbitrary-Lead Inputs

Liu, Che; Ouyang, Cheng; Wan, Zhongwei; Wang, Haozhe; Bai, Wenjia; Arcucci, Rossella

Feb-25-2025–arXiv.org Artificial Intelligence

Recent advances in multimodal ECG representation learning center on aligning ECG signals with paired free-text reports. However, suboptimal alignment persists due to the complexity of medical language and the reliance on a full 12-lead setup, which is often unavailable in under-resourced settings. To tackle these issues, we propose **K-MERL**, a knowledge-enhanced multimodal ECG representation learning framework. **K-MERL** leverages large language models to extract structured knowledge from free-text reports and employs a lead-aware ECG encoder with dynamic lead masking to accommodate arbitrary lead inputs. Evaluations on six external ECG datasets show that **K-MERL** achieves state-of-the-art performance in zero-shot classification and linear probing tasks, while delivering an average **16%** AUC improvement over existing methods in partial-lead zero-shot classification.
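To make the idea of accommodating arbitrary lead inputs concrete, here is a minimal sketch of random lead masking applied to a 12-lead recording. It is an illustration under stated assumptions, not the K-MERL implementation: the function name `mask_random_leads`, the `min_keep` parameter, and the choice to zero-fill masked leads are all hypothetical, and the abstract does not say how the lead-aware encoder represents missing leads.

```python
import random

# Standard 12-lead ordering, used here only for readable output.
LEAD_NAMES = ["I", "II", "III", "aVR", "aVL", "aVF",
              "V1", "V2", "V3", "V4", "V5", "V6"]


def mask_random_leads(ecg, min_keep=1, rng=None):
    """Keep a random subset of leads and zero-fill the rest.

    ecg: list of per-lead sample lists (one inner list per lead).
    min_keep: smallest number of leads that must survive masking.
    Returns (masked_ecg, kept_flags), where kept_flags[i] is True
    if lead i was kept. The input is not modified.

    Assumption: zero-filling is one simple way to mark a lead as
    missing; the paper may use a different mechanism.
    """
    rng = rng or random.Random()
    n_leads = len(ecg)
    if not 1 <= min_keep <= n_leads:
        raise ValueError("min_keep must be between 1 and the number of leads")

    # Draw how many leads to keep, then which ones.
    n_keep = rng.randint(min_keep, n_leads)
    kept = set(rng.sample(range(n_leads), n_keep))
    kept_flags = [i in kept for i in range(n_leads)]

    masked = [
        list(lead) if keep else [0.0] * len(lead)
        for lead, keep in zip(ecg, kept_flags)
    ]
    return masked, kept_flags


if __name__ == "__main__":
    # Toy recording: 12 leads, 4 samples each.
    toy_ecg = [[float(i + 1)] * 4 for i in range(12)]
    masked, kept = mask_random_leads(toy_ecg, min_keep=3)
    print("kept leads:", [n for n, k in zip(LEAD_NAMES, kept) if k])
```

Drawing a fresh mask for each training sample would expose an encoder to many lead combinations, which is the general motivation behind this kind of masking for partial-lead inference.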

large language model, machine learning, natural language


Country:
- Asia > China (0.28)
- Europe > United Kingdom
  - England > Oxfordshire > Oxford (0.14)
- North America > United States (0.28)

Genre:
- Research Report (0.82)

Industry:
- Health & Medicine
  - Diagnostic Medicine (1.00)
  - Health Care Technology (1.00)
  - Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks (0.93)
  - Natural Language > Large Language Model (1.00)