Knowledge-enhanced Multimodal ECG Representation Learning with Arbitrary-Lead Inputs
Che Liu, Cheng Ouyang, Zhongwei Wan, Haozhe Wang, Wenjia Bai, Rossella Arcucci
arXiv.org Artificial Intelligence
Recent advances in multimodal ECG representation learning center on aligning ECG signals with paired free-text reports. However, suboptimal alignment persists due to the complexity of medical language and the reliance on a full 12-lead setup, which is often unavailable in under-resourced settings. To tackle these issues, we propose **K-MERL**, a knowledge-enhanced multimodal ECG representation learning framework. **K-MERL** leverages large language models to extract structured knowledge from free-text reports and employs a lead-aware ECG encoder with dynamic lead masking to accommodate arbitrary lead inputs. Evaluations on six external ECG datasets show that **K-MERL** achieves state-of-the-art performance in zero-shot classification and linear probing tasks, while delivering an average **16%** AUC improvement over existing methods in partial-lead zero-shot classification.
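The abstract does not say how the dynamic lead masking is implemented. As a rough illustration of the general idea only, the sketch below zeroes out a random subset of leads in a 12-lead recording so that a downstream encoder would see partial-lead inputs during training. The function name `mask_random_leads`, the `min_keep` parameter, the zero-fill strategy, and the 5000-sample synthetic signal are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np

N_LEADS = 12  # standard 12-lead ECG


def mask_random_leads(ecg, rng, min_keep=1):
    """Zero out a random subset of leads, keeping at least `min_keep`.

    Hypothetical helper for illustration; not the K-MERL implementation.
    ecg: array of shape (n_leads, n_samples).
    Returns the masked copy and a boolean vector marking the kept leads.
    """
    n_leads = ecg.shape[0]
    # Number of leads to keep, drawn uniformly from [min_keep, n_leads].
    n_keep = rng.integers(min_keep, n_leads + 1)
    kept_idx = rng.choice(n_leads, size=n_keep, replace=False)
    keep = np.zeros(n_leads, dtype=bool)
    keep[kept_idx] = True
    # Broadcast the per-lead flag over the time axis; dropped leads become 0.
    masked = np.where(keep[:, None], ecg, 0.0)
    return masked, keep


rng = np.random.default_rng(0)
ecg = rng.standard_normal((N_LEADS, 5000))  # synthetic stand-in for a recording
masked, keep = mask_random_leads(ecg, rng)
```

Returning the `keep` vector alongside the masked signal lets an encoder be told which leads are present, which is one plausible way to make it lead-aware; the paper may well handle this differently.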
Feb-25-2025
- Country:
  - Asia > China (0.28)
  - Europe > United Kingdom
    - England > Oxfordshire > Oxford (0.14)
  - North America > United States (0.28)
- Genre:
  - Research Report (0.82)