OpenECG: Benchmarking ECG Foundation Models with Public 1.2 Million Records
Wan, Zhijiang, Yu, Qianhao, Mao, Jia, Duan, Wenfeng, Ding, Cheng
arXiv.org Artificial Intelligence
This study introduces OpenECG, a large-scale benchmark of 1.2 million 12-lead ECG recordings from nine centers, to evaluate ECG foundation models (ECG-FMs) trained on public datasets. We investigate three self-supervised learning methods (SimCLR, BYOL, MAE) with ResNet-50 and Vision Transformer architectures, assessing model generalization through leave-one-dataset-out experiments and data scaling analysis. Results show that pre-training on diverse datasets significantly improves generalization, with BYOL and MAE outperforming SimCLR, highlighting the efficacy of feature-consistency and generative learning over contrastive approaches. Data scaling experiments reveal that performance saturates at 60-70% of total data for BYOL and MAE, while SimCLR requires more data. These findings demonstrate that publicly available ECG data can match or surpass proprietary datasets in training robust ECG-FMs, paving the way for scalable, clinically meaningful AI-driven ECG analysis.

Electrocardiography (ECG) is a fundamental tool for diagnosing cardiovascular diseases (CVDs), which are among the leading causes of mortality worldwide. ECG enables clinicians to detect arrhythmias, myocardial infarction, and other heart conditions (Moreno-Sánchez et al. 2024). Despite its importance, several challenges hinder the effective use of ECG in clinical practice. First, diagnostic accuracy can differ significantly among cardiologists due to varying levels of training and experience.
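The leave-one-dataset-out evaluation mentioned in the abstract can be illustrated with a minimal sketch. This excerpt does not describe the authors' actual pipeline, so the following only shows the general splitting idea: each dataset is held out once for evaluation while the remaining datasets form the pre-training pool. The function name `leave_one_dataset_out`, the dataset names, and the record IDs are all hypothetical placeholders.

```python
from typing import Dict, Iterator, List, Tuple


def leave_one_dataset_out(
    datasets: Dict[str, List[str]],
) -> Iterator[Tuple[str, List[str], List[str]]]:
    """Yield (held_out_name, pretrain_ids, eval_ids), one fold per dataset.

    Each dataset is held out exactly once; the records of all other
    datasets are pooled for pre-training in that fold.
    """
    for held_out in datasets:
        pretrain_ids = [
            record_id
            for name, ids in datasets.items()
            if name != held_out
            for record_id in ids
        ]
        yield held_out, pretrain_ids, list(datasets[held_out])


# Hypothetical toy data: names and record IDs are placeholders,
# not the nine centers used in the paper.
toy = {
    "center_a": ["a1", "a2"],
    "center_b": ["b1"],
    "center_c": ["c1", "c2", "c3"],
}
folds = list(leave_one_dataset_out(toy))

for held_out, pretrain_ids, eval_ids in folds:
    print(held_out, len(pretrain_ids), len(eval_ids))
```

In a real experiment, a model would be pre-trained on `pretrain_ids` and then evaluated on `eval_ids`, so that the held-out dataset never contributes to pre-training for its own fold.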
Mar-1-2025
- Country:
  - North America > United States (0.04)
  - South America > Brazil
    - Minas Gerais (0.04)
  - Asia > China
    - Jiangxi Province > Nanchang (0.05)
    - Zhejiang Province > Ningbo (0.04)
- Genre:
  - Research Report > New Finding (1.00)