OpenECG: Benchmarking ECG Foundation Models with Public 1.2 Million Records

Wan, Zhijiang, Yu, Qianhao, Mao, Jia, Duan, Wenfeng, Ding, Cheng

arXiv.org Artificial Intelligence 

This study introduces OpenECG, a large-scale benchmark of 1.2 million 12-lead ECG recordings from nine centers, to evaluate ECG foundation models (ECG-FMs) trained on public datasets. We investigate three self-supervised learning methods (SimCLR, BYOL, MAE) with ResNet-50 and Vision Transformer architectures, assessing model generalization through leave-one-dataset-out experiments and data scaling analysis. Results show that pre-training on diverse datasets significantly improves generalization, with BYOL and MAE outperforming SimCLR, highlighting the efficacy of feature-consistency and generative learning over contrastive approaches. Data scaling experiments reveal that performance saturates at 60-70% of total data for BYOL and MAE, while SimCLR requires more data. These findings demonstrate that publicly available ECG data can match or surpass proprietary datasets in training robust ECG-FMs, paving the way for scalable, clinically meaningful AI-driven ECG analysis.

Electrocardiography (ECG) is a fundamental tool for diagnosing cardiovascular diseases (CVDs), which are among the leading causes of mortality worldwide. ECG enables clinicians to detect arrhythmias, myocardial infarction, and other heart conditions (Moreno-Sánchez et al. 2024). Despite its importance, several challenges hinder the effective utilization of ECG in clinical practice: First, the diagnostic accuracy can differ significantly among cardiologists due to varying levels of training and experience.
