OpenECG: Benchmarking ECG Foundation Models with Public 1.2 Million Records

Wan, Zhijiang, Yu, Qianhao, Mao, Jia, Duan, Wenfeng, Ding, Cheng

Mar-1-2025, arXiv.org Artificial Intelligence

This study introduces OpenECG, a large-scale benchmark of 1.2 million 12-lead ECG recordings from nine centers, to evaluate ECG foundation models (ECG-FMs) trained on public datasets. We investigate three self-supervised learning methods (SimCLR, BYOL, MAE) with ResNet-50 and Vision Transformer architectures, assessing model generalization through leave-one-dataset-out experiments and data scaling analysis. Results show that pre-training on diverse datasets significantly improves generalization, with BYOL and MAE outperforming SimCLR, highlighting the efficacy of feature-consistency and generative learning over contrastive approaches. Data scaling experiments reveal that performance saturates at 60-70% of total data for BYOL and MAE, while SimCLR requires more data. These findings demonstrate that publicly available ECG data can match or surpass proprietary datasets in training robust ECG-FMs, paving the way for scalable, clinically meaningful AI-driven ECG analysis.

Electrocardiography (ECG) is a fundamental tool for diagnosing cardiovascular diseases (CVDs), which are among the leading causes of mortality worldwide. ECG enables clinicians to detect arrhythmias, myocardial infarction, and other heart conditions (Moreno-Sánchez et al. 2024). Despite its importance, several challenges hinder the effective utilization of ECG in clinical practice: First, the diagnostic accuracy can differ significantly among cardiologists due to varying levels of training and experience.
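The leave-one-dataset-out evaluation named in the abstract can be illustrated with a minimal sketch. This excerpt does not give the paper's exact protocol, so the code assumes the common reading: pre-train on all datasets except one, then evaluate on the held-out dataset. The function name `leave_one_dataset_out`, the dataset names, and the record IDs are hypothetical and used only for illustration.

```python
from typing import Dict, Iterator, List, Tuple


def leave_one_dataset_out(
    datasets: Dict[str, List[str]],
) -> Iterator[Tuple[str, List[str], List[str]]]:
    """Yield (held_out_name, pretrain_records, eval_records) per dataset.

    Assumption: each dataset is held out exactly once; the remaining
    datasets are pooled for pre-training.
    """
    for held_out in datasets:
        # Pool records from every dataset except the held-out one.
        pretrain = [
            rec
            for name, recs in datasets.items()
            if name != held_out
            for rec in recs
        ]
        yield held_out, pretrain, list(datasets[held_out])


# Hypothetical dataset names and record IDs, for illustration only.
toy = {
    "center_a": ["a1", "a2"],
    "center_b": ["b1"],
    "center_c": ["c1", "c2", "c3"],
}
splits = list(leave_one_dataset_out(toy))
```

Each split keeps the held-out records disjoint from the pre-training pool, which is what lets the evaluation measure generalization to an unseen center.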

dataset, ecg foundation model, foundation model

Country:
- North America > United States (0.04)
- South America > Brazil
  - Minas Gerais (0.04)
- Asia > China
  - Jiangxi Province > Nanchang (0.05)
  - Zhejiang Province > Ningbo (0.04)

Genre:
- Research Report > New Finding (1.00)

Industry:
- Health & Medicine
  - Therapeutic Area > Cardiology/Vascular Diseases (1.00)
  - Diagnostic Medicine (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
