Goto

Collaborating Authors

 fibrillation


Motion-Robust Multimodal Fusion of PPG and Accelerometer Signals for Three-Class Heart Rhythm Classification

Zhao, Yangyang, Kaisti, Matti, Lahdenoja, Olli, Koivisto, Tero

arXiv.org Artificial Intelligence

Atrial fibrillation (AF) is a leading cause of stroke and mortality, particularly in elderly patients. Wrist-worn photoplethysmography (PPG) enables non-invasive, continuous rhythm monitoring, yet suffers from significant vulnerability to motion artifacts and physiological noise. Many existing approaches rely solely on single-channel PPG and are limited to binary AF detection, often failing to capture the broader range of arrhythmias encountered in clinical settings. We introduce RhythmiNet, a residual neural network enhanced with temporal and channel attention modules that jointly leverage PPG and accelerometer (ACC) signals. The model performs three-class rhythm classification: AF, sinus rhythm (SR), and Other. To assess robustness across varying movement conditions, test data are stratified by accelerometer-based motion intensity percentiles without excluding any segments. RhythmiNet achieved a 4.3% improvement in macro-AUC over the PPG-only baseline. In addition, performance surpassed a logistic regression model based on handcrafted HRV features by 12%, highlighting the benefit of multimodal fusion and attention-based learning in noisy, real-world clinical data.


CLARAE: Clarity Preserving Reconstruction AutoEncoder for Denoising and Rhythm Classification of Intracardiac Electrograms

Lin, Long, Peiro-Corbacho, Pablo, Ávila, Pablo, Carta-Bergaz, Alejandro, Arenal, Ángel, Ríos-Muñoz, Gonzalo R., Sevilla-Salcedo, Carlos

arXiv.org Artificial Intelligence

Intracavitary atrial electrograms (EGMs) provide high-resolution insights into cardiac electrophysiology but are often contaminated by noise and remain high-dimensional, limiting real-time analysis. We introduce CLARAE (CLArity-preserving Reconstruction AutoEncoder), a one-dimensional encoder--decoder designed for atrial EGMs, which achieves both high-fidelity reconstruction and a compact 64-dimensional latent representation. CLARAE is designed to preserve waveform morphology, mitigate reconstruction artifacts, and produce interpretable embeddings through three principles: downsampling with pooling, a hybrid interpolation--convolution upsampling path, and a bounded latent space. We evaluated CLARAE on 495,731 EGM segments (unipolar and bipolar) from 29 patients across three rhythm types (AF, SR300, SR600). Performance was benchmarked against six state-of-the-art autoencoders using reconstruction metrics, rhythm classification, and robustness across signal-to-noise ratios from -5 to 15 dB. In downstream rhythm classification, CLARAE achieved F1-scores above 0.97 for all rhythm types, and its latent space showed clear clustering by rhythm. In denoising tasks, it consistently ranked among the top performers for both unipolar and bipolar signals. In order to promote reproducibility and enhance accessibility, we offer an interactive web-based application. This platform enables users to explore pre-trained CLARAE models, visualize the reconstructions, and compute metrics in real time. Overall, CLARAE combines robust denoising with compact, discriminative representations, offering a practical foundation for clinical workflows such as rhythm discrimination, signal quality assessment, and real-time mapping.


Reconstructing 12-Lead ECG from 3-Lead ECG using Variational Autoencoder to Improve Cardiac Disease Detection of Wearable ECG Devices

Guan, Xinyan, Lai, Yongfan, Jin, Jiarui, Li, Jun, Wang, Haoyu, Zhao, Qinghao, Zhang, Deyun, Geng, Shijia, Hong, Shenda

arXiv.org Artificial Intelligence

Twelve-lead electrocardiograms (ECGs) are the clinical gold standard for cardiac diagnosis, providing comprehensive spatial coverage of the heart necessary to detect conditions such as myocardial infarction (MI). However, their lack of portability limits continuous and large-scale use. Three-lead ECG systems are widely used in wearable devices due to their simplicity and mobility, but they often fail to capture pathologies in unmeasured regions. To address this, we propose WearECG, a Variational Autoencoder (VAE) method that reconstructs twelve-lead ECGs from three leads: II, V1, and V5. Our model includes architectural improvements to better capture temporal and spatial dependencies in ECG signals. We evaluate generation quality using MSE, MAE, and Frechet Inception Distance (FID), and assess clinical validity via a Turing test with expert cardiologists. To further validate diagnostic utility, we fine-tune ECGFounder, a large-scale pretrained ECG model, on a multi-label classification task involving over 40 cardiac conditions, including six different myocardial infarction locations, using both real and generated signals. Experiments on the MIMIC dataset show that our method produces physiologically realistic and diagnostically informative signals, with robust performance in downstream tasks. This work demonstrates the potential of generative modeling for ECG reconstruction and its implications for scalable, low-cost cardiac screening.


DeepBoost-AF: A Novel Unsupervised Feature Learning and Gradient Boosting Fusion for Robust Atrial Fibrillation Detection in Raw ECG Signals

Jafari, Alireza, Yousefirizi, Fereshteh, Seydi, Vahid

arXiv.org Artificial Intelligence

Atrial fibrillation (AF) is a prevalent cardiac arrhythmia associated with elevated health risks, where timely detection is pivotal for mitigating stroke-related morbidity. This study introduces an innovative hybrid methodology integrating unsupervised deep learning and gradient boosting models to improve AF detection. A 19-layer deep convolutional autoencoder (DCAE) is coupled with three boosting classifiers-AdaBoost, XGBoost, and LightGBM (LGBM)-to harness their complementary advantages while addressing individual limitations. The proposed framework uniquely combines DCAE with gradient boosting, enabling end-to-end AF identification devoid of manual feature extraction. The DCAE-LGBM model attains an F1-score of 95.20%, sensitivity of 99.99%, and inference latency of four seconds, outperforming existing methods and aligning with clinical deployment requirements. The DCAE integration significantly enhances boosting models, positioning this hybrid system as a reliable tool for automated AF detection in clinical settings.


Masked Training for Robust Arrhythmia Detection from Digitalized Multiple Layout ECG Images

Zhang, Shanwei, Zhang, Deyun, Tao, Yirao, Wang, Kexin, Geng, Shijia, Li, Jun, Zhao, Qinghao, Liu, Xingpeng, Zhou, Yuxi, Hong, Shenda

arXiv.org Artificial Intelligence

Electrocardiogram (ECG) as an important tool for diagnosing cardiovascular diseases such as arrhythmia. Due to the differences in ECG layouts used by different hospitals, the digitized signals exhibit asynchronous lead time and partial blackout loss, which poses a serious challenge to existing models. To address this challenge, the study introduced PatchECG, a framework for adaptive variable block count missing representation learning based on a masking training strategy, which automatically focuses on key patches with collaborative dependencies between leads, thereby achieving key recognition of arrhythmia in ECGs with different layouts. Experiments were conducted on the PTB-XL dataset and 21388 asynchronous ECG images generated using ECG image kit tool, using the 23 Subclasses as labels. The proposed method demonstrated strong robustness under different layouts, with average Area Under the Receiver Operating Characteristic Curve (AUROC) of 0.835 and remained stable (unchanged with layout changes). In external validation based on 400 real ECG images data from Chaoyang Hospital, the AUROC for atrial fibrillation diagnosis reached 0.778; On 12 x 1 layout ECGs, AUROC reaches 0.893. This result is superior to various classic interpolation and baseline methods, and compared to the current optimal large-scale pre-training model ECGFounder, it has improved by 0.111 and 0.19.


SOFA: Deep Learning Framework for Simulating and Optimizing Atrial Fibrillation Ablation

Chung, Yunsung, Lim, Chanho, Bidaoui, Ghassan, Massad, Christian, Marrouche, Nassir, Hamm, Jihun

arXiv.org Artificial Intelligence

Atrial fibrillation (AF) is a prevalent cardiac arrhythmia often treated with catheter ablation procedures, but procedural outcomes are highly variable. Evaluating and improving ablation efficacy is challenging due to the complex interaction between patient-specific tissue and procedural factors. This paper asks two questions: Can AF recurrence be predicted by simulating the effects of procedural parameters? How should we ablate to reduce AF recurrence? We propose SOFA (Simulating and Optimizing Atrial Fibrillation Ablation), a novel deep-learning framework that addresses these questions. SOFA first simulates the outcome of an ablation strategy by generating a post-ablation image depicting scar formation, conditioned on a patient's pre-ablation LGE-MRI and the specific procedural parameters used (e.g., ablation locations, duration, temperature, power, and force). During this simulation, it predicts AF recurrence risk. Critically, SOFA then introduces an optimization scheme that refines these procedural parameters to minimize the predicted risk. Our method leverages a multi-modal, multi-view generator that processes 2.5D representations of the atrium. Quantitative evaluations show that SOFA accurately synthesizes post-ablation images and that our optimization scheme leads to a 22.18\% reduction in the model-predicted recurrence risk. To the best of our knowledge, SOFA is the first framework to integrate the simulation of procedural effects, recurrence prediction, and parameter optimization, offering a novel tool for personalizing AF ablation.


Online Graph Learning via Time-Vertex Adaptive Filters: From Theory to Cardiac Fibrillation

Jenkins, Alexander, Variddhisai, Thiernithi, El-Medany, Ahmed, Ng, Fu Siong, Mandic, Danilo

arXiv.org Machine Learning

Graph Signal Processing (GSP) provides a powerful framework for analysing complex, interconnected systems by modelling data as signals on graphs. Recent advances in GSP have enabled the learning of graph structures from observed signals, but these methods often struggle with time-varying systems and real-time applications. Adaptive filtering techniques, while effective for online learning, have seen limited application in graph topology estimation from a GSP perspective. To this end, we introduce AdaCGP, an online algorithm for adaptive estimation of the Graph Shift Operator (GSO) from multivariate time series. The GSO is estimated from an adaptive time-vertex autoregressive model through recursive update formulae designed to address sparsity, shift-invariance and bias. Through simulations, we show that AdaCGP performs consistently well across various graph topologies, and achieves improvements in excess of 82% for GSO estimation compared to baseline adaptive vector autoregressive models. In addition, our online variable splitting approach for enforcing sparsity enables near-perfect precision in identifying causal connections while maintaining low false positive rates upon optimisation of the forecast error. Finally, AdaCGP's ability to track changes in graph structure is demonstrated on recordings of ventricular fibrillation dynamics in response to an anti-arrhythmic drug. AdaCGP is shown to be able to identify the stability of critical conduction patterns that may be maintaining the arrhythmia in an intuitive way, together with its potential to support diagnosis and treatment strategies.


Searching for Best Practices in Medical Transcription with Large Language Model

Li, Jiafeng, Mu, Yanda

arXiv.org Artificial Intelligence

The transcription of medical monologues, especially those containing a high density of specialized terminology and delivered with a distinct accent, presents a significant challenge for existing automated systems. This paper introduces a novel approach leveraging a Large Language Model (LLM) to generate highly accurate medical transcripts from audio recordings of doctors' monologues, specifically focusing on Indian accents. Our methodology integrates advanced language modeling techniques to lower the Word Error Rate (WER) and ensure the precise recognition of critical medical terms. Through rigorous testing on a comprehensive dataset of medical recordings, our approach demonstrates substantial improvements in both overall transcription accuracy and the fidelity of key medical terminologies. These results suggest that our proposed system could significantly aid in clinical documentation processes, offering a reliable tool for healthcare providers to streamline their transcription needs while maintaining high standards of accuracy.


Multiclass Arrhythmia Classification using Smartwatch Photoplethysmography Signals Collected in Real-life Settings

Han, Dong, Moon, Jihye, Díaz, Luís Roberto Mercado, Chen, Darren, Williams, Devan, Ding, Eric Y., Tran, Khanh-Van, McManus, David D., Chon, Ki H.

arXiv.org Artificial Intelligence

Most deep learning models of multiclass arrhythmia classification are tested on fingertip photoplethysmographic (PPG) data, which has higher signal-to-noise ratios compared to smartwatch-derived PPG, and the best reported sensitivity value for premature atrial/ventricular contraction (PAC/PVC) detection is only 75%. To improve upon PAC/PVC detection sensitivity while maintaining high AF detection, we use multi-modal data which incorporates 1D PPG, accelerometers, and heart rate data as the inputs to a computationally efficient 1D bi-directional Gated Recurrent Unit (1D-Bi-GRU) model to detect three arrhythmia classes. We used motion-artifact prone smartwatch PPG data from the NIH-funded Pulsewatch clinical trial. Our multimodal model tested on 72 subjects achieved an unprecedented 83% sensitivity for PAC/PVC detection while maintaining a high accuracy of 97.31% for AF detection. These results outperformed the best state-of-the-art model by 20.81% for PAC/PVC and 2.55% for AF detection even while our model was computationally more efficient (14 times lighter and 2.7 faster).


SHDB-AF: a Japanese Holter ECG database of atrial fibrillation

Tsutsui, Kenta, Brimer, Shany Biton, Ben-Moshe, Noam, Sellal, Jean Marc, Oster, Julien, Mori, Hitoshi, Ikeda, Yoshifumi, Arai, Takahide, Nakano, Shintaro, Kato, Ritsushi, Behar, Joachim A.

arXiv.org Artificial Intelligence

Atrial fibrillation (AF) is a common atrial arrhythmia that impairs quality of life and causes embolic stroke, heart failure and other complications. Recent advancements in machine learning (ML) and deep learning (DL) have shown potential for enhancing diagnostic accuracy. It is essential for DL models to be robust and generalizable across variations in ethnicity, age, sex, and other factors. Although a number of ECG database have been made available to the research community, none 1 includes a Japanese population sample. Saitama Heart Database Atrial Fibrillation (SHDB-AF) is a novel open-sourced Holter ECG database from Japan, containing data from 100 unique patients with paroxysmal AF. Each record in SHDB-AF is 24 hours long and sampled at 200 Hz, totaling 24 million seconds of ECG data.