 sleep staging



Resource Efficient Sleep Staging via Multi-Level Masking and Prompt Learning

Ai, Lejun, Li, Yulong, Yi, Haodong, Xie, Jixuan, Wang, Yue, Liu, Jia, Chen, Min, Wang, Rui

arXiv.org Artificial Intelligence

Automatic sleep staging plays a vital role in assessing sleep quality and diagnosing sleep disorders. Most existing methods rely heavily on long and continuous EEG recordings, which poses significant challenges for data acquisition in resource-constrained systems, such as wearable or home-based monitoring systems. In this paper, we propose the task of resource-efficient sleep staging, which aims to reduce the amount of signal collected per sleep epoch while maintaining reliable classification performance. To solve this task, we adopt the masking and prompt learning strategy and propose a novel framework called Mask-Aware Sleep Staging (MASS). Specifically, we design a multi-level masking strategy to promote effective feature modeling under partial and irregular observations. To mitigate the loss of contextual information introduced by masking, we further propose a hierarchical prompt learning mechanism that aggregates unmasked data into a global prompt, serving as a semantic anchor for guiding both patch-level and epoch-level feature modeling. MASS is evaluated on four datasets, demonstrating state-of-the-art performance, especially when the amount of data is very limited. This result highlights its potential for efficient and scalable deployment in real-world low-resource sleep monitoring environments.
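To make the idea of learning from partial observations concrete, the following is a minimal sketch of random patch-level masking on a single epoch. It is not the MASS multi-level strategy, whose details the abstract does not give; the function name `mask_patches`, the patch length, and the keep ratio are all hypothetical choices made for illustration.

```python
import random


def mask_patches(epoch, patch_len, keep_ratio, seed=None):
    """Keep a random subset of fixed-length patches from one epoch.

    Hypothetical illustration only: the abstract does not specify how
    MASS selects what to mask. Returns the indices of the kept patches
    and the kept patches themselves, in temporal order.
    """
    if patch_len <= 0 or not 0.0 < keep_ratio <= 1.0:
        raise ValueError("patch_len must be > 0 and keep_ratio in (0, 1]")
    n_patches = len(epoch) // patch_len  # trailing partial patch is dropped
    if n_patches == 0:
        raise ValueError("epoch is shorter than one patch")
    n_keep = max(1, round(n_patches * keep_ratio))
    rng = random.Random(seed)
    kept = sorted(rng.sample(range(n_patches), n_keep))
    patches = [epoch[i * patch_len:(i + 1) * patch_len] for i in kept]
    return kept, patches


# Assumed setup: a 30 s epoch at 100 Hz (3000 samples), 1 s patches,
# 20% of the patches observed.
epoch = list(range(3000))
kept, patches = mask_patches(epoch, patch_len=100, keep_ratio=0.2, seed=0)
print(len(kept), "of", len(epoch) // 100, "patches kept")
```

A downstream model would then see only the kept patches together with their indices, which is the kind of partial and irregular observation the abstract refers to.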




NeuroLingua: A Language-Inspired Hierarchical Framework for Multimodal Sleep Stage Classification Using EEG and EOG

Samaee, Mahdi, Yazdi, Mehran, Massicotte, Daniel

arXiv.org Artificial Intelligence

We propose NeuroLingua, a language-inspired framework that conceptualizes sleep as a structured physiological language. Each 30-second epoch is decomposed into overlapping 3-second subwindows ("tokens") using a CNN-based tokenizer, enabling hierarchical temporal modeling through dual-level Transformers: intra-segment encoding of local dependencies and inter-segment integration across seven consecutive epochs (3.5 minutes) for extended context. Modality-specific embeddings from EEG and EOG channels are fused via a Graph Convolutional Network, facilitating robust multimodal integration. NeuroLingua is evaluated on the Sleep-EDF Expanded and ISRUC-Sleep datasets, achieving state-of-the-art results on Sleep-EDF (85.3% accuracy, 0.800 macro F1, and 0.796 Cohen's κ), and competitive performance on ISRUC (81.9% accuracy, 0.802 macro F1, and 0.755 κ), matching or exceeding published baselines in overall and per-class metrics. The architecture's attention mechanisms enhance the detection of clinically relevant sleep microevents, providing a principled foundation for future interpretability, explainability and causal inference in sleep research. By framing sleep as a compositional language, NeuroLingua unifies hierarchical sequence modeling and multimodal fusion, advancing automated sleep staging toward more transparent and clinically meaningful applications. Index Terms: Sleep staging, EEG, EOG, Polysomnography, Deep learning, Hierarchical sequence modeling, Multimodal fusion, Transformers, Graph neural networks, Interpretability, Explainability, Causal inference.
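The decomposition of a 30-second epoch into overlapping 3-second subwindows can be sketched as a plain sliding window. This shows only the windowing arithmetic, not the CNN-based tokenizer itself. The abstract gives neither the sampling rate nor the amount of overlap, so the 100 Hz rate and the 1.5-second stride below are assumptions, and `split_epoch` is a hypothetical helper name.

```python
def split_epoch(epoch, fs=100, win_s=3.0, stride_s=1.5):
    """Split one epoch into overlapping fixed-length subwindows.

    fs (Hz) and stride_s (seconds) are assumed values; only the 30 s
    epoch length and 3 s window length come from the abstract.
    """
    win = int(round(win_s * fs))
    stride = int(round(stride_s * fs))
    if win <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if len(epoch) < win:
        return []
    # Start a window every `stride` samples while a full window still fits.
    return [epoch[i:i + win] for i in range(0, len(epoch) - win + 1, stride)]


epoch = list(range(30 * 100))   # 30 s at the assumed 100 Hz
tokens = split_epoch(epoch)
print(len(tokens))              # (3000 - 300) / 150 + 1 = 19 subwindows
print(7 * 30 / 60)              # seven 30 s epochs span 3.5 minutes
```

Under these assumptions each epoch yields 19 subwindows; a different overlap changes only the stride and therefore the token count.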


Transformer-Based Sleep Stage Classification Enhanced by Clinical Information

Chung, Woosuk, Hong, Seokwoo, Lee, Wonhyeok, Bae, Sangyoon

arXiv.org Artificial Intelligence

Manual sleep staging from polysomnography (PSG) is labor-intensive and prone to inter-scorer variability. While recent deep learning models have advanced automated staging, most rely solely on raw PSG signals and neglect contextual cues used by human experts. We propose a two-stage architecture that combines a Transformer-based per-epoch encoder with a 1D CNN aggregator, and systematically investigate the effect of incorporating explicit context: subject-level clinical metadata (age, sex, BMI) and per-epoch expert event annotations (apneas, desaturations, arousals, periodic breathing). Using the Sleep Heart Health Study (SHHS) cohort (n=8,357), we demonstrate that contextual fusion substantially improves staging accuracy. Compared to a PSG-only baseline (macro-F1 0.7745, micro-F1 0.8774), our final model achieves macro-F1 0.8031 and micro-F1 0.9051, with event annotations contributing the largest gains. Notably, feature fusion outperforms multi-task alternatives that predict the same auxiliary labels. These results highlight that augmenting learned representations with clinically meaningful features enhances both performance and interpretability, without modifying the PSG montage or requiring additional sensors. Our findings support a practical and scalable path toward context-aware, expert-aligned sleep staging systems.
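Because the results above are reported as both macro-F1 and micro-F1, a small sketch of how the two differ may help. It uses the standard definitions for single-label multi-class data; the function name `f1_scores` and the toy labels are illustrative and are not taken from the paper.

```python
def f1_scores(y_true, y_pred):
    """Return (macro_f1, micro_f1) for single-label multi-class data."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    labels = sorted(set(y_true) | set(y_pred))
    tp = {c: 0 for c in labels}
    fp = {c: 0 for c in labels}
    fn = {c: 0 for c in labels}
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    # Macro-F1: unweighted mean of per-class F1, so rare stages count
    # as much as common ones.
    per_class = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        per_class.append(2 * tp[c] / denom if denom else 0.0)
    macro = sum(per_class) / len(labels)
    # Micro-F1: pool the counts over all classes first.
    tp_all, fp_all, fn_all = sum(tp.values()), sum(fp.values()), sum(fn.values())
    micro = 2 * tp_all / (2 * tp_all + fp_all + fn_all)
    return macro, micro


# Toy example with an imbalanced stage distribution (illustrative labels).
y_true = ["W", "W", "W", "W", "N1", "N2"]
y_pred = ["W", "W", "W", "W", "N2", "N2"]
macro, micro = f1_scores(y_true, y_pred)
print(round(macro, 4), round(micro, 4))
```

In the single-label setting micro-F1 equals plain accuracy, while macro-F1 drops sharply when a minority class such as N1 is missed, which is why the two figures in a sleep staging report can differ noticeably.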


A Systematic Evaluation of Self-Supervised Learning for Label-Efficient Sleep Staging with Wearable EEG

Estevan, Emilio, Sierra-Torralba, María, López-Larraz, Eduardo, Montesano, Luis

arXiv.org Artificial Intelligence

Wearable EEG devices have emerged as a promising alternative to polysomnography (PSG). As affordable and scalable solutions, their widespread adoption results in the collection of massive volumes of unlabeled data that cannot be analyzed by clinicians at scale. Meanwhile, the recent success of deep learning for sleep scoring has relied on large annotated datasets. Self-supervised learning (SSL) offers an opportunity to bridge this gap, leveraging unlabeled signals to address label scarcity and reduce annotation effort. In this paper, we present the first systematic evaluation of SSL for sleep staging using wearable EEG. We investigate a range of well-established SSL methods and evaluate them on two sleep databases acquired with the Ikon Sleep wearable EEG headband: BOAS, a high-quality benchmark containing PSG and wearable EEG recordings with consensus labels, and HOGAR, a large collection of home-based, self-recorded, and unlabeled recordings. Three evaluation scenarios are defined to study label efficiency, representation quality, and cross-dataset generalization. Results show that SSL consistently improves classification performance by up to 10% over supervised baselines, with gains particularly evident when labeled data is scarce. SSL achieves clinical-grade accuracy above 80% leveraging only 5% to 10% of labeled data, while the supervised approach requires twice the labels. Additionally, SSL representations prove robust to variations in population characteristics, recording environments, and signal quality. Our findings demonstrate the potential of SSL to enable label-efficient sleep staging with wearable EEG, reducing reliance on manual annotations and advancing the development of affordable sleep monitoring systems.


NAP: Attention-Based Late Fusion for Automatic Sleep Staging

Rossi, Alvise Dei, van der Meer, Julia, Schmidt, Markus H., Bassetti, Claudio L. A., Fiorillo, Luigi, Faraci, Francesca

arXiv.org Artificial Intelligence

Polysomnography signals are highly heterogeneous, varying in modality composition (e.g., EEG, EOG, ECG), channel availability (e.g., frontal, occipital EEG), and acquisition protocols across datasets and clinical sites. Most existing models that process polysomnography data rely on a fixed subset of modalities or channels and therefore neglect to fully exploit its inherently multimodal nature. We address this limitation by introducing NAP (Neural Aggregator of Predictions), an attention-based model which learns to combine multiple prediction streams using a tri-axial attention mechanism that captures temporal, spatial, and predictor-level dependencies. NAP is trained to adapt to different input dimensions. By aggregating outputs from frozen, pretrained single-channel models, NAP consistently outperforms individual predictors and simple ensembles, achieving state-of-the-art zero-shot generalization across multiple datasets. While demonstrated in the context of automated sleep staging from polysomnography, the proposed approach could be extended to other multimodal physiological applications.


Consciousness-ECG Transformer for Conscious State Estimation System with Real-Time Monitoring

Kweon, Young-Seok, Shin, Gi-Hwan, Kim, Ji-Yong, Ryu, Bokyeong, Lee, Seong-Whan

arXiv.org Artificial Intelligence

Conscious state estimation is important in various medical settings, including sleep staging and anesthesia management, to ensure patient safety and optimize health outcomes. Traditional methods predominantly utilize electroencephalography (EEG), which faces challenges such as high sensitivity to noise and the requirement for controlled environments. In this study, we propose the consciousness-ECG transformer that leverages electrocardiography (ECG) signals for non-invasive and reliable conscious state estimation. Our approach employs a transformer with decoupled query attention to effectively capture heart rate variability features that distinguish between conscious and unconscious states. We implemented the conscious state estimation system with real-time monitoring and validated our system on datasets involving sleep staging and anesthesia level monitoring during surgeries. Experimental results demonstrate that our model outperforms baseline models, achieving accuracies of 0.877 on sleep staging and 0.880 on anesthesia level monitoring. Moreover, our model achieves the highest area under curve values of 0.786 and 0.895 on sleep staging and anesthesia level monitoring, respectively. The proposed system offers a practical and robust alternative to EEG-based methods, particularly suited for dynamic clinical environments. Our results highlight the potential of ECG-based consciousness monitoring to enhance patient safety and advance our understanding of conscious states.


Leveraging Generic Time Series Foundation Models for EEG Classification

Gnassounou, Théo, Moakher, Yessin, Xie, Shifeng, Feofanov, Vasilii, Redko, Ievgen

arXiv.org Artificial Intelligence

Foundation models for time series are emerging as powerful general-purpose backbones, yet their potential for domain-specific biomedical signals such as electroencephalography (EEG) remains rather unexplored. In this work, we investigate the applicability of a recently proposed time series classification foundation model to different EEG tasks such as motor imagery classification and sleep stage prediction. We test two pretraining regimes: (a) pretraining on heterogeneous real-world time series from multiple domains, and (b) pretraining on purely synthetic data. We find that both variants yield strong performance, consistently outperforming EEGNet, a widely used convolutional baseline, and CBraMod, the most recent EEG-specific foundation model. These results suggest that generalist time series foundation models, even when pretrained on data of non-neural origin or on synthetic signals, can transfer effectively to EEG. Our findings highlight the promise of leveraging cross-domain pretrained models for brain signal analysis, suggesting that EEG may benefit from advances in the broader time series literature.