A Multi-Task Foundation Model for Wireless Channel Representation Using Contrastive and Masked Autoencoder Learning

Guler, Berkay, Geraci, Giovanni, Jafarkhani, Hamid

arXiv.org Artificial Intelligence 

This work has been submitted to the IEEE for possible publication.

Abstract--Current applications of self-supervised learning to wireless channel representation often borrow paradigms developed for text and image processing, without fully addressing the unique characteristics and constraints of wireless communications. To bridge this gap, we introduce ContraWiMAE, a Wireless Contrastive Masked Autoencoder: a transformer-based foundation model that unifies masked reconstruction and masked contrastive learning for wireless channel representation. Our key innovation is a new wireless-inspired contrastive objective that exploits the inherent characteristics of the wireless environment, including noise, fading, and partial observability, as natural augmentation. Through extensive evaluation on unseen scenarios and conditions, we demonstrate our method's effectiveness in multiple downstream tasks, including cross-frequency beam selection, line-of-sight detection, and channel estimation. ContraWiMAE exhibits superior linear separability and adaptability in diverse wireless environments, demonstrating exceptional data efficiency and competitive performance compared with supervised baselines under challenging conditions. Comparative evaluations against a state-of-the-art wireless channel foundation model confirm the superior performance and data efficiency of our approach, highlighting its potential as a powerful baseline for future research in self-supervised wireless channel representation learning. To foster further work in this direction, we release the model weights and training pipeline for ContraWiMAE.

Large-scale self-supervised pretraining has transformed the fields of natural language processing and computer vision. This paradigm leverages diverse datasets and proxy objectives to learn broadly transferable representations, in contrast to traditional task-specific training approaches [2]-[4]. By decoupling feature learning from downstream tasks, it enables efficient, task-specific adaptation. Models following this two-stage strategy--computationally intensive pretraining followed by lightweight adaptation--are commonly referred to as foundation models [5].
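The abstract describes a pretraining objective that combines masked reconstruction with a contrastive loss whose positive pairs arise from the channel's own impairments (noise and partial observability) rather than handcrafted augmentations. The exact formulation is not given here, so the following is only a minimal PyTorch sketch of that general idea; the linear encoder/decoder stand-ins, SNR level, mask ratio, and loss weighting are placeholder assumptions, not the authors' implementation.

import torch
import torch.nn.functional as F

def wireless_views(H, snr_db=10.0, mask_ratio=0.6):
    # Two "natural" views of a channel tensor H of shape (batch, tokens, dim):
    # additive noise emulates receiver noise, random token masking emulates
    # partial observability. SNR and mask ratio are illustrative assumptions.
    def one_view(x):
        sig_pow = x.pow(2).mean(dim=(-2, -1), keepdim=True)
        noise = torch.randn_like(x) * (sig_pow / 10 ** (snr_db / 10)).sqrt()
        x = x + noise
        keep = torch.rand(x.shape[:2], device=x.device) > mask_ratio
        return x * keep.unsqueeze(-1), keep
    return one_view(H), one_view(H)

def info_nce(z1, z2, tau=0.1):
    # Standard InfoNCE: the two views of the same channel are positives,
    # other samples in the batch serve as negatives.
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / tau
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)

# Hypothetical stand-ins; the actual model is a transformer encoder/decoder.
encoder = torch.nn.Linear(32, 64)
decoder = torch.nn.Linear(64, 32)

H = torch.randn(8, 16, 32)                       # (batch, tokens, feature_dim)
(v1, keep1), (v2, keep2) = wireless_views(H)

e1, e2 = encoder(v1), encoder(v2)
recon = decoder(e1)

# Masked reconstruction loss, computed only on the unobserved tokens
# (simplified relative to a full MAE, which decodes from visible tokens).
mask1 = (~keep1).unsqueeze(-1).float()
loss_rec = ((recon - H) ** 2 * mask1).sum() / mask1.sum().clamp(min=1)

# Contrastive loss on mean-pooled token embeddings of the two views.
loss_con = info_nce(e1.mean(dim=1), e2.mean(dim=1))

loss = loss_rec + 0.5 * loss_con                 # weighting is an assumption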
