Goto

Collaborating Authors

 Wu, Yongpeng


Learnable Residual-based Latent Denoising in Semantic Communication

arXiv.org Artificial Intelligence

A latent denoising semantic communication (SemCom) framework is proposed for robust image transmission over noisy channels. By incorporating a learnable latent denoiser into the receiver, the received signals are preprocessed to effectively remove the channel noise and recover the semantic information, thereby enhancing the quality of the decoded images. Specifically, a latent denoising mapping is established by an iterative residual learning approach to improve the denoising efficiency while ensuring stable performance. Moreover, channel signal-to-noise ratio (SNR) is utilized to estimate and predict the latent similarity score (SS) for conditional denoising, where the number of denoising steps is adapted based on the predicted SS sequence, further reducing the communication latency. Finally, simulations demonstrate that the proposed framework can effectively and efficiently remove the channel noise at various levels and reconstruct visual-appealing images.


Multi-modal Iterative and Deep Fusion Frameworks for Enhanced Passive DOA Sensing via a Green Massive H2AD MIMO Receiver

arXiv.org Artificial Intelligence

Most existing DOA estimation methods assume ideal source incident angles with minimal noise. Moreover, directly using pre-estimated angles to calculate weighted coefficients can lead to performance loss. Thus, a green multi-modal (MM) fusion DOA framework is proposed to realize a more practical, low-cost and high time-efficiency DOA estimation for a H$^2$AD array. Firstly, two more efficient clustering methods, global maximum cos\_similarity clustering (GMaxCS) and global minimum distance clustering (GMinD), are presented to infer more precise true solutions from the candidate solution sets. Based on this, an iteration weighted fusion (IWF)-based method is introduced to iteratively update weighted fusion coefficients and the clustering center of the true solution classes by using the estimated values. Particularly, the coarse DOA calculated by fully digital (FD) subarray, serves as the initial cluster center. The above process yields two methods called MM-IWF-GMaxCS and MM-IWF-GMinD. To further provide a higher-accuracy DOA estimation, a fusion network (fusionNet) is proposed to aggregate the inferred two-part true angles and thus generates two effective approaches called MM-fusionNet-GMaxCS and MM-fusionNet-GMinD. The simulation outcomes show the proposed four approaches can achieve the ideal DOA performance and the CRLB. Meanwhile, proposed MM-fusionNet-GMaxCS and MM-fusionNet-GMinD exhibit superior DOA performance compared to MM-IWF-GMaxCS and MM-IWF-GMinD, especially in extremely-low SNR range.


Robust Image Semantic Coding with Learnable CSI Fusion Masking over MIMO Fading Channels

arXiv.org Artificial Intelligence

Though achieving marvelous progress in various scenarios, existing semantic communication frameworks mainly consider single-input single-output Gaussian channels or Rayleigh fading channels, neglecting the widely-used multiple-input multiple-output (MIMO) channels, which hinders the application into practical systems. One common solution to combat MIMO fading is to utilize feedback MIMO channel state information (CSI). In this paper, we incorporate MIMO CSI into system designs from a new perspective and propose the learnable CSI fusion semantic communication (LCFSC) framework, where CSI is treated as side information by the semantic extractor to enhance the semantic coding. To avoid feature fusion due to abrupt combination of CSI with features, we present a non-invasive CSI fusion multi-head attention module inside the Swin Transformer. With the learned attention masking map determined by both source and channel states, more robust attention distribution could be generated. Furthermore, the percentage of mask elements could be flexibly adjusted by the learnable mask ratio, which is produced based on the conditional variational interference in an unsupervised manner. In this way, CSI-aware semantic coding is achieved through learnable CSI fusion masking. Experiment results testify the superiority of LCFSC over traditional schemes and state-of-the-art Swin Transformer-based semantic communication frameworks in MIMO fading channels.


Communication-Efficient Framework for Distributed Image Semantic Wireless Transmission

arXiv.org Artificial Intelligence

Multi-node communication, which refers to the interaction among multiple devices, has attracted lots of attention in many Internet-of-Things (IoT) scenarios. However, its huge amounts of data flows and inflexibility for task extension have triggered the urgent requirement of communication-efficient distributed data transmission frameworks. In this paper, inspired by the great superiorities on bandwidth reduction and task adaptation of semantic communications, we propose a federated learning-based semantic communication (FLSC) framework for multi-task distributed image transmission with IoT devices. Federated learning enables the design of independent semantic communication link of each user while further improves the semantic extraction and task performance through global aggregation. Each link in FLSC is composed of a hierarchical vision transformer (HVT)-based extractor and a task-adaptive translator for coarse-to-fine semantic extraction and meaning translation according to specific tasks. In order to extend the FLSC into more realistic conditions, we design a channel state information-based multiple-input multiple-output transmission module to combat channel fading and noise. Simulation results show that the coarse semantic information can deal with a range of image-level tasks. Moreover, especially in low signal-to-noise ratio and channel bandwidth ratio regimes, FLSC evidently outperforms the traditional scheme, e.g. about 10 peak signal-to-noise ratio gain in the 3 dB channel condition.