SCU: An Efficient Machine Unlearning Scheme for Deep Learning Enabled Semantic Communications

Wang, Weiqi, Tian, Zhiyi, Zhang, Chenhan, Yu, Shui

Feb-27-2025–arXiv.org Artificial Intelligence

--Deep learning (DL) enabled semantic communications leverage DL to train encoders and decoders (codecs) to extract and recover semantic information. However, most semantic training datasets contain personal private information. Such concerns call for enormous requirements for specified data erasure from semantic codecs when previous users hope to move their data from the semantic system. Existing machine unlearning solutions remove data contribution from trained models, yet usually in supervised sole model scenarios. These methods are infeasible in semantic communications that often need to jointly train unsupervised encoders and decoders. In this paper, we investigate the unlearning problem in DL-enabled semantic communications and propose a semantic communication unlearning (SCU) scheme to tackle the problem. SCU includes two key components. Firstly, we customize the joint unlearning method for semantic codecs, including the encoder and decoder, by minimizing mutual information between the learned semantic representation and the erased samples. Secondly, to compensate for semantic model utility degradation caused by unlearning, we propose a contrastive compensation method, which considers the erased data as the negative samples and the remaining data as the positive samples to retrain the unlearned semantic models con-trastively. Theoretical analysis and extensive experimental results on three representative datasets demonstrate the effectiveness and efficiency of our proposed methods. EMANTIC communication has attracted significant attention recently. It is regarded as a significant advancement beyond the Shannon paradigm, as semantic communication focuses on transmitting the underlying semantic information from the source, rather than ensuring the accurate reception of each individual symbol or bit irrespective of its meaning [1, 2]. With the burgeoning advancement of deep learning (DL), researchers found that employing DL models as the encoder and decoder greatly improves semantic transmission efficiency and reliability [3, 4], called DL-enabled semantic communications. However, to train these DL semantic encoders and decoders, transmitters and receivers must first collect the training datasets from huge amounts of human activities from users [1], which contain rich personal privacy information. This paper was supported in part by Australia ARC LP220100453, ARC DP200101374, and ARC DP240100955. W . Wang, Z. Tian and S. Y u are with the School of Computer Science, University of Technology Sydney, Australia. In healthcare scenarios, the server needs to collect users' sensitive information, such as blood pressure, heart rate, etc, for SC model training. Users also benefit from the downstream applications when the SC models are well-trained.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

Feb-27-2025

arXiv.org PDF

Add feedback

Country:
- Asia (0.68)
- Oceania > Australia
  - New South Wales > Sydney (0.24)

Genre:
- Research Report > New Finding (0.67)

Industry:
- Health & Medicine (1.00)
- Information Technology > Security & Privacy (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
  - Natural Language > Text Processing (1.00)
  - Representation & Reasoning (1.00)