How to Protect Models against Adversarial Unlearning?

Jasiorski, Patryk, Klonowski, Marek, Woźniak, Michał

Jul-16-2025–arXiv.org Artificial Intelligence

AI models may need to unlearn data to comply with legal acts such as the AI Act or the GDPR, and also to remove toxic content, reduce bias, undo the impact of malicious instances, or adapt to changes in the data distribution in which a model operates. Unfortunately, removing knowledge may cause undesirable side effects, such as a deterioration in model performance. In this paper, we investigate the problem of adversarial unlearning, where a malicious party intentionally sends unlearning requests chosen to degrade the model's performance as much as possible. We show that this phenomenon and the adversary's capabilities depend on many factors, primarily on the backbone model itself and on the strategy and limitations in selecting the data to be unlearned. The main result of this work is a new method of protecting model performance from these side effects, both when unlearning results from spontaneous processes and when it results from adversarial actions.
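To make the threat model concrete, the following is a minimal toy sketch of adversarial unlearning. It is not the paper's method or experimental setup. Everything in it is an assumption made for illustration: the 1-D dataset, the nearest-centroid classifier, exact unlearning by retraining from scratch, a request budget of three samples, and an adversary strong enough to score every possible request against held-out data.

```python
from itertools import combinations
from statistics import mean

# Hypothetical toy data: (feature, label) pairs, not taken from the paper.
TRAIN = [(0.0, 0), (1.0, 0), (2.0, 0), (3.0, 0), (4.0, 0),
         (6.0, 1), (7.0, 1), (8.0, 1), (9.0, 1), (10.0, 1)]
TEST = [(1.5, 0), (2.5, 0), (3.5, 0), (4.5, 0),
        (5.5, 1), (6.5, 1), (7.5, 1), (8.5, 1)]


def fit(samples):
    """Nearest-centroid 'training': store one mean per class."""
    return {label: mean(x for x, y in samples if y == label) for label in (0, 1)}


def accuracy(centroids, samples):
    """Fraction of samples whose nearest centroid matches their label."""
    hits = sum(min(centroids, key=lambda c: abs(x - centroids[c])) == y
               for x, y in samples)
    return hits / len(samples)


def unlearn(samples, forget):
    """Exact unlearning (assumed here): retrain without the forgotten indices."""
    kept = [s for i, s in enumerate(samples) if i not in forget]
    return fit(kept)


def evaluate_requests(budget):
    """Score every possible unlearning request of exactly `budget` samples."""
    return {forget: accuracy(unlearn(TRAIN, forget), TEST)
            for forget in combinations(range(len(TRAIN)), budget)}


scores = evaluate_requests(budget=3)
full_acc = accuracy(fit(TRAIN), TEST)         # model before any unlearning
avg_acc = mean(scores.values())               # expected accuracy for a uniformly random request
worst_request = min(scores, key=scores.get)   # the adversary's chosen request
worst_acc = scores[worst_request]

print(f"before unlearning: {full_acc:.3f}")
print(f"random request (mean): {avg_acc:.3f}")
print(f"adversarial request {worst_request}: {worst_acc:.3f}")
```

In this toy setting the adversarially chosen request lowers held-out accuracy below what a random request of the same size yields on average, which mirrors the gap between spontaneous and adversarial unlearning described above. How large that gap is for real models depends, as the abstract notes, on the backbone model and on the constraints placed on the requests.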

artificial intelligence, data mining, machine learning

Country:
- North America > United States > California (0.68)

Genre:
- Research Report > New Finding (0.46)

Industry:
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Government (1.00)

Technology:
- Information Technology
  - Security & Privacy (1.00)
  - Artificial Intelligence > Machine Learning (1.00)
  - Data Science > Data Mining (0.93)
