Goto

Collaborating Authors

 Malik, Ashish


Unlearning via Sparse Representations

arXiv.org Artificial Intelligence

Both methods, Unlearning via Activations and Unlearning via Examples, successfully unlearn the forget class while having a negligible effect on the models' performance on the retain set. Importantly, this is achieved without any form of training, retraining, or fine-tuning, as is usually required by other methods. The retain-set test accuracy remains essentially constant across all three datasets, apart from minor fluctuations. This is because the localized, context-dependent sparse updates made during the model's initial training leave the discrete key-representations of different classes well separated from each other, an important prerequisite discussed in (Träuble et al., 2023). Hence, in the case of Unlearning via Examples, all the information about a class can be unlearned by forgetting only a subset of the forget-class training data, making the method highly data-efficient.
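
The following is a minimal sketch, not the authors' code, of the idea behind Unlearning via Examples on a toy discrete key-value bottleneck: the key-value pairs selected by forget-class examples are simply deactivated, with no retraining. The names (`keys`, `values`, `select_keys`, `unlearn_via_examples`) are hypothetical, and a random array stands in for the encoder's features.

```python
# Toy illustration of example-based unlearning in a discrete key-value
# bottleneck. Assumption: keys used by different classes are well
# separated, so disabling the hit keys removes only the forget class.
import numpy as np

rng = np.random.default_rng(0)

num_keys, dim = 64, 16
keys = rng.normal(size=(num_keys, dim))    # discrete key codebook
values = rng.normal(size=(num_keys, dim))  # value vectors paired with keys
active = np.ones(num_keys, dtype=bool)     # mask of usable key-value pairs


def select_keys(features):
    """Nearest-key lookup, restricted to key-value pairs still active."""
    dists = np.linalg.norm(features[:, None, :] - keys[None, :, :], axis=-1)
    dists[:, ~active] = np.inf
    return dists.argmin(axis=1)


def unlearn_via_examples(forget_features):
    """Deactivate every key-value pair selected by the forget-set examples."""
    hit = np.unique(select_keys(forget_features))
    active[hit] = False
    return hit


# Usage: random "encoder features" standing in for forget-class inputs.
forget_feats = rng.normal(size=(32, dim))
removed = unlearn_via_examples(forget_feats)
print(f"disabled {removed.size} of {num_keys} key-value pairs")
```

Because only a mask over the codebook changes, the retained classes' key-value pairs, and hence their predictions, are left untouched, which mirrors the data efficiency and training-free nature described above.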