Causal Fuzzing for Verifying Machine Unlearning
Anna Mazhar, Sainyam Galhotra
arXiv.org Artificial Intelligence
As machine learning models become increasingly embedded in decision-making systems, the ability to "unlearn" targeted data or features is crucial for enhancing model adaptability, fairness, and privacy without resorting to expensive retraining. To effectively guide machine unlearning, thorough testing is essential. Existing methods for verifying machine unlearning provide limited insights, often failing in scenarios where the influence of the unlearning target is indirect. In this work, we propose CAFÉ, a new causality-based framework that unifies datapoint- and feature-level unlearning for the verification of black-box ML models. CAFÉ evaluates both direct and indirect effects of unlearning targets through causal dependencies, providing actionable insights with fine-grained analysis. Our evaluation across five datasets and three model architectures demonstrates that CAFÉ successfully detects residual influence missed by baselines while maintaining computational efficiency.
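The distinction between direct and indirect residual influence that the abstract highlights can be illustrated with a minimal sketch. This is not the authors' CAFÉ implementation; the toy model, the causal edge x0 → x1, and all function names are invented for illustration. The idea: a model that was supposed to unlearn feature x0 may pass a naive perturbation test (holding everything else fixed) yet still depend on x0 through a causally downstream proxy.

```python
# Toy causal graph: x0 -> x1, where x1 is a proxy for the unlearning target x0.
# The "unlearned" model ignores x0 directly but still reads the proxy x1,
# so x0's influence is indirect -- the case naive perturbation tests miss.

def unlearned_model(x0, x1):
    # Depends on x0 only through x1.
    return 1.0 if x1 > 0.5 else 0.0

def sample_point(x0):
    # Causal mechanism: x1 is determined by x0.
    x1 = x0
    return x0, x1

def direct_effect(model, x0):
    # Perturb x0 while holding x1 fixed (no causal propagation).
    _, x1 = sample_point(x0)
    return abs(model(1 - x0, x1) - model(x0, x1))

def total_effect(model, x0):
    # Intervene on x0 and propagate the change through the edge x0 -> x1.
    base = model(*sample_point(x0))
    intervened = model(*sample_point(1 - x0))
    return abs(intervened - base)

if __name__ == "__main__":
    print("direct effect:", direct_effect(unlearned_model, 0))  # 0.0 -> looks unlearned
    print("total effect :", total_effect(unlearned_model, 0))   # 1.0 -> residual influence
```

A verifier that only checks the direct effect would certify this model as having forgotten x0; accounting for causal dependencies exposes the remaining indirect influence, which is the gap a causality-based framework like CAFÉ is designed to close.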
Sep-23-2025
- Genre:
- Research Report > New Finding (0.67)
- Industry:
- Health & Medicine (1.00)
- Information Technology > Security & Privacy (0.93)
- Law (0.93)