Students Parrot Their Teachers: Membership Inference on Model Distillation

Matthew Jagielski, Milad Nasr, Christopher Choquette-Choo, Katherine Lee, Nicholas Carlini

arXiv.org Artificial Intelligence 

Model distillation (Hinton et al., 2015) is a common framework for knowledge transfer, where knowledge learned by a "teacher model" is transferred to a "student model" via the teacher's predictions. Distillation is helpful because the teacher's predictions are a more useful guide for the student model than hard labels; this phenomenon has been explained by the teacher's predictions containing some useful "dark knowledge". Variants of model distillation have been proposed for, e.g., model compression (Hinton et al., 2015; Ba & Caruana, 2014; Polino et al., 2018; Kim et al., 2018; Sun et al., 2019) or training more accurate models (Zagoruyko & Komodakis, 2016; Xie et al., 2020). Within the privacy-preserving machine learning community, distillation has been adapted to protect the privacy of a training dataset (Papernot et al., 2016; Tang et al., 2022; Shejwalkar & Houmansadr, 2021; Mazzone et al., 2022). Many of these approaches rely on the intuition that distilling the teacher model serves as a privacy barrier protecting the teacher's training data. Informally, restricting the student to learn only from the teacher's predictions is a form of data minimization, which should result in less private information being fed into, and memorized by, the student. This privacy barrier around the teacher also allows the teacher model to be trained with strong, non-private training approaches, improving the accuracy of both the teacher and the student. Because model distillation does not provide a rigorous privacy guarantee (such as those offered by differential privacy (Dwork et al., 2006)), in our work we evaluate the empirical privacy provided by these approaches.
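To make the setup concrete, the sketch below shows the standard distillation step the abstract refers to (Hinton et al., 2015): the student is trained to match the teacher's temperature-softened predictions rather than hard labels. It assumes PyTorch; the function names, the temperature T, and the loss-threshold membership test are illustrative assumptions. The membership test shown is the generic loss-based baseline (Yeom et al., 2018), included only to indicate what "evaluating empirical privacy" means, not necessarily the attack used in this paper.

```python
# Minimal sketch, assuming PyTorch. All names and hyperparameters here
# are illustrative, not taken from the paper.
import torch
import torch.nn.functional as F

def distillation_step(teacher, student, optimizer, x, T=2.0):
    """One training step: the student learns only from teacher predictions."""
    with torch.no_grad():
        # Temperature-softened teacher outputs: the "dark knowledge".
        teacher_probs = F.softmax(teacher(x) / T, dim=-1)
    student_log_probs = F.log_softmax(student(x) / T, dim=-1)
    # KL divergence between teacher and student distributions,
    # scaled by T^2 as in Hinton et al. (2015).
    loss = F.kl_div(student_log_probs, teacher_probs,
                    reduction="batchmean") * T * T
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

def loss_mia_score(model, x, y):
    """Generic loss-threshold membership inference signal (Yeom et al., 2018):
    a lower loss on (x, y) suggests (x, y) was in the training set. Shown only
    as a baseline illustration of membership inference on the student."""
    with torch.no_grad():
        return -F.cross_entropy(model(x.unsqueeze(0)), y.unsqueeze(0)).item()
```

Note that `distillation_step` never touches the teacher's training labels or data; this is exactly the data-minimization intuition the abstract describes, and the question the paper raises is whether membership signals like `loss_mia_score` nonetheless leak through the student.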
