Zero-shot Knowledge Transfer via Adversarial Belief Matching

Paul Micaelli, Amos J. Storkey

Neural Information Processing Systems

However, due to growing dataset sizes and stricter privacy regulations, it is increasingly common not to have access to the data that was used to train the teacher. We propose a novel method which trains a student to match the predictions of its teacher without using any data or metadata. We achieve this by training an adversarial generator to search for images on which the student poorly matches the teacher, and then using them to train the student.
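
The adversarial loop described above can be illustrated with a toy sketch. This is not the authors' implementation. It assumes the teacher-student mismatch is measured with a KL divergence, replaces the learned generator with a plain random search over candidate inputs, and uses small linear softmax classifiers in place of image networks. All names (`hardest_input`, `student_step`, `mean_mismatch`) and all constants are hypothetical.

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(weights, x):
    # Linear classifier: one weight row per class, no bias term.
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in weights])

def kl_divergence(p, q):
    # KL(p || q) for two discrete distributions with q > 0 everywhere.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def hardest_input(teacher, student, rng, n_candidates=32, dim=2):
    # Stand-in for the adversarial generator (an assumption of this sketch):
    # sample random inputs and keep the one with the largest mismatch.
    candidates = [[rng.uniform(-1, 1) for _ in range(dim)]
                  for _ in range(n_candidates)]
    return max(candidates,
               key=lambda x: kl_divergence(predict(teacher, x),
                                           predict(student, x)))

def student_step(teacher, student, x, lr=0.5):
    # One gradient step on KL(teacher || student) for a linear student.
    # The gradient with respect to the student logits is (s - t).
    t = predict(teacher, x)
    s = predict(student, x)
    for k, row in enumerate(student):
        for j in range(len(row)):
            row[j] -= lr * (s[k] - t[k]) * x[j]

def mean_mismatch(teacher, student, probes):
    return sum(kl_divergence(predict(teacher, x), predict(student, x))
               for x in probes) / len(probes)

rng = random.Random(0)
teacher = [[2.0, -1.0], [-1.5, 1.0], [0.5, 0.5]]  # fixed, made-up teacher
student = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]    # untrained student
probes = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(200)]

before = mean_mismatch(teacher, student, probes)
for _ in range(500):
    x = hardest_input(teacher, student, rng)  # search for a hard input
    student_step(teacher, student, x)         # train the student on it
after = mean_mismatch(teacher, student, probes)
```

No training data is touched at any point: the student only ever sees inputs chosen because it currently disagrees with the teacher on them. In this toy setting the mean mismatch on the random probe points ends up lower than where it started.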






When does label smoothing help?

Rafael Müller, Simon Kornblith, Geoffrey E. Hinton

Neural Information Processing Systems

To explain these observations, we visualize how label smoothing changes the representations learned by the penultimate layer of the network. We show that label smoothing encourages the representations of training examples from the same class to group in tight clusters. This results in loss of information in the logits about resemblances between instances of different classes, which is necessary for distillation, but does not hurt generalization or calibration of the model's predictions.
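
For readers unfamiliar with the technique, here is a minimal sketch of label smoothing in its usual form: the one-hot target is mixed with a uniform distribution over the classes. The smoothing weight `alpha=0.1` and the helper names are illustrative assumptions, not values or code taken from the paper.

```python
import math

def smooth_labels(label, num_classes, alpha=0.1):
    # Mix the one-hot target with a uniform distribution:
    # target_k = (1 - alpha) * [k == label] + alpha / num_classes
    uniform = alpha / num_classes
    return [(1.0 - alpha) + uniform if k == label else uniform
            for k in range(num_classes)]

def cross_entropy(target, logits):
    # Cross-entropy between a target distribution and softmax(logits),
    # computed with a numerically stable log-sum-exp.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    return -sum(t * (z - log_z) for t, z in zip(target, logits))

hard = smooth_labels(1, 4, alpha=0.0)  # plain one-hot target
soft = smooth_labels(1, 4, alpha=0.1)  # smoothed target

confident_logits = [0.0, 20.0, 0.0, 0.0]
loss_hard = cross_entropy(hard, confident_logits)
loss_soft = cross_entropy(soft, confident_logits)
```

With the smoothed target, a very large logit for the correct class is penalized rather than rewarded, so `loss_soft` exceeds `loss_hard` for the over-confident logits above.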