Does Knowledge Distillation Really Work? Samuel Stanton NYU Pavel Izmailov NYU Polina Kirichenko NYU Alexander A. Alemi Google Research Andrew Gordon Wilson NYU

Neural Information Processing Systems 

Large, deep networks can learn representations that generalize well.

Similar Docs  Excel Report  more

TitleSimilaritySource
None found