Neural Information Processing Systems 

Knowledge distillation (KD) has been widely used to improve the test accuracy of a "student" network, by training it to mimic the soft probabilities of a trained
