Knowledge Distillation from A Stronger Teacher
Tao Huang 1,2, Shan You 1, Fei Wang 3, Chen Qian
Neural Information Processing Systems
We empirically find that the discrepancy between the predictions of the student and those of a stronger teacher tends to be more severe.
Aug-19-2025, 09:01:21 GMT