
Neural Information Processing Systems 

Conventional knowledge distillation (KD) methods propose various designs that allow the student model to imitate the teacher better. However, these handcrafted KD designs rely heavily on expert knowledge and may be sub-optimal for various teacher-student pairs.
