KD-Zero: Evolving Knowledge Distiller for Any Teacher-Student Pairs

Neural Information Processing Systems 

Knowledge distillation (KD) has emerged as an effective model-compression technique for enhancing lightweight models. Conventional KD methods propose various designs that allow the student model to imitate the teacher more closely.
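The conventional KD objective the abstract alludes to is commonly the temperature-softened soft-target loss of Hinton et al. (2015); the sketch below illustrates that baseline, not KD-Zero's own evolved distiller. Function names and the temperature value are illustrative assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax over a list of logits.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits, temperature=4.0):
    # KL(teacher || student) on softened distributions, scaled by T^2
    # as in the classic soft-target KD formulation (Hinton et al., 2015).
    p = softmax(teacher_logits, temperature)  # soft teacher targets
    q = softmax(student_logits, temperature)  # soft student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return (temperature ** 2) * kl

# A student matching the teacher exactly incurs zero distillation loss;
# any mismatch yields a positive penalty.
print(kd_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1]))  # 0.0
print(kd_loss([0.5, 1.5, 0.0], [2.0, 1.0, 0.1]) > 0)  # True
```

In practice this soft-target term is mixed with the ordinary cross-entropy on ground-truth labels; KD-Zero's contribution is to search for such loss designs automatically rather than hand-crafting them.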
