Sentence-Level or Token-Level? A Comprehensive Study on Knowledge Distillation
