Multi-head Knowledge Distillation for Model Compression
