Efficient Transformer Knowledge Distillation: A Performance Review
