Sparse Distillation: Speeding Up Text Classification by Using Bigger Student Models
