r/MachineLearning - [P] Nearing BERT's accuracy on Sentiment Analysis with a model 56 times smaller by Knowledge Distillation

Open in new window