Distilling BERT -- How to achieve BERT performance using logistic regression
