Distilling BERT -- How to achieve BERT performance using logistic regression
BERT is awesome, and it's everywhere. It looks like almost any NLP task can benefit from using BERT. The BERT authors showed that this is indeed the case, and in my experience, it works like magic. It's easy to use, works with a small amount of data, and supports many different languages. It seems like there's no reason not to use it everywhere.
Jun-21-2019, 18:19:15 GMT