Smaller, faster, cheaper, lighter: Introducing DistilBERT, a distilled version of BERT
At Hugging Face, we experienced first-hand the growing popularity of these models as our NLP library, which encapsulates most of them, was installed more than 400,000 times in just a few months. However, as these models reached a larger NLP community, some important and challenging questions started to emerge. How should we put these monsters in production? How can we use such large models under low-latency constraints? Do we need (costly) GPU servers to serve at scale?
Aug-29-2019, 20:46:26 GMT