A version of the BERT language model that's 20 times as fast

Dec-4-2020, 08:05:55 GMT–#artificialintelligence

In natural-language understanding (NLU), the Transformer-based BERT language model is king. Its high performance on multiple tasks has strongly influenced contemporary NLU research. On the other hand, it is a relatively big and slow model, which makes it unsuitable for some applications. Multiple efforts have been made to compress the BERT architecture, but the choice of architectural parameters (the number of layers, the number of processing nodes per layer, and so on) has been somewhat arbitrary, and the resulting models are rarely much better than the original at optimizing the balance between the model's size, speed, and error rate. A few weeks ago, we released part of the code for Bort, a highly optimized language model (LM) extracted from the BERT architecture through a combination of two rigorous algorithmic techniques especially designed for neural-network compression.

algorithm, architectural parameter, bert, (15 more...)

#artificialintelligence

Dec-4-2020, 08:05:55 GMT

News Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language (1.00)
  - Machine Learning > Neural Networks (0.70)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found