This New BERT Is Way Faster & Smaller Than The Original

Nov-3-2020, 07:15:32 GMT–#artificialintelligence

Researchers at Amazon recently extracted an optimal subset of architectural parameters for the popular BERT architecture by applying neural architecture search. This smaller version of BERT, known as BORT, can be pre-trained in 288 GPU hours, which is 1.2% of the time required to pre-train RoBERTa-large, the highest-performing BERT parametric architectural variant. Since its introduction, BERT has achieved groundbreaking results on several tasks in natural language processing (NLP) and natural language understanding (NLU), and it has had a resounding impact on language modelling as well. However, BERT's usability has repeatedly been questioned because of its large size, slow inference, and complex pre-training process, among other concerns.
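To put the pre-training figures in perspective, the short sketch below works backwards from the two numbers quoted above (288 GPU hours and 1.2%). The function name and the derived values are illustrative assumptions; the article itself does not report a GPU-hour figure for RoBERTa-large, so the result is only what the quoted percentage implies.

```python
def implied_baseline_hours(small_model_hours: float, fraction_of_baseline: float) -> float:
    """Return the baseline cost implied by a smaller model's cost and its share of the baseline.

    Hypothetical helper for illustration; not taken from the BORT paper or any library.
    """
    if not 0 < fraction_of_baseline <= 1:
        raise ValueError("fraction_of_baseline must be in (0, 1]")
    return small_model_hours / fraction_of_baseline


# Figures quoted in the article.
bort_gpu_hours = 288.0
bort_fraction = 0.012  # 1.2% of RoBERTa-large's pre-training time

# Derived values (implied by the article's numbers, not reported by it).
roberta_large_hours = implied_baseline_hours(bort_gpu_hours, bort_fraction)
speedup = 1.0 / bort_fraction

print(f"Implied RoBERTa-large pre-training cost: about {roberta_large_hours:,.0f} GPU hours")
print(f"Implied pre-training speedup: about {speedup:.0f}x")
```

Under these assumptions the quoted percentage implies a baseline of roughly 24,000 GPU hours, so BORT's pre-training is on the order of 80 times cheaper.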
