Collaborating Authors


Language Models

Communications of the ACM

A transformer has strong language representation ability; a very large corpus contains rich language expressions (such unlabeled data can be easily obtained) and training large-scale deep learning models has become more efficient. Therefore, pre-trained language models can effectively represent a language's lexical, syntactic, and semantic features. Pre-trained language models, such as BERT and GPTs (GPT-1, GPT-2, and GPT-3), have become the core technologies of current NLP. Pre-trained language model applications have brought great success to NLP. "Fine-tuned" BERT has outperformed humans in terms of accuracy in language-understanding tasks, such as reading comprehension.8,17 "Fine-tuned" GPT-3 has also reached an astonishing level of fluency in text-generation tasks.3