DeepMind says its new language model can beat others 25 times its size

Dec-8-2021, 18:22:09 GMT–MIT Technology Review

Called RETRO (for "Retrieval-Enhanced Transformer"), the AI matches the performance of neural networks 25 times its size, cutting the time and cost needed to train very large models. The researchers also claim that the external text database RETRO draws on makes it easier to analyze what the AI has learned, which could help with filtering out bias and toxic language. "Being able to look things up on the fly instead of having to memorize everything can often be useful, in the same way as it is for humans," says Jack Rae at DeepMind, who leads the firm's research in large language models.

Language models generate text by predicting what words come next in a sentence or conversation. The larger a model, the more information about the world it can learn during training, which makes its predictions better.


