DeepMind says its new language model can beat others 25 times its size
Called RETRO (for "Retrieval-Enhanced Transformer"), the AI matches the performance of neural networks 25 times its size, cutting the time and cost needed to train very large models. The researchers also claim that the database the model retrieves text from makes it easier to analyze what the AI has learned, which could help with filtering out bias and toxic language. "Being able to look things up on the fly instead of having to memorize everything can often be useful, in the same way as it is for humans," says Jack Rae at DeepMind, who leads the firm's research in large language models. Language models generate text by predicting what words come next in a sentence or conversation. The larger a model, the more information about the world it can learn during training, which makes its predictions better.
Dec-8-2021, 18:22:09 GMT