The Download: the mystery of LLMs, and the EU's Big Tech crackdown
Two years ago, Yuri Burda and Harri Edwards, researchers at OpenAI, were trying to find out what it would take to get a large language model to do basic arithmetic. The models memorized the sums they saw but failed to solve new ones. By accident, Burda and Edwards left some of their experiments running for days rather than hours. The models were shown the example sums over and over again, and eventually they learned to add two numbers; it had just taken far more time than anyone thought it should. In certain cases, models could seemingly fail to learn a task and then all of a sudden just get it, as if a lightbulb had switched on, a behavior the researchers called grokking.
March 4, 2024