AI Models Are Getting Smarter. New Tests Are Racing to Catch Up

Dec-24-2024, 15:05:49 GMT – TIME - Tech

Despite their expertise, AI developers don't always know what their most advanced systems are capable of, at least not at first. To find out, systems are subjected to a range of tests, often called evaluations or 'evals', designed to tease out their limits. But due to rapid progress in the field, today's systems regularly achieve top scores on many popular tests, including the SAT and the U.S. bar exam, making it harder to judge just how quickly they are improving. A new set of much more challenging evals has emerged in response, created by companies, nonprofits, and governments. Yet even on the most advanced evals, AI systems are making astonishing progress.

ai system, benchmark, eval

News, Technology Web Page

Country:
- North America > United States (0.14)

Industry:
- Government (0.71)
- Education (0.69)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (0.72)
  - Machine Learning > Neural Networks
    - Deep Learning (0.50)