Leading AI models fail new test of artificial general intelligence

New Scientist 

The most sophisticated AI models in existence today have scored poorly on a new benchmark designed to measure their progress towards artificial general intelligence (AGI) – and brute-force computing power won't be enough to improve their scores, as evaluators now take into account the cost of running the models. There are many competing definitions of AGI, but it is generally taken to refer to an AI that can perform any cognitive task a human can. To measure this, the ARC Prize Foundation previously launched a test of reasoning abilities called ARC-AGI-1. Last December, OpenAI announced that its o3 model had scored highly on the test, leading some to ask whether the company was close to achieving AGI. But now a new test, ARC-AGI-2, has raised the bar.