OlympicArena Medal Ranks: Who Is the Most Intelligent AI So Far?

Huang, Zhen; Wang, Zengzhi; Xia, Shijie; Liu, Pengfei

Jun-26-2024, arXiv.org Artificial Intelligence

In this report, we pose the following question: Who is the most intelligent AI model to date, as measured by OlympicArena (an Olympic-level, multi-discipline, multi-modal benchmark for superintelligent AI)? We focus on the most recently released models: Claude-3.5-Sonnet, Gemini-1.5-Pro, and GPT-4o. For the first time, we propose using an Olympic medal table to rank AI models by their comprehensive performance across disciplines. Empirical results reveal: (1) Claude-3.5-Sonnet is highly competitive with GPT-4o overall, even surpassing it on a few subjects (namely Physics, Chemistry, and Biology). (2) Gemini-1.5-Pro and GPT-4V rank consecutively just behind GPT-4o and Claude-3.5-Sonnet, but a clear performance gap separates the two pairs. (3) The performance of AI models from the open-source community lags significantly behind these proprietary models. (4) The performance of all these models on the benchmark is less than satisfactory, indicating that we still have a long way to go before achieving superintelligence. We remain committed to continuously tracking and evaluating the performance of the latest powerful models on this benchmark (available at https://github.com/GAIR-NLP/OlympicArena).
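
To make the medal-table idea concrete, here is a minimal sketch of how such a ranking could be computed. It assumes the familiar Olympic convention: in each discipline the top three models receive gold, silver, and bronze, and models are then ordered by gold count, then silver, then bronze. The abstract does not spell out the exact tie-breaking rule, so the alphabetical tie-break below is an assumption, as are the function name `medal_table`, the model names, and the scores, which are purely illustrative and not results from the report.

```python
def medal_table(scores):
    """Build a medal table from per-discipline scores.

    scores: {discipline: {model: score}}, higher score is better.
    Returns a list of (model, gold, silver, bronze) tuples, sorted by
    gold, then silver, then bronze (descending), then model name.
    """
    medals = {}
    for per_model in scores.values():
        # Make sure every model appears in the table, even with no medals.
        for model in per_model:
            medals.setdefault(model, [0, 0, 0])
        # Rank models in this discipline from best to worst score.
        # Equal scores keep their input order (sorted() is stable).
        ranked = sorted(per_model, key=per_model.get, reverse=True)
        # Places 0, 1, 2 correspond to gold, silver, bronze.
        for place, model in enumerate(ranked[:3]):
            medals[model][place] += 1
    return sorted(
        ((model, *counts) for model, counts in medals.items()),
        key=lambda row: (-row[1], -row[2], -row[3], row[0]),
    )


# Hypothetical scores for illustration only.
example_scores = {
    "math": {"model_a": 0.40, "model_b": 0.35, "model_c": 0.30, "model_d": 0.10},
    "physics": {"model_a": 0.30, "model_b": 0.45, "model_c": 0.20, "model_d": 0.25},
    "chemistry": {"model_a": 0.50, "model_b": 0.40, "model_c": 0.45, "model_d": 0.15},
}

for model, gold, silver, bronze in medal_table(example_scores):
    print(f"{model}: gold={gold} silver={silver} bronze={bronze}")
```

Ranking by medals rather than by a single averaged score rewards being among the best in individual disciplines, so a model that leads in a few subjects can place above one that is merely consistent across all of them.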

Country:
- South America > Chile
  - Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Europe > Middle East
  - Cyprus > Nicosia > Nicosia (0.04)
- Asia
  - Middle East > Jordan (0.04)
  - Myanmar > Tanintharyi Region
    - Dawei (0.04)
  - Japan > Honshū
    - Tōhoku > Iwate Prefecture
      - Morioka (0.04)
    - Chūbu > Toyama Prefecture
      - Toyama (0.04)
  - China > Shanghai
    - Shanghai (0.04)

Genre:
- Research Report (0.50)

Industry:
- Leisure & Entertainment > Sports > Olympic Games (0.88)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)