Reflection-Bench: probing AI intelligence with reflection

Li, Lingyu, Wang, Yixu, Zhao, Haiquan, Kong, Shuqi, Teng, Yan, Li, Chunbo, Wang, Yingchun

Oct-21-2024–arXiv.org Artificial Intelligence

The ability to adapt beliefs or behaviors in response to unexpected outcomes, reflection, is fundamental to intelligent systems' interaction with the world. From a cognitive science perspective, this serves as a core principle of intelligence applicable to both human and AI systems. To address the debate on the intelligence of large language models (LLMs), we propose Reflection-Bench, a comprehensive benchmark comprising 7 tasks spanning core cognitive functions crucial for reflection, including perception, memory, belief updating, decision-making, prediction, counterfactual thinking, and meta-reflection. We evaluate the performances of 13 prominent LLMs such as OpenAI o1, GPT-4, Claude 3.5 Sonnet, ...

Figure 1: Reflection, a fundamental process of intelligence, integrates various cognitive components. To achieve desired outcomes, an intelligent agent must predict the external world states and behavioral consequences based on prior beliefs. Post-action, discrepancies between prediction and observation are perceived, prompting an update of prior belief.
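The loop described above (predict from a prior belief, perceive the discrepancy between prediction and observation, update the belief) can be illustrated with a toy sketch. This is not the benchmark's code or any of its 7 tasks; it assumes a single scalar belief and swaps in a simple delta-rule update, and the names `ReflectiveAgent`, `belief`, `learning_rate`, and `reflect` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ReflectiveAgent:
    """Toy agent holding one scalar belief about an expected outcome."""

    belief: float = 0.5         # prior belief (hypothetical starting value)
    learning_rate: float = 0.5  # how strongly a surprise shifts the belief

    def predict(self) -> float:
        # The prediction is read directly from the current belief.
        return self.belief

    def reflect(self, observation: float) -> float:
        # Discrepancy between what was predicted and what was observed.
        error = observation - self.predict()
        # Delta-rule update: move the belief a fraction of the way
        # toward the observation.
        self.belief += self.learning_rate * error
        return error


agent = ReflectiveAgent()
# The environment yields 1.0 three times, then switches to 0.0 unannounced.
errors = [agent.reflect(obs) for obs in [1.0, 1.0, 1.0, 0.0]]
```

In this sketch the prediction error shrinks while outcomes stay stable and jumps when the outcome unexpectedly changes, which is the kind of discrepancy that is supposed to trigger a belief update.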

large language model, machine learning, natural language, (20 more...)

Country:
- Asia > China
  - Shanghai > Shanghai (0.04)
- North America > United States
  - Iowa (0.05)
  - Wisconsin (0.05)

Genre:
- Research Report > New Finding (0.68)

Industry:
- Health & Medicine > Therapeutic Area
  - Neurology (0.88)
  - Psychiatry/Psychology (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
  - Natural Language > Large Language Model (1.00)
  - Representation & Reasoning (1.00)
