Reflection-Bench: probing AI intelligence with reflection
Li, Lingyu, Wang, Yixu, Zhao, Haiquan, Kong, Shuqi, Teng, Yan, Li, Chunbo, Wang, Yingchun
arXiv.org Artificial Intelligence
The ability to adapt beliefs or behaviors in response to unexpected outcomes, reflection, is fundamental to intelligent systems' interaction with the world. From a cognitive science perspective, this serves as a core principle of intelligence applicable to both human and AI systems. To address the debate on the intelligence of large language models (LLMs), we propose Reflection-Bench, a comprehensive benchmark comprising 7 tasks spanning core cognitive functions crucial for reflection, including perception, memory, belief updating, decision-making, prediction, counterfactual thinking, and meta-reflection. We evaluate the performances of 13 prominent LLMs such as OpenAI o1, GPT-4, and Claude 3.5 Sonnet.
Figure 1: Reflection, a fundamental process of intelligence, integrates various cognitive components. To achieve desired outcomes, an intelligent agent must predict the external world states and behavioral consequences based on prior beliefs. Post-action, discrepancies between prediction and observation are perceived, prompting an update of prior belief.
Oct-21-2024
- Country:
- North America > United States (0.48)
- Genre:
- Research Report > New Finding (0.68)
- Industry:
- Health & Medicine > Therapeutic Area > Neurology (0.88)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
- Technology: