Assessing Consciousness-Related Behaviors in Large Language Models Using the Maze Test
Pimenta, Rui A., Schlippe, Tim, Schaaff, Kristina
arXiv.org Artificial Intelligence
We investigate consciousness-like behaviors in Large Language Models (LLMs) using the Maze Test, challenging models to navigate mazes from a first-person perspective. After synthesizing consciousness theories into 13 essential characteristics, we evaluated 12 leading LLMs across zero-shot, one-shot, and few-shot learning scenarios. Results showed reasoning-capable LLMs consistently outperforming standard versions, with Gemini 2.0 Pro achieving 52.9% Complete Path Accuracy and DeepSeek-R1 reaching 80.5% Partial Path Accuracy. The gap between these metrics indicates that LLMs struggle to maintain coherent self-models throughout solutions, a fundamental aspect of consciousness. While LLMs show progress in consciousness-related behaviors through reasoning mechanisms, they lack the integrated, persistent self-awareness characteristic of consciousness.

The emergence of human-like capabilities in AI has been debated since the field's inception in the 1950s [1], [2]. An early case was ELIZA [3], a chatbot simulating a therapist. Though based on pattern matching, its responses were so convincing that Weizenbaum's secretary requested privacy for a "real conversation", showing how humans can mistakenly perceive consciousness in even the simplest AI systems.
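The abstract contrasts Complete Path Accuracy with Partial Path Accuracy but does not define them; one plausible reading is that the former scores a maze as solved only when the predicted move sequence matches exactly, while the latter grants credit for correct moves up to the first error. The sketch below illustrates that assumed interpretation; the metric names, move encoding, and formulas are assumptions for illustration, not the paper's definitions.

```python
# Hedged sketch of two path-accuracy metrics (assumed definitions,
# not taken from the paper).

def complete_path_accuracy(predicted, reference):
    """Fraction of mazes whose predicted move sequence matches exactly."""
    exact = sum(p == r for p, r in zip(predicted, reference))
    return exact / len(reference)

def partial_path_accuracy(predicted, reference):
    """Mean fraction of correct moves before the first wrong step."""
    scores = []
    for pred, ref in zip(predicted, reference):
        correct = 0
        for p, r in zip(pred, ref):
            if p != r:
                break
            correct += 1
        scores.append(correct / len(ref))
    return sum(scores) / len(scores)

# Toy example: moves encoded as U/D/L/R strings, one string per maze.
reference = ["RRUU", "ULLD", "DRRU"]
predicted = ["RRUU", "ULRD", "DRUU"]

print(complete_path_accuracy(predicted, reference))  # 1/3: one maze solved exactly
print(partial_path_accuracy(predicted, reference))   # 2/3: partial credit per maze
```

Under this reading, partial accuracy can be high while complete accuracy stays low, which matches the gap the abstract highlights: a model can take many locally correct steps yet fail to sustain a coherent path to the goal.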
Aug-26-2025