CogDevelop2K: Reversed Cognitive Development in Multimodal Large Language Models

Yijiang Li, Qingying Gao, Haoran Sun, Haiyun Lyu, Dezhi Luo, Hokin Deng

arXiv.org, Artificial Intelligence

Are Multi-modal Large Language Models (MLLMs) stochastic parrots, or do they genuinely understand? This paper investigates whether MLLMs possess the core cognitive abilities on which human intelligence builds to perceive, comprehend, and reason. To this end, we propose CogDevelop2K, a comprehensive benchmark spanning 12 sub-concepts, from primitive knowledge such as object permanence and object boundary to more complex abilities such as intentionality understanding, structured along the developmental trajectory of the human mind. We evaluate 46 MLLMs on our benchmark. Surprisingly, we observe a cognitive developmental trajectory that is reversed relative to humans. We further evaluate the influence of evaluation strategies and prompting techniques. The project website is available at this $\href{https://growing-ai-like-a-child.github.io/}{link}$.