Probing Mechanical Reasoning in Large Vision Language Models
Sun, Haoran, Gao, Qingying, Lyu, Haiyun, Luo, Dezhi, Deng, Hokin, Li, Yijiang
arXiv.org Artificial Intelligence
Mechanical reasoning is a fundamental ability that sets human intelligence apart from that of other animals. It allows us to design tools, build bridges and canals, and construct houses, which laid the foundation of human civilization. Endowing machines with this ability is an important step towards building human-level artificial intelligence. Recently, Li et al. built CogDevelop2K, a data-intensive cognitive experiment benchmark for assaying the developmental trajectory of machine intelligence (Li et al., 2024). Here, to investigate mechanical reasoning in Large Vision Language Models (VLMs), we use MechBench, the component of CogDevelop2K that contains approximately 150 cognitive experiments, to test whether these models understand mechanical system stability, gear and pulley systems, seesaw-like systems and the principle of leverage, inertia and motion, and fluid-related systems. We observe diverse yet consistent behaviors across these aspects in VLMs.
Sep-30-2024
- Country:
  - North America > United States
    - North Carolina (0.04)
    - Michigan (0.04)
    - Massachusetts (0.04)
    - Pennsylvania > Allegheny County
      - Pittsburgh (0.04)
    - California > San Diego County
      - San Diego (0.04)
  - Europe > United Kingdom
    - England > Oxfordshire > Oxford (0.04)
- Genre:
  - Research Report (0.64)