Probing Perceptual Constancy in Large Vision Language Models
Sun, Haoran; Yu, Suyang; Li, Yijiang; Gao, Qingying; Lyu, Haiyun; Deng, Hokin; Luo, Dezhi
arXiv.org Artificial Intelligence
Perceptual constancy is the ability to maintain stable perceptions of objects despite changes in sensory input, such as variations in distance, angle, or lighting. This ability is crucial for recognizing visual information in a dynamic world, making it essential for Vision-Language Models (VLMs). However, whether current VLMs have this ability, or could acquire it in principle, remains underexplored. In this study, we evaluated 33 VLMs using 253 experiments across three domains: color, size, and shape constancy. The experiments included single-image and video adaptations of classic cognitive tasks, along with novel tasks in in-the-wild conditions, to test the models' recognition of object properties under varying conditions. We found significant variability in VLM performance, with models' performance on shape constancy clearly dissociated from their performance on color and size constancy.
Feb-14-2025
- Country:
- North America > United States (1.00)
- Genre:
- Research Report > New Finding (1.00)
- Industry:
- Government > Military (0.46)
- Health & Medicine (0.46)
- Transportation (0.47)
- Technology: