Contrasting Cognitive Styles in Vision-Language Models: Holistic Attention in Japanese Versus Analytical Focus in English

Sabir, Ahmed; Gasper, Azinovič; Loem, Mengsay; Sharma, Rajesh

Jul-2-2025–arXiv.org Artificial Intelligence

Cross-cultural research in perception and cognition has shown that individuals from different cultural backgrounds process visual information in distinct ways. East Asians, for example, tend to adopt a holistic perspective, attending to contextual relationships, whereas Westerners often employ an analytical approach, focusing on individual objects and their attributes. In this study, we investigate whether Vision-Language Models (VLMs) trained predominantly on different languages, specifically Japanese and English, exhibit similar culturally grounded attentional patterns. Using comparative analysis of image descriptions, we examine whether these models reflect differences in holistic versus analytic tendencies. Our findings suggest that VLMs not only internalize the structural properties of language but also reproduce cultural behaviors embedded in the training data, indicating that cultural cognition may implicitly shape model outputs.

large language model, machine learning, natural language

Country:
- North America > United States
  - New Mexico > Bernalillo County > Albuquerque (0.04)
- Europe
  - Germany > Berlin (0.04)
  - Slovenia > Central Slovenia
    - Municipality of Ljubljana > Ljubljana (0.04)
  - Estonia > Tartu County
    - Tartu (0.04)
- Asia
  - Japan (0.04)
  - India (0.04)

Genre:
- Research Report > New Finding (1.00)

Industry:
- Transportation > Ground (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Vision (1.00)
  - Natural Language > Large Language Model (0.61)
  - Machine Learning > Neural Networks (0.42)
