SceneGram: Conceptualizing and Describing Tangrams in Scene Context

Jun-16-2025–arXiv.org Artificial Intelligence

Research on reference and naming suggests that humans can come up with very different ways of conceptualizing and referring to the same object; for example, the same abstract tangram shape can be a "crab", "sink", or "space ship". Another common assumption in cognitive science is that scene context fundamentally shapes our visual perception of objects and our conceptual expectations. This paper contributes SceneGram, a dataset of human references to tangram shapes placed in different scene contexts, allowing for systematic analyses of the effect of scene context on conceptualization. Based on this data, we analyze references to tangram shapes generated by multimodal LLMs, showing that these models do not account for the richness and variability of conceptualizations found in human references.

large language model, machine learning, natural language

Country:
- Asia
  - China > Beijing
    - Beijing (0.04)
  - Japan
    - Honshū > Kantō
      - Tokyo Metropolis Prefecture > Tokyo (0.14)
    - Kyūshū & Okinawa > Kyūshū
      - Miyazaki Prefecture > Miyazaki (0.04)
  - Middle East > UAE
    - Abu Dhabi Emirate > Abu Dhabi (0.04)
  - Thailand > Bangkok
    - Bangkok (0.04)
- Europe
  - France > Provence-Alpes-Côte d'Azur
    - Bouches-du-Rhône > Marseille (0.04)
  - Germany (0.04)
  - Ireland > Leinster
    - County Dublin > Dublin (0.04)
  - Spain > Catalonia
    - Barcelona Province > Barcelona (0.04)
- North America
  - Canada (0.04)
  - Dominican Republic (0.04)
  - United States
    - Florida > Miami-Dade County
      - Miami (0.04)
    - Massachusetts > Middlesex County
      - Cambridge (0.04)
- Oceania > Australia
  - New South Wales > Sydney (0.04)

Genre:
- Research Report > New Finding (0.93)

Industry:
- Leisure & Entertainment (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Cognitive Science (1.00)
  - Machine Learning (0.93)
  - Natural Language
    - Large Language Model (0.66)
    - Text Processing (1.00)
  - Representation & Reasoning (1.00)
  - Vision (1.00)
