Coreference as an indicator of context scope in multimodal narrative
Ilinykh, Nikolai; Lappin, Shalom; Sayeed, Asad; Loáiciga, Sharid
arXiv.org Artificial Intelligence
We demonstrate that large multimodal language models differ substantially from humans in the distribution of coreferential expressions in a visual storytelling task. We introduce a number of metrics to quantify the characteristics of coreferential patterns in both human- and machine-written texts. Humans distribute coreferential expressions in a way that maintains consistency across texts and images, interleaving references to different entities in a highly varied way. Machines are less able to track mixed references, despite achieving perceived improvements in generation quality.
Mar-7-2025
- Country:
- North America
- United States
- Pennsylvania (0.04)
- Massachusetts > Middlesex County
- Cambridge (0.04)
- Florida > Miami-Dade County
- Miami (0.04)
- California > San Diego County
- San Diego (0.04)
- Canada > Ontario
- Toronto (0.04)
- Europe
- Sweden > Vaestra Goetaland
- Gothenburg (0.04)
- Spain > Catalonia
- Barcelona Province > Barcelona (0.04)
- Netherlands > North Holland
- Amsterdam (0.04)
- Ireland > Leinster
- County Dublin > Dublin (0.04)
- Belgium > Brussels-Capital Region
- Brussels (0.04)
- Asia
- North Korea > Hwanghae-namdo
- Haeju (0.04)
- Middle East > UAE
- Abu Dhabi Emirate > Abu Dhabi (0.14)
- Genre:
- Research Report > Experimental Study (0.46)
- Industry:
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.70)
- Media (0.46)
- Health & Medicine (0.46)
- Technology: