Characterizing Language Use in a Collaborative Situated Game
Tomlin, Nicholas; Zhou, Naitian; Fleisig, Eve; Chen, Liangyuan; Wright, Téa; Vinh, Lauren; Ma, Laura X.; Eisape, Seun; French, Ellie; Du, Tingting; Zhang, Tianjiao; Koller, Alexander; Suhr, Alane
arXiv.org Artificial Intelligence
Cooperative video games, where multiple participants must coordinate by communicating and reasoning under uncertainty in complex environments, yield a rich source of language data. We collect the Portal Dialogue Corpus: a corpus of 11.5 hours of spoken human dialogue in the co-op mode of the popular Portal 2 virtual puzzle game, comprising 24.5K total utterances. We analyze player language and behavior, identifying a number of linguistic phenomena that rarely appear in most existing chitchat or task-oriented dialogue corpora, including complex spatial reference, clarification and repair, and ad-hoc convention formation. To support future analyses of language use in complex, situated, collaborative problem-solving scenarios, we publicly release the corpus, which comprises player videos, audio, transcripts, game state data, and both manual and automatic annotations of language data.
Dec-8-2025
- Country:
- Asia > Middle East
- Jordan (0.04)
- Europe
- Germany > Saarland (0.04)
- United Kingdom > England
- Cambridgeshire > Cambridge (0.04)
- Oxfordshire > Oxford (0.04)
- North America > United States
- Illinois > Cook County
- Chicago (0.04)
- New Mexico > Bernalillo County
- Albuquerque (0.04)
- Genre:
- Research Report (1.00)
- Industry:
- Leisure & Entertainment > Games > Computer Games (1.00)
- Technology:
- Information Technology
- Artificial Intelligence
- Machine Learning (1.00)
- Natural Language > Chatbot (0.93)
- Representation & Reasoning (1.00)
- Communications (1.00)