Why Report Failed Interactions With Robots?! Towards Vignette-based Interaction Quality

Axelsson, Agnes, Reimann, Merle, Cumbal, Ronald, Pelikan, Hannah, Lala, Divesh

Sep-9-2025–arXiv.org Artificial Intelligence

Abstract: Although the quality of human-robot interactions has improved with the advent of LLMs, there are still various factors that cause systems to be sub-optimal when compared to human-human interactions. The nature and criticality of failures are often dependent on the context of the interaction and so cannot be generalized across the wide range of scenarios and experiments which have been implemented in HRI research. In this work we propose the use of a technique overlooked in the field of HRI, ethnographic vignettes, to clearly highlight these failures, particularly those that are rarely documented. We describe the methodology behind the process of writing vignettes and create our own based on our personal experiences with failures in HRI systems. We emphasize the strength of vignettes as the ability to communicate failures from a multi-disciplinary perspective, promote transparency about the capabilities of robots, and document unexpected behaviours which would otherwise be omitted from research reports. We encourage the use of vignettes to augment existing interaction evaluation methods.

High-quality dialogue with robots is a goal for many human-robot interaction (HRI) researchers [38]. Despite technological advancements, dialogues in HRI sometimes fail. In this paper, we propose vignette-writing as a method for reporting observations from failed interactions.

The abilities of large language models (LLMs) to simulate human language have sparked an increased interest and optimism towards generating meaningful dialogues, despite their well-known shortcomings [6, 9, 24]. However, there is still much ground to cover towards flawless spoken interactions with robots [45]. One of the challenges that need to be addressed in order to move towards this goal lies in defining, describing and evaluating concrete interactions.

In this paper, we propose that describing moments of failure in dialogues through ethnographic methods is one path to understanding, evaluating and defining human-robot interactions.

artificial intelligence, large language model, natural language

Country:
- Europe > United Kingdom
  - England (0.46)
- North America > United States (0.70)

Genre:
- Research Report
  - Experimental Study (0.87)
  - New Finding (0.87)

Industry:
- Consumer Products & Services > Food, Beverage, Tobacco & Cannabis
  - Beverages (0.46)
- Education (0.67)
- Health & Medicine (0.68)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (0.76)
  - Representation & Reasoning > Agents (1.00)
  - Robots > Humanoid Robots (0.78)
