Robot Failure


Training Models to Detect Successive Robot Errors from Human Reactions

Liu, Shannon, Parreira, Maria Teresa, Ju, Wendy

arXiv.org Artificial Intelligence

As robots become more integrated into society, detecting robot errors is essential for effective human-robot interaction (HRI). When a robot fails repeatedly, how can it know when to change its behavior? Humans naturally respond to robot errors through verbal and nonverbal cues that intensify over successive failures, from confusion and subtle speech changes to visible frustration and impatience. While prior work shows that human reactions can indicate robot failures, few studies examine how these evolving responses reveal successive failures. This research uses machine learning to recognize stages of robot failure from human reactions. In a study with 26 participants interacting with a robot that made repeated conversational errors, behavioral features were extracted from video data to train models for individual users. The best model achieved 93.5% accuracy for detecting errors and 84.1% for classifying successive failures. Modeling the progression of human reactions enhances error detection and understanding of repeated interaction breakdowns in HRI.


Expectations, Explanations, and Embodiment: Attempts at Robot Failure Recovery

Yadollahi, Elmira, Dogan, Fethiye Irmak, Zhang, Yujing, Nogueira, Beatriz, Guerreiro, Tiago, Tzedek, Shelly Levy, Leite, Iolanda

arXiv.org Artificial Intelligence

Expectations critically shape how people form judgments about robots, influencing whether they view failures as minor technical glitches or deal-breaking flaws. This work explores how high and low expectations, induced through brief video priming, affect user perceptions of robot failures and the utility of explanations in HRI. We conducted two online studies (N = 600 total participants), each replicated with two robots of different embodiments, Furhat and Pepper. In our first study, grounded in expectation theory, participants were divided into two groups, one primed with positive and the other with negative expectations regarding the robot's performance, establishing distinct expectation frameworks. This validation study aimed to verify whether the videos could reliably establish low- and high-expectation profiles. In the second study, participants were primed using the validated videos and then viewed a new scenario in which the robot failed at a task. Half viewed a version where the robot explained its failure, while the other half received no explanation. We found that explanations significantly improved user perceptions of Furhat, especially when participants were primed to have lower expectations. Explanations boosted satisfaction and enhanced the robot's perceived expressiveness, indicating that effectively communicating about a failure shapes how it is received. (Authors contributed equally.) By contrast, Pepper's explanations produced minimal impact on user attitudes, suggesting that a robot's embodiment and style of interaction could determine whether explanations can successfully offset negative impressions. Together, these findings underscore the need to consider users' expectations when tailoring explanation strategies in HRI. When expectations are initially low, a cogent explanation can make the difference between dismissing a failure and appreciating the robot's transparency and effort to communicate. Keywords: Expectations, Explanations, Explainability, Human-Robot Interaction, Priming

1. Introduction

When robots operate in human environments, user expectations play a crucial role in shaping human-robot interaction (HRI) (Lohse, 2009; Horstmann and Krämer, 2020; Dogan et al., 2025). However, there is often a mismatch between these expectations and the actual capabilities of social robots (Rosén et al., 2022), which can lead to disappointment and, consequently, diminish the quality of interactions (Olson et al., 1996; Kruglanski and Sleeth-Keppler, 2007). For instance, a user might expect robots to function as proactive and autonomous assistants, yet when robots make mistakes due to their limited abilities, this mismatch can undermine the robot's perceived trustworthiness and competence (Salem et al., 2015; Cha et al., 2015).


Real-Time Detection of Robot Failures Using Gaze Dynamics in Collaborative Tasks

Tabatabaei, Ramtin, Kostakos, Vassilis, Johal, Wafa

arXiv.org Artificial Intelligence

Detecting robot failures during collaborative tasks is crucial for maintaining trust in human-robot interactions. This study investigates user gaze behaviour as an indicator of robot failures, utilising machine learning models to distinguish between non-failure and two types of failures: executional and decisional. Eye-tracking data were collected from 26 participants collaborating with a robot on Tangram puzzle-solving tasks. Gaze metrics, such as average gaze shift rates and the probability of gazing at specific areas of interest, were used to train machine learning classifiers, including Random Forest, AdaBoost, XGBoost, SVM, and CatBoost. The results show that Random Forest achieved 90% accuracy for detecting executional failures and 80% for decisional failures using the first 5 seconds of failure data. Real-time failure detection was evaluated by segmenting gaze data into intervals of 3, 5, and 10 seconds. These findings highlight the potential of gaze dynamics for real-time error detection in human-robot collaboration.
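
The pipeline described above can be sketched end to end: extract window-level gaze features, fit a classifier, and predict a failure class per window. The study used Random Forest and related ensembles; the nearest-centroid rule below is a deliberately dependency-free stand-in, and the feature names and values are illustrative rather than taken from the dataset.

```python
# Classify fixed-length windows of gaze features as non-failure /
# executional / decisional. Nearest-centroid stands in for the paper's
# Random Forest; features per window are [gaze_shift_rate, p_gaze_on_robot]
# (hypothetical values, not from the study's data).

def centroid(rows):
    """Mean feature vector of a list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def fit(train):
    """train maps label -> list of feature vectors; returns label -> centroid."""
    return {label: centroid(rows) for label, rows in train.items()}

def predict(model, x):
    """Label whose centroid is nearest to x (squared Euclidean distance)."""
    return min(model, key=lambda label: sum((a - b) ** 2
                                            for a, b in zip(x, model[label])))

train = {
    "non_failure": [[0.2, 0.3], [0.3, 0.25]],
    "executional": [[0.9, 0.8], [0.8, 0.85]],  # many shifts, focus on robot
    "decisional":  [[0.3, 0.7], [0.35, 0.75]],
}
model = fit(train)
```

A real replication would swap the centroid rule for a trained Random Forest and compute the gaze metrics from eye-tracking streams segmented into 3-, 5-, or 10-second windows, as in the study.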


Gazing at Failure: Investigating Human Gaze in Response to Robot Failure in Collaborative Tasks

Tabatabaei, Ramtin, Kostakos, Vassilis, Johal, Wafa

arXiv.org Artificial Intelligence

Robots are prone to making errors, which can negatively impact their credibility as teammates during collaborative tasks with human users. Detecting and recovering from these failures is crucial for maintaining an effective level of trust from users. However, robots may fail without being aware of it. One way to detect such failures could be by analysing humans' non-verbal behaviours and reactions to failures. This study investigates how human gaze dynamics can signal a robot's failure and examines how different types of failures affect people's perception of the robot. We conducted a user study with 27 participants collaborating with a robotic mobile manipulator to solve tangram puzzles. The robot was programmed to experience two types of failures -- executional and decisional -- occurring either at the beginning or end of the task, with or without acknowledgement of the failure. Our findings reveal that the type and timing of the robot's failure significantly affect participants' gaze behaviour and perception of the robot. Specifically, executional failures led to more gaze shifts and increased focus on the robot, while decisional failures resulted in lower entropy in gaze transitions among areas of interest, particularly when the failure occurred at the end of the task. These results highlight that gaze can serve as a reliable indicator of robot failures and their types, and could also be used to predict the appropriate recovery actions.
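
The entropy of gaze transitions mentioned above has a direct computation: count transitions between distinct areas of interest (AOIs) and take the Shannon entropy of their distribution. A minimal sketch, with AOI names and gaze sequences that are illustrative rather than from the study:

```python
from collections import Counter
from math import log2

def gaze_transition_entropy(gaze_sequence):
    """Shannon entropy (in bits) of the distribution of transitions
    between consecutive, distinct areas of interest (AOIs)."""
    transitions = [(a, b) for a, b in zip(gaze_sequence, gaze_sequence[1:])
                   if a != b]  # ignore fixations staying on the same AOI
    if not transitions:
        return 0.0
    counts = Counter(transitions)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())

# Hypothetical sequences: concentrated gaze vs. scattered gaze.
focused = ["robot", "task", "robot", "task", "robot", "task"]
scattered = ["robot", "task", "table", "robot", "table", "task", "robot"]
```

Lower entropy (as observed after decisional failures) means gaze cycles among fewer AOI pairs; higher entropy reflects more varied scanning.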


REFLEX Dataset: A Multimodal Dataset of Human Reactions to Robot Failures and Explanations

Khanna, Parag, Naoum, Andreas, Yadollahi, Elmira, Björkman, Mårten, Smith, Christian

arXiv.org Artificial Intelligence

This work presents REFLEX: Robotic Explanations to FaiLures and Human EXpressions, a comprehensive multimodal dataset capturing human reactions to robot failures and subsequent explanations in collaborative settings. It aims to facilitate research into human-robot interaction dynamics, addressing the need to study reactions to both initial failures and explanations, as well as the evolution of these reactions in long-term interactions. By providing rich, annotated data on human responses to different types of failures, explanation levels, and varying explanation strategies, the dataset contributes to the development of more robust, adaptive, and satisfying robotic systems capable of maintaining positive relationships with human collaborators, even during challenges like repeated failures.

INTRODUCTION

As robots become increasingly integrated into our everyday lives, from homes and workplaces to public spaces, the need to understand and improve human-robot interaction (HRI) has never been more critical. Despite significant advancements in robotics, robots are still prone to failures, ranging from minor glitches to serious malfunctions.


Multimodal Coherent Explanation Generation of Robot Failures

Pramanick, Pradip, Rossi, Silvia

arXiv.org Artificial Intelligence

The explainability of a robot's actions is crucial to its acceptance in social spaces. Explaining why a robot fails to complete a given task is particularly important for non-expert users to be aware of the robot's capabilities and limitations. So far, research on explaining robot failures has only considered generating textual explanations, even though several studies have shown the benefits of multimodal ones. However, a simple combination of multiple modalities may lead to semantic incoherence between the information across different modalities, a problem that is not well studied. An incoherent multimodal explanation can be difficult to understand, and it may even become inconsistent with what the robot and the human observe and how they perform reasoning with the observations. Such inconsistencies may lead to wrong conclusions about the robot's capabilities. In this paper, we introduce an approach to generate coherent multimodal explanations by checking the logical coherence of explanations from different modalities, followed by refinements as required. We propose a classification approach for coherence assessment, where we evaluate if an explanation logically follows another. Our experiments suggest that fine-tuning a neural network that was pre-trained to recognize textual entailment performs well for coherence assessment of multimodal explanations. Code & data: https://pradippramanick.github.io/coherent-explain/.
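
The coherence check described above can be pictured as an entailment test applied in both directions between modalities, with a refinement step when the test fails. The authors fine-tune a pre-trained textual-entailment model for this; `entails` below is a hypothetical keyword-based stand-in (with an invented contradiction list) used only to show where such a model plugs in.

```python
# Toy coherence assessment between a textual explanation and a visual one
# (rendered as text). `entails` is a crude stand-in for a fine-tuned
# textual-entailment classifier; the contradiction pairs are invented.

CONTRADICTORY_PAIRS = {("reachable", "unreachable"), ("visible", "occluded")}

def entails(premise, hypothesis):
    """Treat the pair as coherent unless they use contradictory terms."""
    p_words = set(premise.lower().split())
    h_words = set(hypothesis.lower().split())
    return not any((a in p_words and b in h_words) or
                   (b in p_words and a in h_words)
                   for a, b in CONTRADICTORY_PAIRS)

def coherent_explanation(textual, visual):
    """Keep both modalities only if each entails the other; otherwise
    refine by falling back to the textual explanation alone."""
    if entails(textual, visual) and entails(visual, textual):
        return (textual, visual)
    return (textual,)
```

In the actual approach, the entailment judgment would come from the fine-tuned neural model, and the refinement could regenerate rather than drop a modality.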


ERR@HRI 2024 Challenge: Multimodal Detection of Errors and Failures in Human-Robot Interactions

Spitale, Micol, Parreira, Maria Teresa, Stiber, Maia, Axelsson, Minja, Kara, Neval, Kankariya, Garima, Huang, Chien-Ming, Jung, Malte, Ju, Wendy, Gunes, Hatice

arXiv.org Artificial Intelligence

Despite the recent advancements in robotics and machine learning (ML), the deployment of autonomous robots in our everyday lives is still an open challenge. This is due to multiple reasons, among which are their frequent mistakes, such as interrupting people or having delayed responses, as well as their limited ability to understand human speech, i.e., failure in tasks like transcribing speech to text. These mistakes may disrupt interactions and negatively influence human perception of these robots. To address this problem, robots need the ability to detect human-robot interaction (HRI) failures. The ERR@HRI 2024 challenge tackles this by offering a benchmark multimodal dataset of robot failures during human-robot interactions, encouraging researchers to develop and benchmark multimodal machine learning models to detect these failures. We created a dataset featuring multimodal non-verbal interaction data, including facial, speech, and pose features from video clips of interactions with a robotic coach, annotated with labels indicating the presence or absence of robot mistakes, user awkwardness, and interaction ruptures, allowing for the training and evaluation of predictive models. Challenge participants have been invited to submit their multimodal ML models for detection of robot errors, to be evaluated against various performance metrics such as accuracy, precision, recall, and F1 score, with and without a margin of error reflecting the time-sensitivity of these metrics. The results of this challenge will help the research field better understand robot failures in human-robot interactions and design autonomous robots that can mitigate their own errors after successfully detecting them.
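
The "margin of error" in the challenge metrics can be made concrete: a predicted error onset counts as a true positive if it lands within a tolerance window of an annotated onset. A minimal sketch with hypothetical frame indices (the challenge's exact matching rules may differ):

```python
# Time-tolerant detection scoring: match predicted error onsets to ground
# truth within +/- `margin` frames, then compute precision/recall/F1.

def match_detections(true_onsets, predicted_onsets, margin):
    """Greedy matching; returns (true_pos, false_pos, false_neg)."""
    unmatched = sorted(true_onsets)
    tp = fp = 0
    for p in sorted(predicted_onsets):
        hit = next((t for t in unmatched if abs(t - p) <= margin), None)
        if hit is not None:
            unmatched.remove(hit)
            tp += 1
        else:
            fp += 1
    return tp, fp, len(unmatched)

def precision_recall_f1(tp, fp, fn):
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

Widening the margin trades timing precision for credit on slightly early or late detections, which is why the challenge reports metrics both with and without it.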


Effects of Explanation Strategies to Resolve Failures in Human-Robot Collaboration

Khanna, Parag, Yadollahi, Elmira, Björkman, Mårten, Leite, Iolanda, Smith, Christian

arXiv.org Artificial Intelligence

Despite significant improvements in robot capabilities, they are likely to fail in human-robot collaborative tasks due to high unpredictability in human environments and varying human expectations. In this work, we explore the role of explanation of failures by a robot in a human-robot collaborative task. We present a user study incorporating common failures in collaborative tasks, with human assistance to resolve the failure. In the study, a robot and a human work together to fill a shelf with objects. Upon encountering a failure, the robot explains the failure and the resolution to overcome it, either through handovers or humans completing the task. The study is conducted using different levels of robotic explanation, based on the failure action, failure cause, and action history, and different strategies for providing the explanation over the course of repeated interaction. Our results show that success in resolving the failures is not only a function of the level of explanation but also of the type of failure. Furthermore, while novice users rate the robot higher overall in terms of their satisfaction with the explanation, their satisfaction is not only a function of the robot's explanation level at a certain round but also of the prior information they received from the robot.


Utilising Explanations to Mitigate Robot Conversational Failures

Kontogiorgos, Dimosthenis

arXiv.org Artificial Intelligence

This paper presents an overview of robot failure detection work from HRI and adjacent fields, using failures as an opportunity to examine robot explanation behaviours. As humanoid robots remain experimental tools in the early 2020s, interactions with robots are situated overwhelmingly in controlled environments, typically studying various interactional phenomena. Such interactions suffer from a lack of real-world and large-scale experimentation and tend to ignore the 'imperfectness' of the everyday user. Robot explanations can be used to approach and mitigate failures, by expressing robot legibility and incapability, and within the perspective of common ground. In this paper, I discuss how failures present opportunities for explanations in interactive conversational robots and what the potentials are for the intersection of HRI and explainability research.


Scratch Team of Single-Rotor Robots and Decentralized Cooperative Transportation with Robot Failure

Oishi, Koshi, Amano, Yasushi, Jimbo, Tomohiko

arXiv.org Artificial Intelligence

Achieving cooperative transportation by teams of aerial robots has been attracting attention owing to its flexibility with respect to payloads and robustness against failures. In this paper, we propose a decentralized controller that is flexible in the number of robots and the shapes of payloads in a cooperative transport task using multiple single-rotor robots. Our controller is robust to mass and center-of-mass fluctuations and robot failures. Moreover, asymptotic stability against dynamics errors is guaranteed. Additionally, the controller supports heterogeneous single-rotor robots. Thus, robots with different specifications and levels of deterioration can be effectively utilized for cooperative transportation; in particular, this performance is effective for robot reuse. To achieve the aforementioned performance, the controller consists of a parallel structure comprising two controllers: a feedback controller, which renders the system strictly positive real, and a nonlinear controller, which drives the object asymptotically to the target. First, we confirm cooperative transportation using 8 and 10 robots for two payload shapes via numerical simulation. Subsequently, the cooperative transportation of a rectangular payload (with a weight of approximately 3 kg and a maximum length of 1.6 m) is demonstrated using a robot team consisting of three types of robots, even under robot failure and center-of-mass fluctuation.