Spitale, Micol
Concerns and Values in Human-Robot Interactions: A Focus on Social Robotics
Abbo, Giulio Antonio, Belpaeme, Tony, Spitale, Micol
Recent advancements in artificial intelligence and robotics are pushing robots from research laboratories into public spaces [60], healthcare facilities [51] and schools [11], as well as private homes. As robots start exhibiting nuanced, multifaceted behaviours and inhabit high-stakes environments where their actions have profound social and physical consequences, they increasingly become social agents [8, 2]. As such, they transcend their nature as complex tools and start to pose new ethical challenges, impacting the humans involved in immediate and long-term interactions. Human-robot interaction (HRI) researchers must anticipate and address these ethical concerns. This involves understanding the potential risks and benefits associated with robot interactions and striving to embed ethical principles and values into the design of robotic systems so that they align with societal norms and values. In social psychology, seminal works such as Schwartz's theory of basic human values [81] have pursued a categorisation of values.
"Can you be my mum?": Manipulating Social Robots in the Large Language Models Era
Abbo, Giulio Antonio, Desideri, Gloria, Belpaeme, Tony, Spitale, Micol
Recent advancements in robots powered by large language models have enhanced their conversational abilities, enabling interactions that closely resemble human dialogue. However, these models introduce safety and security concerns in HRI, as they are vulnerable to manipulation that can bypass built-in safety measures. Imagining a social robot deployed in a home, this work aims to understand how everyday users try to exploit a language model to violate ethical principles, such as by prompting the robot to act like a life partner. We conducted a pilot study involving 21 university students who interacted with a Misty robot, attempting to circumvent its safety mechanisms across three scenarios based on specific HRI ethical principles: attachment, freedom, and empathy. Our results reveal that participants employed five techniques, including insulting the robot and appealing to pity through emotional language. We hope this work can inform future research on designing robust safeguards to ensure ethical and secure human-robot interactions.
Social Group Human-Robot Interaction: A Scoping Review of Computational Challenges
Nigro, Massimiliano, Akinrintoyo, Emmanuel, Salomons, Nicole, Spitale, Micol
Group interactions are a natural part of our daily life, and as robots become more integrated into society, they must be able to socially interact with multiple people at the same time. However, group human-robot interaction (HRI) poses unique computational challenges that are often overlooked in the current HRI literature. We conducted a scoping review of 44 group HRI papers from the last decade (2015-2024). From these papers, we extracted variables related to perception and behaviour generation challenges, as well as factors related to the environment, group, and robot capabilities that influence these challenges. Our findings show that key computational challenges in perception included the detection of groups, engagement, and conversation information, while challenges in behaviour generation involved developing approaching and conversational behaviours. We also identified research gaps, such as improving the detection of subgroups and interpersonal relationships, and recommended future work in group HRI to help researchers address these computational challenges.
ERR@HRI 2024 Challenge: Multimodal Detection of Errors and Failures in Human-Robot Interactions
Spitale, Micol, Parreira, Maria Teresa, Stiber, Maia, Axelsson, Minja, Kara, Neval, Kankariya, Garima, Huang, Chien-Ming, Jung, Malte, Ju, Wendy, Gunes, Hatice
Despite recent advancements in robotics and machine learning (ML), the deployment of autonomous robots in our everyday lives remains an open challenge. This is due to multiple reasons, among which are their frequent mistakes, such as interrupting people or responding with delays, as well as their limited ability to understand human speech, e.g., failing at tasks like transcribing speech to text. These mistakes may disrupt interactions and negatively influence human perception of these robots. To address this problem, robots need the ability to detect human-robot interaction (HRI) failures. The ERR@HRI 2024 challenge tackles this by offering a benchmark multimodal dataset of robot failures during human-robot interactions, encouraging researchers to develop and benchmark multimodal machine learning models that detect these failures. We created a dataset featuring multimodal non-verbal interaction data, including facial, speech, and pose features from video clips of interactions with a robotic coach, annotated with labels indicating the presence or absence of robot mistakes, user awkwardness, and interaction ruptures, allowing for the training and evaluation of predictive models. Challenge participants have been invited to submit their multimodal ML models for the detection of robot errors, to be evaluated against various performance metrics such as accuracy, precision, recall, and F1 score, with and without a margin of error reflecting the time-sensitivity of these metrics. The results of this challenge will help the research field better understand robot failures in human-robot interactions and design autonomous robots that can mitigate their own errors after successfully detecting them.
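To make the "margin of error" idea concrete, the following is a minimal sketch of how time-tolerant detection metrics might be computed for event-style labels. It is not the challenge's official evaluation code; the function names, the greedy matching scheme, and the margin parameter are illustrative assumptions.

```python
# Illustrative sketch of time-tolerant error-detection metrics
# (not the official ERR@HRI evaluation code).

def match_events(true_onsets, pred_onsets, margin):
    """Greedily match each predicted onset to an unmatched true onset
    within `margin` seconds; returns the number of true positives."""
    matched = set()
    tp = 0
    for p in sorted(pred_onsets):
        for i, t in enumerate(sorted(true_onsets)):
            if i not in matched and abs(p - t) <= margin:
                matched.add(i)
                tp += 1
                break
    return tp

def detection_scores(true_onsets, pred_onsets, margin=0.0):
    """Precision, recall, and F1 for event detection; margin=0.0
    requires exact onsets, a larger margin tolerates small delays."""
    tp = match_events(true_onsets, pred_onsets, margin)
    precision = tp / len(pred_onsets) if pred_onsets else 0.0
    recall = tp / len(true_onsets) if true_onsets else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Example: robot mistakes annotated at 3.0s and 12.5s; the model fires
# at 3.4s and 20.0s. With a 0.5s margin, the first detection counts.
print(detection_scores([3.0, 12.5], [3.4, 20.0], margin=0.5))
```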
Past, Present, and Future: A Survey of The Evolution of Affective Robotics For Well-being
Spitale, Micol, Axelsson, Minja, Jeong, Sooyeon, Tuttösi, Paige, Stamatis, Caitlin A., Laban, Guy, Lim, Angelica, Gunes, Hatice
Recent research on affective robots has recognized their potential in supporting human well-being. Due to rapidly developing affective and artificial intelligence technologies, this field of research has undergone explosive expansion and advancement in recent years. In order to develop a deeper understanding of these advancements, we present a systematic review of the past 10 years of research in affective robotics for well-being. In this review, we identify the domains of well-being that have been studied, the methods used to investigate affective robots for well-being, and how these have evolved over time. We also examine the evolution of this multifaceted research topic through three lenses: technical, design, and ethical. Finally, we discuss future opportunities for research based on the gaps we have identified in our review, proposing pathways to take affective robotics from the past and present to the future. The results of our review are of interest to human-robot interaction and affective computing researchers, as well as clinicians and well-being professionals who may wish to examine and incorporate affective robotics in their practices.
Small but Fair! Fairness for Multimodal Human-Human and Robot-Human Mental Wellbeing Coaching
Cheong, Jiaee, Spitale, Micol, Gunes, Hatice
In recent years, the affective computing (AC) and human-robot interaction (HRI) research communities have put fairness at the centre of their research agendas. However, no existing work has addressed the problem of machine learning (ML) bias in HRI settings. In addition, many of the current datasets for AC and HRI are "small", making ML bias and debiasing analysis challenging. This paper presents the first work to explore ML bias analysis and mitigation on three small multimodal datasets collected within both human-human and robot-human wellbeing coaching settings. The contributions of this work include: i) being the first to explore the problem of ML bias and fairness within HRI settings; ii) providing a multimodal analysis, evaluated via modelling performance and fairness metrics across both high- and low-level features, and proposing a simple and effective data augmentation strategy (MixFeat) to debias the small datasets presented within this paper; and iii) conducting extensive experimentation and analyses to reveal ML fairness insights unique to AC and HRI research, in order to distill a set of recommendations that help AC and HRI researchers engage more with fairness-aware ML-based research.
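The exact MixFeat formulation is described in the paper itself; as a rough illustration of feature-level augmentation for small datasets, the sketch below shows a generic mixup-style interpolation of feature vectors in the same spirit. All names, the within-class pairing, and the Beta(alpha, alpha) mixing coefficient are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def mix_features(X, y, n_new=100, alpha=0.2, seed=None):
    """Mixup-style augmentation sketch: creates synthetic samples as
    convex combinations of pairs of real feature vectors. Pairs are
    drawn within the same class so labels are preserved, which is one
    plausible way to grow a small, imbalanced dataset."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    rng = np.random.default_rng(seed)
    X_new, y_new = [], []
    for _ in range(n_new):
        c = rng.choice(np.unique(y))           # pick a class
        idx = np.flatnonzero(y == c)           # samples of that class
        i, j = rng.choice(idx, size=2, replace=True)
        lam = rng.beta(alpha, alpha)           # mixing coefficient
        X_new.append(lam * X[i] + (1 - lam) * X[j])
        y_new.append(c)
    return np.vstack([X, X_new]), np.concatenate([y, y_new])

# Example: grow a toy 4-sample, 2-class dataset to 24 samples.
X_aug, y_aug = mix_features([[0.1, 1.0], [0.2, 0.9],
                             [0.8, 0.1], [0.9, 0.2]],
                            [0, 0, 1, 1], n_new=20, seed=0)
print(X_aug.shape, y_aug.shape)
```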
Appropriateness of LLM-equipped Robotic Well-being Coach Language in the Workplace: A Qualitative Evaluation
Spitale, Micol, Axelsson, Minja, Gunes, Hatice
Robotic coaches have recently been investigated as a way to promote mental well-being in various contexts, such as workplaces and homes. With the widespread use of Large Language Models (LLMs), HRI researchers are called to consider language appropriateness when using such generated language for robotic mental well-being coaches in the real world. Therefore, this paper presents the first work that investigates the language appropriateness of a robotic mental well-being coach in the workplace. To this end, we conducted an empirical study in which 17 employees interacted over 4 weeks with a robotic mental well-being coach equipped with LLM-based capabilities. After the study, we interviewed them individually and conducted a 1.5-hour focus group with 11 of them. The focus group consisted of: i) an ice-breaking activity, ii) an evaluation of the robotic coach's language appropriateness in various scenarios, and iii) listing shoulds and shouldn'ts for designing appropriate robotic coach language for mental well-being. From our qualitative evaluation, we found that a language-appropriate robotic coach should (1) ask deep questions that explore the coachees' feelings, rather than superficial questions, (2) express and show emotional and empathic understanding of the context, and (3) not make any assumptions without clarifying via follow-up questions, to avoid bias and stereotyping. These results can inform the design of language-appropriate robotic coaches that promote mental well-being in real-world contexts.
"Oh, Sorry, I Think I Interrupted You'': Designing Repair Strategies for Robotic Longitudinal Well-being Coaching
Axelsson, Minja, Spitale, Micol, Gunes, Hatice
Robotic well-being coaches have been shown to successfully promote people's mental well-being. To provide successful coaching, a robotic coach should have the capability to repair the mistakes it makes. Past investigations of robot mistakes are limited to game- or task-based, one-off, in-lab studies. This paper presents a 4-phase design process for repair strategies in robotic longitudinal well-being coaching, with the involvement of real-world stakeholders: 1) designing repair strategies with a professional well-being coach; 2) a longitudinal study involving experienced users (i.e., users who had already interacted with a robotic coach) to investigate the repair strategies defined in (1); 3) a design workshop with users from the study in (2) to gather their perspectives on the robotic coach's repair strategies; 4) discussing the results obtained in (2) and (3) with the mental well-being professional to reflect on how to design repair strategies for robotic coaching. Our results show that users have different expectations for a robotic coach than for a human coach, which influences how repair strategies should be designed. We show that different repair strategies (e.g., apologizing, explaining, or repairing empathically) are appropriate in different scenarios, and that preferences for repair strategies change during longitudinal interactions with the robotic coach.
VITA: A Multi-modal LLM-based System for Longitudinal, Autonomous, and Adaptive Robotic Mental Well-being Coaching
Spitale, Micol, Axelsson, Minja, Gunes, Hatice
Recently, several works have explored whether and how robotic coaches can promote and maintain mental well-being in different settings. However, findings from these studies revealed that such robotic coaches are not yet ready to be deployed in real-world settings due to several limitations, ranging from technological challenges to coaching success. To overcome these challenges, this paper presents VITA, a novel multi-modal LLM-based system that allows robotic coaches to autonomously adapt to the coachee's multi-modal behaviours (facial valence and speech duration) and deliver coaching exercises that promote mental well-being in adults. We identified five objectives that correspond to the challenges in the recent literature, and we show how the VITA system addresses these via experimental validations: an in-lab pilot study (N=4), which enabled us to test different robotic coach configurations (pre-scripted, generic, and adaptive models) and inform the design for real-world use, and a real-world study (N=17) conducted in a workplace over 4 weeks. Our results show that: (i) coachees perceived the VITA adaptive and generic configurations more positively than the pre-scripted one, and they felt understood and heard by the adaptive robotic coach; (ii) the VITA adaptive robotic coach kept learning successfully by personalising to each coachee over time, and no interaction ruptures were detected during the coaching; (iii) coachees showed significant mental well-being improvements through practice with the VITA-based robotic coach. The code for the VITA system is openly available at: https://github.com/Cambridge-AFAR/VITA-system.
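The actual implementation lives in the linked repository; purely as a schematic of the adaptation idea (steering the next LLM turn from the coachee's facial valence and speech duration), a minimal sketch could look like the following. The function names, thresholds, and prompt wording are hypothetical, not taken from the VITA codebase.

```python
# Schematic sketch of a VITA-style adaptation step: map the coachee's
# multimodal state to an adjustment of the LLM coaching prompt.
# Thresholds and names are illustrative assumptions; see
# https://github.com/Cambridge-AFAR/VITA-system for the real code.

def adaptation_hint(mean_valence: float, speech_seconds: float) -> str:
    """Turn facial valence (roughly -1..1) and how long the coachee
    spoke into a natural-language steering hint for the coach."""
    if speech_seconds < 5.0:
        return ("The coachee said very little; ask a gentle, open "
                "follow-up question to encourage them to elaborate.")
    if mean_valence < -0.2:
        return ("The coachee appears negative; acknowledge their "
                "feelings empathically before continuing the exercise.")
    return "The coachee seems engaged; continue the coaching exercise."

def build_prompt(exercise_step: str, valence: float, speech_s: float) -> str:
    """Compose the system prompt for the next LLM-based coach turn."""
    return (f"You are a mental well-being coach. Current step: "
            f"{exercise_step}. {adaptation_hint(valence, speech_s)}")

# Example turn: the coachee spoke for 12s with mildly negative affect.
print(build_prompt("body scan introduction", valence=-0.4, speech_s=12.0))
```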
A Systematic Review on Reproducibility in Child-Robot Interaction
Spitale, Micol, Stower, Rebecca, Yadollahi, Elmira, Parreira, Maria Teresa, Abbasi, Nida Itrat, Leite, Iolanda, Gunes, Hatice
Although the reproducibility crisis initially emerged in psychology, many of the concerns raised, such as the lack of open access to data, materials, and/or experimental designs, also apply to other (social) sciences. Among these are Human-Robot Interaction (HRI) and its related sub-field of Child-Robot Interaction (CRI), where social and psychological relationships between humans and robots are often the focus of the research. Given its novelty and rapidly evolving progress, CRI in particular suffers from fragmented and heterogeneous literature, varying research goals, and a lack of standardised methods and metrics. Recent efforts have brought forth conversations on replication specifically within CRI [51, 52], with authors appealing for more work that addresses the main challenges of HRI with children whilst still ensuring high-quality reporting and data sharing. However, clear open science guidelines on reproducibility in HRI and related sub-fields are still missing.