Oertel, Catharine
What Can You Say to a Robot? Capability Communication Leads to More Natural Conversations
Reimann, Merle M., Hindriks, Koen V., Kunneman, Florian A., Oertel, Catharine, Skantze, Gabriel, Leite, Iolanda
When human users encounter a robot in the wild, it is not inherently clear to them what the robot's capabilities are. When encountering misunderstandings or problems in spoken interaction, robots often just apologize and move on, without additional effort to make sure the user understands what happened. We set out to compare two speech-based capability communication strategies (proactive and reactive) with a robot that uses no such strategy, with regard to users' ratings of the interaction and their behavior during it. For this, we conducted an in-person user study with 120 participants who had three speech-based interactions with a social robot in a restaurant setting. Our results suggest that users preferred the robot communicating its capabilities proactively and adjusted their behavior in those interactions, using a more conversational interaction style while also enjoying the interaction more.
Introducing MeMo: A Multimodal Dataset for Memory Modelling in Multiparty Conversations
Tsfasman, Maria, Dudzik, Bernd, Fenech, Kristian, Lorincz, Andras, Jonker, Catholijn M., Oertel, Catharine
Conversational memory is the process by which humans encode, retain and retrieve verbal, non-verbal and contextual information from a conversation. Since human memory is selective, differing recollections of the same events can lead to misunderstandings and misalignments within a group. Yet, conversational facilitation systems, aimed at advancing the quality of group interactions, usually focus on tracking users' states within an individual session, ignoring what remains in each participant's memory after the interaction. Understanding conversational memory can be used as a source of information on the long-term development of social connections within a group. This paper introduces the MeMo corpus, the first conversational dataset annotated with participants' memory retention reports, aimed at facilitating computational modelling of human conversational memory. The MeMo corpus includes 31 hours of small-group discussions on Covid-19, repeated 3 times over the course of 2 weeks. It integrates validated behavioural and perceptual measures, audio, video, and multimodal annotations, offering a valuable resource for studying and modelling conversational memory and group dynamics. By introducing the MeMo corpus, analysing its validity, and demonstrating its usefulness, this paper aims to pave the way for future research in conversational memory modelling for intelligent system development.
A Survey on Dialogue Management in Human-Robot Interaction
Reimann, Merle M., Kunneman, Florian A., Oertel, Catharine, Hindriks, Koen V.
Social robots are robots that are designed specifically to interact with their human users [14], for example by using spoken dialogue. For social robots, the interaction with humans plays a crucial role [7, 27], for example in the context of elderly care [15] or education [9]. Robots that use speech as a main mode of interaction need not only to understand the user's utterances but also to select appropriate responses given the context. Dialogue management (DM), according to Traum and Larsson [88], is the part of a dialogue system that performs four key functions: 1) it maintains and updates the context of the dialogue, 2) it includes the context of the utterance for interpretation of input, 3) it selects the timing and content of the next utterance, and 4) it coordinates with (non-)dialogue modules. In spoken dialogue systems, the dialogue manager receives its input from a natural language understanding (NLU) module and forwards its results to a natural language generation (NLG) module, which then generates the output (see Figure 1). In contrast to general DM, DM in human-robot interaction (HRI) also has to consider and manage the complexity added by social robots (see Figure 1). The concentric circles of the figure describe decisions that have to be made when designing a dialogue manager for human-robot interaction. From each circle, one or more options can be chosen and combined with each other.
Perceived personality state estimation in dyadic and small group interaction with deep learning methods
Fenech, Kristian, Fodor, Ádám, Bergeron, Sean P., Saboundji, Rachid R., Oertel, Catharine, Lőrincz, András
Dyadic and small group collaboration is an evolutionarily advantageous behaviour, and the need for such collaboration is a regular occurrence in day-to-day life. In this paper we estimate the perceived personality traits of individuals in dyads and small groups over thin slices of interaction on four multimodal datasets. We find that our transformer-based predictive model performs similarly to human annotators tasked with predicting the perceived Big Five personality traits of participants. Using this model we analyse the estimated perceived personality traits of individuals performing tasks in small groups and dyads. Permutation analysis shows that in the case of small groups undergoing collaborative tasks, the perceived personality of group members clusters; this is also observed for dyads in a collaborative problem-solving task, but not in dyads under non-collaborative task settings. Additionally, we find that the group-level average perceived personality traits provide a better predictor of group performance than the group-level average self-reported personality traits.