Conversational Grounding: Annotation and Analysis of Grounding Acts and Grounding Units
Mohapatra, Biswesh, Hassan, Seemab, Romary, Laurent, Cassell, Justine
Successful conversations often rest on common understanding, where all parties are on the same page about the information being shared. This process, known as conversational grounding, is crucial for building trustworthy dialog systems that can accurately keep track of and recall the shared information. An agent's proficiency in grounding the conveyed information contributes significantly to building a reliable dialog system. Despite recent advancements in dialog systems, there exists a noticeable deficit in their grounding capabilities. Traum (1995) provided a framework for conversational grounding, introducing Grounding Acts and Grounding Units, but substantial progress, especially in the realm of Large Language Models, remains lacking. To bridge this gap, we present the annotation of two dialog corpora employing Grounding Acts, Grounding Units, and a measure of their degree of grounding. We discuss our key findings during annotation and also provide a baseline model to test the performance of current Language Models in categorizing the grounding acts of the dialogs. Our work aims to provide a useful resource for further research in making conversations with machines better understood and more reliable in natural day-to-day collaborative dialogs.
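As a purely illustrative sketch (not the paper's actual pipeline), a grounding-act classification baseline along these lines might look as follows; the label set, model name, and example utterances are placeholder assumptions, not the paper's annotation scheme or data.

    # Illustrative sketch only: a minimal grounding-act classifier baseline.
    # The label set is a placeholder inspired by Traum (1995); the paper's
    # actual labels, data format, and model choice may differ. The model
    # head here is untrained, so predictions are random until fine-tuned.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    GROUNDING_ACTS = ["initiate", "continue", "acknowledge", "repair",
                      "request_repair", "request_ack", "cancel"]  # assumed

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=len(GROUNDING_ACTS))

    def classify_grounding_act(context: str, utterance: str) -> str:
        """Predict a grounding act for an utterance given its dialog context."""
        inputs = tokenizer(context, utterance, truncation=True,
                           return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits
        return GROUNDING_ACTS[int(logits.argmax(dim=-1))]

    print(classify_grounding_act("A: The meeting is at 3pm.", "B: Got it, 3pm."))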
When to generate hedges in peer-tutoring interactions
Abulimiti, Alafate, Clavel, Chloé, Cassell, Justine
This paper explores the application of machine learning techniques to predict where hedging occurs in peer-tutoring interactions. The study uses a naturalistic face-to-face dataset annotated for natural language turns, conversational strategies, tutoring strategies, and nonverbal behaviours. These elements are processed into a vector representation of the previous turns, which serves as input to several machine learning models. Results show that embedding layers, which capture the semantic information of the previous turns, significantly improve the model's performance. Additionally, the study provides insights into the importance of various features, such as interpersonal rapport and nonverbal behaviours, in predicting hedges by using Shapley values for feature explanation. We discover that the eye gaze of both the tutor and the tutee has a significant impact on hedge prediction. We further validate this observation through a follow-up ablation study.
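For readers unfamiliar with Shapley-value feature explanation, the sketch below shows the general pattern using the shap library on synthetic data; the feature names (rapport, gaze, etc.) are assumptions echoing the abstract, not the paper's actual feature set or model.

    # Illustrative sketch only: Shapley-value attribution for a hedge
    # predictor, in the spirit of the analysis described above. Features
    # and labels are synthetic stand-ins, not the paper's data.
    import numpy as np
    import shap
    from sklearn.ensemble import GradientBoostingClassifier

    rng = np.random.default_rng(0)
    feature_names = ["rapport", "tutor_gaze", "tutee_gaze", "smile", "prev_hedge"]
    X = rng.random((200, len(feature_names)))      # stand-in turn features
    y = (X[:, 1] + X[:, 2] > 1.0).astype(int)      # synthetic hedge labels

    model = GradientBoostingClassifier().fit(X, y)
    explainer = shap.TreeExplainer(model)
    shap_values = explainer.shap_values(X)

    # Mean absolute SHAP value per feature gives a global importance ranking.
    importance = np.abs(shap_values).mean(axis=0)
    for name, score in sorted(zip(feature_names, importance),
                              key=lambda p: -p[1]):
        print(f"{name}: {score:.3f}")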
How About Kind of Generating Hedges using End-to-End Neural Models?
Abulimiti, Alafate, Clavel, Chloé, Cassell, Justine
Hedging is a strategy for softening the impact of a statement in conversation. In reducing the strength of an expression, it may help to avoid embarrassment (more technically, "face threat") to one's listener. For this reason, it is often found in contexts of instruction, such as tutoring. In this work, we develop a model of hedge generation based on i) fine-tuning state-of-the-art language models trained on human-human tutoring data, followed by ii) reranking to select the candidate that best matches the expected hedging strategy within a candidate pool using a hedge classifier. We apply this method to a natural peer-tutoring corpus containing a significant number of disfluencies, repetitions, and repairs. The results show that generation in this noisy environment is feasible with reranking. By conducting an error analysis for both approaches, we reveal the challenges faced by systems attempting to accomplish both social and task-oriented goals in conversation.
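The generate-then-rerank pattern the abstract describes can be sketched as follows; this is an assumption-laden illustration, with placeholder model names standing in for the paper's fine-tuned generator and hedge classifier.

    # Illustrative sketch only: generate N candidates, then keep the one a
    # classifier scores highest. Model names are placeholders, not the
    # paper's fine-tuned models; the classifier head here is untrained.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

    gen_tok = AutoTokenizer.from_pretrained("gpt2")
    generator = AutoModelForCausalLM.from_pretrained("gpt2")
    hedge_clf = pipeline("text-classification", model="distilbert-base-uncased")

    def generate_with_rerank(context: str, n_candidates: int = 8) -> str:
        """Sample several continuations; return the one the classifier
        scores highest for the expected hedging strategy."""
        inputs = gen_tok(context, return_tensors="pt")
        outputs = generator.generate(**inputs, do_sample=True, top_p=0.9,
                                     num_return_sequences=n_candidates,
                                     max_new_tokens=30,
                                     pad_token_id=gen_tok.eos_token_id)
        prompt_len = inputs["input_ids"].shape[1]
        candidates = [gen_tok.decode(o[prompt_len:], skip_special_tokens=True)
                      for o in outputs]
        scores = [hedge_clf(c)[0]["score"] for c in candidates]
        return candidates[int(torch.tensor(scores).argmax())]

    print(generate_with_rerank("Tutor: Well, you might want to"))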
"You might think about slightly revising the title": identifying hedges in peer-tutoring interactions
Raphalen, Yann, Clavel, Chloé, Cassell, Justine
Hedges play an important role in the management of conversational interaction. In peer tutoring, they are notably used by tutors in dyads (pairs of interlocutors) experiencing low rapport to tone down the impact of instructions and negative feedback. Pursuing the objective of building a tutoring agent that manages rapport with students in order to improve learning, we used a multimodal peer-tutoring dataset to construct a computational framework for identifying hedges. We compared approaches relying on pre-trained resources with others that integrate insights from the social science literature. Our best performance involved a hybrid approach that outperforms the existing baseline while being easier to interpret. We employ a model explainability tool to explore the features that characterize hedges in peer-tutoring conversations, and we identify some novel features and the benefits of such a hybrid modeling approach.
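One way such a hybrid could be assembled, sketched under stated assumptions: concatenate pre-trained sentence embeddings with hand-crafted lexical cues drawn from the hedging literature. The cue list and model below are illustrative guesses, not the paper's feature set or architecture.

    # Illustrative sketch only: a hybrid hedge detector combining pretrained
    # embeddings with rule-based lexical cues. The cue list is an assumption
    # based on common hedging markers, not the paper's features.
    import numpy as np
    from sentence_transformers import SentenceTransformer
    from sklearn.linear_model import LogisticRegression

    HEDGE_CUES = ["kind of", "sort of", "maybe", "might", "i think", "probably"]

    encoder = SentenceTransformer("all-MiniLM-L6-v2")

    def featurize(turns):
        emb = encoder.encode(turns)                      # pretrained part
        cues = np.array([[cue in t.lower() for cue in HEDGE_CUES]
                         for t in turns], dtype=float)   # hand-crafted part
        return np.hstack([emb, cues])

    turns = ["You might want to check that step again.", "Add 5 to both sides."]
    labels = [1, 0]  # toy labels for the sketch
    clf = LogisticRegression(max_iter=1000).fit(featurize(turns), labels)
    print(clf.predict(featurize(["Maybe try factoring first?"])))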
The Role of Embodiment and Perspective in Direction-Giving Systems
Hasegawa, Dai, Cassell, Justine, Araki, Kenji
In this paper, we describe an evaluation of the impact of embodiment, the effect of different kinds of embodiment, and the benefits of different aspects of embodiment, on direction-giving systems. We compared a robot, an embodied conversational agent (ECA), and a GPS giving directions, when these systems used speaker-perspective gestures, listener-perspective gestures, and no gestures. Results demonstrated that, while there was no difference in direction-giving performance between the robot and the ECA, and little difference in participants' perceptions, there was a considerable effect of the type of gesture employed, and several interesting interactions between type of embodiment and aspects of embodiment.
Embodied Conversational Agents: Representation and Intelligence in User Interfaces
Cassell, Justine
How do we decide how to represent an intelligent system in its interface, and how do we decide how the interface represents information about the world and about its own workings to a user? This article addresses these questions by examining the interaction between representation and intelligence in user interfaces. The rubric representation covers at least three topics in this context: (1) how a computational system is represented in its user interface, (2) how the interface conveys its representations of information and the world to human users, and (3) how the system's internal representation affects the human user's interaction with the system. I argue that each of these kinds of representation (of the system, of information and the world, and of the interaction) is key to how users make the kind of attributions of intelligence that facilitate their interactions with intelligent systems. In this vein, it makes sense to represent a system as a human in those cases where social collaborative behavior is key, and for the system to represent its knowledge to humans in multiple ways and in multiple modalities. I demonstrate these claims by discussing issues of representation and intelligence in an embodied conversational agent -- an interface in which the system is represented as a person, information is conveyed to human users by multiple modalities such as voice and hand gestures, and the internal representation is modality independent and both propositional and nonpropositional.