Collaborative Discourse, Engagement and Always-On Relational Agents
Rich, Charles (Worcester Polytechnic Institute) | Sidner, Candace L. (Worcester Polytechnic Institute)
We summarize our past, present and future research related to human-robot dialogue, starting with its foundations in collaborative discourse theory, continuing to our current research on recognizing and generating engagement, and concluding with an outline of new work we are beginning on the modeling of long-term relationships between humans and robots.
Grounding New Words on the Physical World in Multi-Domain Human-Robot Dialogues
Nakano, Mikio (Honda Research Institute Japan Co., Ltd.) | Iwahashi, Naoto (ATR Media Information Science Research Laboratories / National Institute of Information and Communications Technology) | Nagai, Takayuki (University of Electro-Communications) | Sumii, Taisuke (ATR Media Information Science Research Laboratories / Kyoto Institute of Technology) | Zuo, Xiang (ATR Media Information Science Research Laboratories / Kyoto Institute of Technology) | Taguchi, Ryo (ATR Media Information Science Research Laboratories / Nagoya Institute of Technology) | Nose, Takashi (ATR Media Information Science Research Laboratories / Tokyo Institute of Technology) | Mizutani, Akira (University of Electro-Communications) | Nakamura, Tomoaki (University of Electro-Communications) | Attamimi, Muhammad (University of Electro-Communications) | Narimatsu, Hiromi (University of Electro-Communications) | Funakoshi, Kotaro (Honda Research Institute Japan Co., Ltd.) | Hasegawa, Yuji (Honda Research Institute Japan Co., Ltd.)
This paper summarizes our ongoing project on developing an architecture for a robot that can acquire new words and their meanings while engaging in multi-domain dialogues. These two functions are crucial in making conversational service robots work on real tasks in the real world. Household robots and office robots need to be able to work in multiple task domains, and they also need to engage in dialogues in the domains corresponding to those task domains. Lexical acquisition is necessary because speech understanding cannot be done without sufficient knowledge of the words that may be spoken in the task domain. Our architecture is based on a multi-expert model in which multiple domain experts are employed, and one of them is selected, based on the user utterance and the situation, to take control of the dialogue and physical behaviors. We incorporate experts that are able to acquire new lexical entries and their meanings grounded on the physical world through spoken interactions. By appropriately selecting those experts, lexical acquisition in multi-domain dialogues becomes possible. An example robotic system based on this architecture that can acquire object names and location names demonstrates the viability of the architecture.
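The expert-selection step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Expert` class, the keyword-overlap score, the `stay_bonus` value, and the two example experts are all assumptions made for illustration. Each hypothetical expert scores the utterance and the situation, and the highest-scoring expert takes control.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set

Situation = Dict[str, str]


@dataclass
class Expert:
    """Hypothetical domain expert: a name plus a scoring function."""
    name: str
    score: Callable[[str, Situation], float]


def make_scorer(name: str, keywords: Set[str],
                stay_bonus: float = 0.1) -> Callable[[str, Situation], float]:
    """Build a scorer from keyword overlap plus a small bonus for the active expert.

    Both the overlap measure and the bonus are illustrative assumptions.
    """
    def score(utterance: str, situation: Situation) -> float:
        words = utterance.lower().split()
        overlap = sum(w in keywords for w in words) / len(words) if words else 0.0
        bonus = stay_bonus if situation.get("active_expert") == name else 0.0
        return overlap + bonus
    return score


def select_expert(experts: List[Expert], utterance: str,
                  situation: Situation) -> Expert:
    """Return the expert with the highest score for this utterance and situation."""
    return max(experts, key=lambda e: e.score(utterance, situation))


# Two hypothetical experts: one fetches objects, one learns new object names.
EXPERTS = [
    Expert("fetch_object", make_scorer("fetch_object", {"bring", "fetch", "get"})),
    Expert("learn_name", make_scorer("learn_name", {"this", "is", "called", "name"})),
]

chosen = select_expert(EXPERTS, "this is called a mug",
                       {"active_expert": "fetch_object"})
print(chosen.name)
```

In this sketch the lexical-acquisition expert wins on a naming utterance even though another expert is currently active, because the keyword overlap outweighs the small bonus for staying with the active expert.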
Framework of Communication Activation Robot Participating in Multiparty Conversation
Matsuyama, Yoichi (Waseda University) | Taniyama, Hikaru (Waseda University) | Fujie, Shinya (Waseda University) | Kobayashi, Tetsunori (Waseda University)
We propose a framework for a robot to participate in and activate multiparty conversation. In multiparty conversation, the robot should select its behavior based on both linguistic information and participation structure. In this paper, we focus on the multiparty conversation game "Nandoku," which is often played in elderly care facilities. The robot acts as one of the participants and tries to make the communication more active. The framework handles the dialogue situation from three aspects (multiparty conversation, game progress, and communication activation) and selects the robot's most effective behavior according to these three aspects.
A Model for Verbal and Non-Verbal Human-Robot Collaboration
Matignon, Laetitia (University of Caen Basse Normandie) | Karami, Abir Beatrice (University of Caen Basse Normandie) | Mouaddib, Abdel-Illah (University of Caen Basse Normandie)
We are motivated by building a system for an autonomous robot companion that collaborates with a human partner to achieve a common mission. The objective of the robot is to infer the human's preferences over the tasks of the mission so as to collaborate with the human by carrying out the tasks the human does not favor. Inspired by recent research on the recognition of human intention, we propose a unified model that allows the robot to switch accurately between verbal and non-verbal interactions. Our system unifies an epistemic partially observable Markov decision process (POMDP), which is a human-robot spoken dialog system aimed at disambiguating the human's preferences, and an intuitive human-robot collaboration that consists of inferring the human's intention from the observed human actions. The beliefs over the human's preferences computed during the dialog are then reinforced in the course of task execution by the intuitive interaction. Our unified model helps the robot infer the human's preferences and decide which tasks to perform to effectively satisfy these preferences. The robot is also able to adjust its plan rapidly in case of sudden changes in the human's preferences and to switch between both kinds of interaction. Experimental results on a scenario inspired by RoboCup@Home outline various specific behaviors of the robot during the collaborative mission.
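How a belief over the human's preferences might be reinforced by an observed action can be sketched as a discrete Bayesian update. This is only the observation step for a hidden state assumed to be static. It is not the authors' POMDP, which would also involve state transitions and planning. The hypothesis names, the observation name, and the likelihood values are invented for illustration.

```python
from typing import Dict

Belief = Dict[str, float]


def update_belief(belief: Belief,
                  likelihood: Dict[str, Dict[str, float]],
                  observation: str) -> Belief:
    """Bayes rule: posterior(h) is proportional to prior(h) * P(observation | h)."""
    unnormalized = {
        h: p * likelihood[h].get(observation, 0.0) for h, p in belief.items()
    }
    total = sum(unnormalized.values())
    if total == 0.0:
        # The observation is impossible under every hypothesis: keep the prior.
        return dict(belief)
    return {h: p / total for h, p in unnormalized.items()}


# Hypothetical example: which task does the human prefer to do personally?
prior = {"prefers_cleaning": 0.5, "prefers_cooking": 0.5}
likelihood = {
    "prefers_cleaning": {"picks_up_sponge": 0.8, "opens_fridge": 0.2},
    "prefers_cooking": {"picks_up_sponge": 0.2, "opens_fridge": 0.8},
}

posterior = update_belief(prior, likelihood, "picks_up_sponge")
print(posterior)
```

Under these assumed numbers, seeing the human pick up a sponge shifts the belief toward "prefers_cleaning", so a robot following the abstract's objective would lean toward taking on the other task.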
Preparing to Talk: Interaction between a Linguistically Enabled Agent and a Human Teacher
Lyon, Caroline (University of Hertfordshire) | Nehaniv, Chrystopher L. (University of Hertfordshire) | Saunders, Joe (University of Hertfordshire)
As a precursor to learning to use language, an infant has to acquire preliminary linguistic skills, including the ability to recognize and produce word forms without meaning. This develops out of babbling, through vocal interaction with carers. We report on evidence from developmental psychology and from neuroscientific research that supports a dual-process approach to language learning. We describe a simulation of the transition from babbling to the recognition of first word forms in a simulated robot interacting with a human teacher. This precedes interactions with the real iCub robot.
Ambiguities in Spatial Language Understanding in Situated Human Robot Dialogue
Liu, Changsong (Michigan State University) | Walker, Jacob (Michigan State University) | Chai, Joyce Y. (Michigan State University)
In human-robot dialogue, identifying intended referents from human partners’ spatial language is challenging. This is partly because the underlying reference system (i.e., the frame of reference) is potentially ambiguous and must be inferred automatically. To improve spatial language understanding, we conducted an empirical study to investigate the prevalence of ambiguities in frame of reference. Our findings indicate that ambiguities do arise frequently during human-robot dialogues. Although situational factors from the spatial arrangement are less indicative of the underlying reference system, linguistic cues and individual preferences may allow reliable disambiguation.
Robots that Learn to Communicate: A Developmental Approach to Personally and Physically Situated Human-Robot Conversations
Iwahashi, Naoto (National Institute of Information and Communications Technology) | Sugiura, Komei (National Institute of Information and Communications Technology) | Taguchi, Ryo (Nagoya Institute of Technology) | Nagai, Takayuki (University of Electro-Communications) | Taniguchi, Tadahiro (Ritsumeikan University)
This paper summarizes the online machine learning method LCore, which enables robots to learn to communicate with users from scratch through verbal and behavioral interaction in the physical world. LCore combines speech, visual, and tactile information obtained through the interaction, and enables robots to learn beliefs regarding speech units, words, the concepts of objects, motions, grammar, and pragmatic and communicative capabilities. The overall belief system is represented in an integrated way by a dynamic graphical model. Experimental results show that, through a small, practical number of learning episodes with a user, the robot was eventually able to understand even fragmentary and ambiguous utterances, respond to them with confirmation questions and/or actions, generate directive utterances, and answer questions, all in a manner appropriate to the given situation. This paper discusses the importance of a developmental approach to realizing personally and physically situated human-robot conversations.
The Role of Embodiment and Perspective in Direction-Giving Systems
Hasegawa, Dai (Hokkaido University) | Cassell, Justine (Carnegie Mellon University) | Araki, Kenji (Hokkaido University)
In this paper, we describe an evaluation of the impact of embodiment, the effect of different kinds of embodiment, and the benefits of different aspects of embodiment, on direction-giving systems. We compared a robot, an embodied conversational agent (ECA), and GPS giving directions, when these systems used speaker-perspective gestures, listener-perspective gestures, and no gestures. Results demonstrated that, while there was no difference in direction-giving performance between the robot and the ECA, and little difference in participants’ perceptions, there was a considerable effect of the type of gesture employed, and several interesting interactions between type of embodiment and aspects of embodiment.
Natural Programming of a Social Robot by Dialogs
Gorostiza, Javi F. (Universidad Carlos III de Madrid) | Salichs, Miguel A. (Universidad Carlos III de Madrid)
This paper aims at bringing social robots closer to naive users. A Natural Programming System that allows the end user to give instructions to a social robot has been developed. The instructions result in a sequence of actions and conditions that can be executed while the user continues to edit the sequence verbally. A Dialogue Manager System (DMS) has been developed in a social robot. The dialog is described in a VoiceXML structure, where a set of information slots is defined. These slots correspond to the attributes needed to construct the sequence at execution time. The robot can make specific requests on encountering unfilled slots. Temporal aspects of dialog such as barge-in, mixed initiative, and speech intonation control are also considered. Dialog flow is based on Dialog Acts. The dialog specification has also been extended for multimodality management. The presented DMS has been used as part of a Natural Programming System but can also be used for other multimodal human-robot interactive skills.
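The slot-filling behavior described in this abstract, where the robot makes a specific request on encountering an unfilled slot, can be sketched in a few lines. This is a simplified illustration and not the authors' DMS or a VoiceXML interpreter. The slot names `action` and `condition` and the prompt texts are assumptions.

```python
from typing import Dict, Optional

# Hypothetical prompts, one per information slot.
PROMPTS: Dict[str, str] = {
    "action": "What should I do?",
    "condition": "When should I do it?",
}


def next_request(slots: Dict[str, Optional[str]]) -> Optional[str]:
    """Return the prompt for the first unfilled slot, or None if all are filled."""
    for name, value in slots.items():
        if value is None:
            return PROMPTS.get(name, f"Please tell me the {name}.")
    return None


# The user has given an action but no condition yet.
slots = {"action": "raise the left arm", "condition": None}
print(next_request(slots))

# Once every slot is filled, there is nothing left to ask.
slots["condition"] = "when I touch your head"
print(next_request(slots))
```

Slots are checked in insertion order, which mirrors a form that is filled field by field. A mixed-initiative system would also accept values for several slots in one utterance.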
Acquiring Vocabulary through Human Robot Interaction: A Learning Architecture for Grounding Words with Multiple Meanings
Chauhan, Aneesh (Universidade de Aveiro) | Lopes, Luís Seabra (Universidade de Aveiro)
This paper presents a robust methodology for grounding vocabulary in robots. A social language grounding experiment is designed in which a human instructor teaches a robotic agent the names of the objects present in a visually shared environment. Any system for grounding vocabulary has to incorporate the properties of gradual evolution and lifelong learning. The learning model of the robot is adopted from ongoing work on developing systems that conform to these properties. Significant modifications have been introduced to the adopted model, especially to handle words with multiple meanings. A novel classification strategy has been developed for improving the performance of each classifier for each learned category. A set of six new nearest-neighbor-based classifiers has also been integrated into the agent architecture. A series of experiments was conducted to test the performance of the new model on vocabulary acquisition. The robot was shown to be robust at acquiring vocabulary and has the potential to learn a far greater number of words (with either single or multiple meanings).
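To illustrate why nearest-neighbor classification suits words with multiple meanings, here is a minimal 1-nearest-neighbor sketch. It is a generic textbook classifier, not one of the six classifiers in the paper. The two-dimensional feature vectors and the words are invented for illustration. Because each taught instance is stored separately, one word can label visually distinct groups of objects.

```python
import math
from typing import List, Sequence, Tuple

Instance = Tuple[Sequence[float], str]  # (feature vector, word taught by the instructor)


def nearest_word(instances: List[Instance], query: Sequence[float]) -> str:
    """Return the word of the stored instance closest to the query (Euclidean distance)."""
    _, word = min(instances, key=lambda inst: math.dist(inst[0], query))
    return word


# Hypothetical memory: "mouse" names two visually different kinds of object.
memory: List[Instance] = [
    ((0.2, 0.8), "mouse"),  # e.g. a toy animal
    ((0.8, 0.2), "mouse"),  # e.g. a computer mouse
    ((0.5, 0.1), "cup"),
]

print(nearest_word(memory, (0.75, 0.25)))
print(nearest_word(memory, (0.25, 0.75)))
print(nearest_word(memory, (0.5, 0.15)))
```

New instances can be appended to `memory` at any time, which fits the gradual, lifelong learning setting the abstract describes.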