Grounding New Words on the Physical World in Multi-Domain Human-Robot Dialogues

Nakano, Mikio (Honda Research Institute Japan Co., Ltd.) | Iwahashi, Naoto (ATR Media Information Science Research Laboratories / National Institute of Information and Communications Technology) | Nagai, Takayuki (University of Electro-Communications) | Sumii, Taisuke (ATR Media Information Science Research Laboratories / Kyoto Institute of Technology) | Zuo, Xiang (ATR Media Information Science Research Laboratories / Kyoto Institute of Technology) | Taguchi, Ryo (ATR Media Information Science Research Laboratories / Nagoya Institute of Technology) | Nose, Takashi (ATR Media Information Science Research Laboratories / Tokyo Institute of Technology) | Mizutani, Akira (University of Electro-Communications) | Nakamura, Tomoaki (University of Electro-Communications) | Attamim, Muhanmad (University of Electro-Communications) | Narimatsu, Hiromi (University of Electro-Communications) | Funakoshi, Kotaro (Honda Research Institute Japan Co., Ltd.) | Hasegawa, Yuji (Honda Research Institute Japan Co., Ltd.)

AAAI Conferences 

This paper summarizes our ongoing project to develop an architecture for a robot that can acquire new words and their meanings while engaging in multi-domain dialogues. Both capabilities are crucial for making conversational service robots work on real tasks in the real world. Household and office robots need to work in multiple task domains, and they also need to engage in dialogues in the domains corresponding to those tasks. Lexical acquisition is necessary because speech understanding is not possible without sufficient knowledge of the words that may be spoken in the task domain. Our architecture is based on a multi-expert model in which multiple domain experts are employed, and one of them is selected, based on the user utterance and the situation, to take control of the dialogue and the physical behaviors. We incorporate experts that can acquire new lexical entries and their meanings, grounded on the physical world, through spoken interactions. Selecting those experts appropriately makes lexical acquisition in multi-domain dialogues possible. An example robotic system based on this architecture that can acquire object names and location names demonstrates the viability of the architecture.
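
As a rough illustration of the multi-expert idea described in the abstract, the sketch below shows one way expert selection and a word-learning expert might fit together. It is not the authors' implementation: the class names, the scoring scheme, the utterance patterns ("this place is ...", "go to ..."), and the use of 2-D coordinates as the grounding of a location name are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Situation:
    """Hypothetical snapshot of the robot's state (an assumption of this sketch)."""
    location: tuple = (0.0, 0.0)                 # current robot position
    lexicon: dict = field(default_factory=dict)  # learned word -> grounded position


class Expert:
    """Hypothetical interface for a domain expert."""
    name = "base"

    def score(self, utterance: str, situation: Situation) -> float:
        """Return how well this expert fits the utterance and situation."""
        raise NotImplementedError

    def act(self, utterance: str, situation: Situation) -> str:
        """Handle the utterance and return a short description of the action."""
        raise NotImplementedError


class LocationNameLearner(Expert):
    """Acquires a new location name and grounds it on the current position."""
    name = "learn_location"
    PREFIX = "this place is "

    def score(self, utterance, situation):
        return 1.0 if utterance.lower().startswith(self.PREFIX) else 0.0

    def act(self, utterance, situation):
        word = utterance[len(self.PREFIX):].strip().lower()
        situation.lexicon[word] = situation.location
        return f"learned '{word}'"


class Navigator(Expert):
    """Uses previously acquired location names to move the robot."""
    name = "navigate"
    PREFIX = "go to "

    def _target_word(self, utterance):
        return utterance[len(self.PREFIX):].strip().lower()

    def score(self, utterance, situation):
        if not utterance.lower().startswith(self.PREFIX):
            return 0.0
        # Prefer this expert more strongly when the word is already known.
        return 0.9 if self._target_word(utterance) in situation.lexicon else 0.1

    def act(self, utterance, situation):
        word = self._target_word(utterance)
        target = situation.lexicon.get(word)
        if target is None:
            return f"unknown place '{word}'"
        return f"moving to {target}"


def select_expert(experts, utterance, situation):
    """Pick the expert with the highest score for this utterance and situation."""
    return max(experts, key=lambda e: e.score(utterance, situation))


if __name__ == "__main__":
    situation = Situation(location=(1.0, 2.0))
    experts = [LocationNameLearner(), Navigator()]
    for utterance in ["This place is kitchen", "Go to kitchen"]:
        expert = select_expert(experts, utterance, situation)
        print(expert.name, "->", expert.act(utterance, situation))
```

In this toy version the learning expert and the task expert share one lexicon, so a name taught in one dialogue domain is immediately usable in another; a real system would rely on speech recognition and perception rather than string prefixes.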
