Using Watson for Enhancing Human-Computer Co-Creativity
Goel, Ashok (Georgia Institute of Technology) | Creeden, Brian (Georgia Institute of Technology) | Kumble, Mithun (Georgia Institute of Technology) | Salunke, Shanu (Georgia Institute of Technology) | Shetty, Abhinaya (Georgia Institute of Technology) | Wiltgen, Bryan (Georgia Institute of Technology)
We describe an experiment in using IBM’s Watson cognitive system to teach about human-computer co-creativity in a Georgia Tech Spring 2015 class on computational creativity. The project-based class used Watson to support biologically inspired design, a design paradigm that uses biological systems as analogues for inventing technological systems. The twenty-four students in the class self-organized into six teams of four students each, and developed semester-long projects that built on Watson to support biologically inspired design. In this paper, we describe this experiment in using Watson to teach about human-computer co-creativity, present one project in detail, and summarize the remaining five projects. We also draw lessons on building on Watson for (i) supporting biologically inspired design, and (ii) enhancing human-computer co-creativity.
Represent and Infer Human Theory of Mind for Human-Robot Interaction
Zhao, Yibiao (University of California, Los Angeles) | Holtzen, Steven (University of California, Los Angeles) | Gao, Tao (University of California, Los Angeles) | Zhu, Song-Chun (University of California, Los Angeles)
This abstract proposes a challenging problem: inferring a human's mental state (intent and belief) from an observed RGBD video for human-robot interaction. The task is to integrate symbolic reasoning, a field well studied within A.I. domains, with the uncertainty native to computer vision strategies. Traditional A.I. strategies for plan inference typically rely on first-order logic and closed-world assumptions, which struggle to take into account the inherent uncertainty of noisy observations within a scene. Computer vision relies on pattern-recognition strategies that have difficulty accounting for higher-level reasoning and abstract representation of world knowledge. By combining these two approaches in a principled way under a probabilistic programming framework, we define new computer vision tasks such as actor intent prediction and belief inference from an observed video sequence. By inferring a human's theory of mind, a robotic agent can automatically determine the human's goals in order to collaborate with them.
Missteps in Robot Social Navigation
Sutcliffe, Andrew (McGill University) | Tenenholtz, Neil (Vecna Technologies) | Pineau, Joelle (McGill University)
Assessing the quality of robot social navigation is a challenging problem fraught with human obstacles. From preconceived notions to perspective or point of view, evaluations can differ from person to person. Most work in the field of robot navigation is focused on creating algorithms that produce efficient robot trajectories. We posit that the evaluation of trajectories in a social context is essential and distinct from trajectory generation. In this work we recorded a manually driven powered wheelchair through different scenarios and asked expert evaluators to assess the quality of the powered wheelchair's movement. These evaluations were then compared to post-experiment assessments from trajectory generation algorithms and social navigation concepts. Our results show that it is possible to build a simple model to predict expert evaluators' responses. Unfortunately, there is no clear consensus amongst these experts on what quality behaviour is. This suggests that while current navigation algorithms offer strong heuristics for the generation of smooth trajectories in well-defined environments, their efficacy in evaluating social navigation is less obvious. We believe that more emphasis must be put on dynamic and reactive navigation algorithms, as any heuristic approach will be limited by variance in people's behaviours and expectations.
Towards Gaze and Gesture Based Human-Robot Interaction for Dementia Patients
Prange, Alexander (German Research Center for Artificial Intelligence (DFKI)) | Toyama, Takumi (German Research Center for Artificial Intelligence (DFKI)) | Sonntag, Daniel (German Research Center for Artificial Intelligence (DFKI))
Modeling Situated Conversations for a Child-Care Robot Using Wearable Devices
On, Kyoung-Woon (Seoul National University) | Kim, Eun-Sol (Seoul National University) | Zhang, Byoung-Tak (Seoul National University)
How can robots communicate fluently with humans and hold context-preserving conversations? This is a crucial problem in robotics research, especially for service robots such as child-care robots. Here, we aim to develop a situated conversation system for child-care robots. The conversation system considers the current context between robots and children as well as the situation the child is in. The system consists of two parts. The first part tries to understand the context, using the robot's embedded sensors to capture the context and the child's wearable sensors to obtain information about the situation the child is in. The second part generates the situated conversation. In terms of the model, we designed a hierarchical Bayesian Network for the first part and use a Hypernetwork model for the second part. We illustrate the application of the system to communication with a child in a child-care service robot scenario. For this application, we collect wearable sensor data from the child and mother-child conversation data in daily life. Finally, we discuss our results and future work.
Agent Requirements for Effective and Efficient Task-Oriented Dialog
Mohan, Shiwali (PARC) | Kirk, James Roberts (The University of Michigan) | Mininger, Aaron (The University of Michigan) | Laird, John (The University of Michigan)
Dialog is a useful way for a robotic agent performing a task to communicate with a human collaborator, as it is a rich source of information for both the agent and the human. Such task-oriented dialog provides a medium for commanding, informing, teaching, and correcting a robot. Robotic agents engaging in dialog must be able to interpret a wide variety of sentences and supplement the dialog with information from their context, history, learned knowledge, and non-linguistic interactions. We have identified a set of nine system-level requirements for such agents that help them support more effective, efficient, and general task-oriented dialog. This set is inspired by our research in Interactive Task Learning with a robotic agent named Rosie. This paper defines each requirement and gives examples from our work that illustrate it.
Pororobot: A Deep Learning Robot That Plays Video Q&A Games
Kim, Kyung-Min (Seoul National University) | Nan, Chang-Jun (Seoul National University) | Ha, Jung-Woo (NAVER Corporation) | Heo, Yu-Jung (School of Computer Science and Engineering, Seoul National University) | Zhang, Byoung-Tak (Seoul National University)
Recent progress in machine learning has led to great advancements in robot intelligence and human-robot interaction (HRI). It is reported that robots can deeply understand visual scene information and describe the scenes in natural language using object recognition and natural language processing methods. Image-based question and answering (Q&A) systems can be used for enhancing HRI. However, despite these successful results, several key issues still remain to be discussed and improved. In particular, it is essential for an agent to act in a dynamic, uncertain, and asynchronous environment to achieve human-level robot intelligence. In this paper, we propose a prototype system for a video Q&A robot, “Pororobot”. The system uses state-of-the-art machine learning methods such as a deep concept hierarchy model. In our scenario, a robot and a child play a video Q&A game together in real-world environments. Here we demonstrate preliminary results of the proposed system and discuss some directions for future work.
Temporal and Object Relations in Unsupervised Plan and Activity Recognition
Freedman, Richard G. (University of Massachusetts Amherst) | Jung, Hee-Tae (University of Massachusetts Amherst) | Zilberstein, Shlomo (University of Massachusetts Amherst)
We consider ways to improve the performance of unsupervised plan and activity recognition techniques by considering temporal and object relations in addition to postural data. Temporal relationships can help recognize activities with cyclic structure and are often implicit because plans impose degrees of ordering on actions. Relations with objects can help disambiguate observed activities that otherwise share a user's posture and position. We develop and investigate graphical models that extend the popular latent Dirichlet allocation approach with temporal and object relations, examine the relative performance and runtime trade-offs using a standard dataset, and consider the cost/benefit trade-offs these extensions offer in the context of human-robot and human-computer interaction.
Integration of Planning with Plan Recognition Using Classical Planners (Extended Abstract)
Freedman, Richard G. (University of Massachusetts Amherst) | Fukunaga, Alex (The University of Tokyo)
In order for robots to interact with humans in the world around them, it is important that they are not just aware of the presence of people, but also able to understand what those people are doing. In particular, interaction involves multiple agents, which requires some form of coordination, and this cannot be achieved by acting blindly. The field of plan recognition (PR) studies methods for identifying an observed agent’s task or goal given her action sequence. This is often regarded as the inverse of planning which, given a set of goal conditions, aims to derive a sequence of actions that will achieve the goals when performed from a given initial state. Ramírez and Geffner (2009; 2010) proposed a simple transformation of PR problems into classical planning problems for which off-the-shelf software is available for quick and efficient implementations. However, there is a reliance on the observed agent’s optimality which makes this PR technique most useful as a post-processing step when some of the final actions are observed. In human-robot interaction (HRI), it is usually too late to interact once the humans are finished performing their tasks. In this paper, we describe ongoing work on two extensions to make classical planning-based PR more applicable to the field of HRI. First, we introduce a modification to their algorithm that reduces the optimality bias’s effect so that long-term goals may be recognized at earlier observations. This is then followed by methods for extracting information from these predictions so that the observing agent may run a second pass of the planner to determine its own actions to perform for a fully interactive system.
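As a rough illustration of the planning-based PR idea mentioned in the last abstract, the sketch below ranks candidate goals by comparing the cost of the best plan that complies with the observations against the cost of the best plan that does not. It follows the cost-difference likelihood commonly attributed to the Ramírez and Geffner (2010) formulation as we understand it; it is not the modified algorithm the abstract describes. The function name `goal_posterior`, the parameter `beta`, and the example costs are hypothetical, and the plan costs are assumed to come from two runs of an off-the-shelf classical planner per goal.

```python
import math


def goal_posterior(cost_with_obs, cost_without_obs, priors=None, beta=1.0):
    """Rank candidate goals from plan costs (illustrative sketch only).

    cost_with_obs[g]    -- cost of the best plan for goal g that complies
                           with the observed actions (assumed planner output)
    cost_without_obs[g] -- cost of the best plan for goal g that does NOT
                           comply with the observations
    beta                -- assumed rationality parameter; larger values
                           trust the observed agent's optimality more
    """
    goals = list(cost_with_obs)
    if priors is None:
        # Assumption: uniform prior over the candidate goals.
        priors = {g: 1.0 / len(goals) for g in goals}

    scores = {}
    for g in goals:
        # A negative delta means following the observations is cheaper,
        # so the observations are evidence in favour of goal g.
        delta = cost_with_obs[g] - cost_without_obs[g]
        likelihood = 1.0 / (1.0 + math.exp(beta * delta))
        scores[g] = likelihood * priors[g]

    total = sum(scores.values())
    return {g: s / total for g, s in scores.items()}


# Hypothetical plan costs for two candidate goals.
posterior = goal_posterior(
    cost_with_obs={"kitchen": 10, "office": 14},
    cost_without_obs={"kitchen": 12, "office": 11},
)
```

With these made-up costs, "kitchen" receives the higher probability because complying with the observations makes its best plan cheaper. The dependence on the cost difference is also where the optimality assumption noted in the abstract enters: early observations that look suboptimal for a long-term goal push its probability down.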