Watch-And-Help: A Challenge for Social Perception and Human-AI Collaboration

Puig, Xavier, Shu, Tianmin, Li, Shuang, Wang, Zilin, Tenenbaum, Joshua B., Fidler, Sanja, Torralba, Antonio

arXiv.org Artificial Intelligence 

In this paper, we introduce Watch-And-Help (WAH), a challenge for testing social intelligence in agents. In WAH, an AI agent needs to help a humanlike agent perform a complex household task efficiently. To succeed, the AI agent needs to i) understand the underlying goal of the task by watching a single demonstration of the humanlike agent performing the same task (social perception), and ii) coordinate with the humanlike agent to solve the task in an unseen environment as fast as possible (human-AI collaboration). For this challenge, we build VirtualHome-Social, a multi-agent household environment, and provide a benchmark including both planning and learning based baselines. We evaluate the performance of AI agents with the humanlike agent as well as with real humans using objective metrics and subjective user ratings. Experimental results demonstrate that the proposed challenge and virtual environment enable a systematic evaluation on the important aspects of machine social intelligence at scale.

Without much prior experience, children can robustly recognize goals of other people by simply watching them act in an environment, and are able to come up with plans to help them, even in novel scenarios. In contrast, the most advanced AI systems to date still struggle with such basic social skills.

In order to achieve the level of social intelligence required to effectively help humans, an AI agent should acquire two key abilities: i) social perception, i.e., the ability to understand human behavior, and ii) collaborative planning, i.e., the ability to reason about the physical environment and plan its actions to coordinate with humans. In this paper, we are interested in developing AI agents with these two abilities.

Towards this goal, we introduce a new AI challenge, Watch-And-Help (WAH), which focuses on social perception and human-AI collaboration. In this challenge, an AI agent needs to collaborate with a humanlike agent to enable it to achieve the goal faster. In particular, we present a 2-stage framework as shown in Figure 1. In the first, Watch stage, an AI agent (Bob) watches a humanlike agent (Alice) performing a task once and infers Alice's goal from her actions.
