Enabling Multi-Robot Collaboration from Single-Human Guidance
Zhengran Ji, Lingyu Zhang, Paul Sajda, Boyuan Chen
arXiv.org Artificial Intelligence
Abstract -- Learning collaborative behaviors is essential for multi-agent systems. Traditionally, multi-agent reinforcement learning solves this implicitly through a joint reward and centralized observations, assuming collaborative behavior will emerge. Other studies propose learning from demonstrations of a group of collaborative experts. Instead, we propose an efficient and explicit way of learning collaborative behaviors in multi-agent systems by leveraging expertise from only a single human. Our insight is that humans can naturally take on various roles in a team. We show that agents can effectively learn to collaborate by allowing a human operator to dynamically switch between controlling individual agents for short periods and by incorporating a human-like theory-of-mind model of teammates. Our experiments show that our method improves the success rate of a challenging collaborative hide-and-seek task by up to 58% with only 40 minutes of single-human guidance.

The best policy achieves an average seeker success rate of 84.2% in simulation and 80% in real-world experiments in a challenging 3-seekers-vs-3-hiders setting with random map layouts; the baseline policy achieves only 36.4% in simulation and 55% in the real world. Interesting collaborative behaviors among seekers are observed during deployment, such as strategically navigating to anticipate and intercept hiders or effectively blocking key paths as a team.
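The core interaction described in the abstract is a single operator who dynamically hands control back and forth among teammates while a learned policy drives the rest. A minimal sketch of that switching loop might look like the following; all class and function names here are hypothetical illustrations, not the authors' code, and the real system would substitute a trained policy and theory-of-mind model for the placeholder lambda.

```python
class Team:
    """Toy sketch (hypothetical): one human operator controls a single
    agent at a time; every other agent acts from a learned policy. The
    operator may switch which agent they control at any step, so each
    agent accumulates some human demonstrations over a session."""

    def __init__(self, n_agents):
        self.n_agents = n_agents
        self.controlled = 0  # index of the human-controlled agent

    def switch(self, agent_id):
        # Operator hands control over to a different teammate.
        if not 0 <= agent_id < self.n_agents:
            raise ValueError("no such agent")
        self.controlled = agent_id

    def step(self, human_action, policy):
        # The human action drives the controlled agent; the policy
        # (a stand-in for the learned model) drives the rest.
        return [human_action if i == self.controlled else policy(i)
                for i in range(self.n_agents)]


team = Team(3)                                 # e.g. the 3-seeker setting
a1 = team.step("intercept", lambda i: "patrol")
# a1 == ["intercept", "patrol", "patrol"]
team.switch(2)
a2 = team.step("block_path", lambda i: "patrol")
# a2 == ["patrol", "patrol", "block_path"]
```

In this toy form, the only mechanism shown is the role reassignment itself; the paper's contribution additionally couples it with a theory-of-mind model so teammates can anticipate the behavior of agents they do not control.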
Sep-29-2024