R2BC: Multi-Agent Imitation Learning from Single-Agent Demonstrations
Mattson, Connor, Raveendra, Varun, Novoseller, Ellen, Waytowich, Nicholas, Lawhern, Vernon J., Brown, Daniel S.
arXiv.org Artificial Intelligence
Round-Robin Behavior Cloning (R2BC): Traditional Behavior Cloning (left) requires coordinated and centralized demonstrations, where an expert demonstrates actions near-optimally for all agents. In multi-agent domains, a lone human operator may not be able to provide high-quality demonstrations due to underactuated control and increased cognitive burden. Our method (right), R2BC, removes this restriction by letting the human control one agent at a time while the other agents act via their learned policies. This round-robin process collects realistic demonstrations and iteratively trains cooperative multi-agent behavior.

Abstract: Imitation Learning (IL) is a natural way for humans to teach robots, particularly when high-quality demonstrations are easy to obtain. While IL has been widely applied to single-robot settings, relatively few studies have addressed the extension of these methods to multi-agent systems, especially in settings where a single human must provide demonstrations to a team of collaborating robots. In this paper, we introduce and study Round-Robin Behavior Cloning (R2BC), a method that enables a single human operator to effectively train multi-robot systems through sequential, single-agent demonstrations. Our approach allows the human to teleoperate one agent at a time and incrementally teach multi-agent behavior to the entire system, without requiring demonstrations in the joint multi-agent action space.
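The round-robin loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the 1-D toy task, the `expert_action` function standing in for the human teleoperator, the nearest-neighbor `ClonedPolicy`, and all parameter names are assumptions made for illustration. In this toy the agents do not interact, so it shows only the data-collection schedule (one demonstrated agent per round, the rest acting from their current cloned policies), not cooperative behavior.

```python
import random


def expert_action(obs):
    """Hypothetical stand-in for the human operator: step toward the target."""
    return (obs > 0) - (obs < 0)  # sign of the offset: -1, 0, or +1


class ClonedPolicy:
    """Toy behavior-cloning policy: 1-nearest-neighbor over (obs, action) pairs."""

    def __init__(self):
        self.data = []

    def fit(self, data):
        self.data = list(data)

    def act(self, obs):
        if not self.data:
            return 0  # untrained agents stay still (an assumption of this sketch)
        nearest = min(self.data, key=lambda pair: abs(pair[0] - obs))
        return nearest[1]


def r2bc_sketch(num_agents=3, rounds=6, horizon=10, seed=0):
    rng = random.Random(seed)
    targets = [rng.randint(-5, 5) for _ in range(num_agents)]
    policies = [ClonedPolicy() for _ in range(num_agents)]
    datasets = [[] for _ in range(num_agents)]

    for r in range(rounds):
        demo_agent = r % num_agents  # round-robin choice of the teleoperated agent
        positions = [rng.randint(-5, 5) for _ in range(num_agents)]
        for _ in range(horizon):
            for i in range(num_agents):
                obs = targets[i] - positions[i]
                if i == demo_agent:
                    # The "human" controls this agent; record the demonstration.
                    action = expert_action(obs)
                    datasets[i].append((obs, action))
                else:
                    # Every other agent acts from its current learned policy.
                    action = policies[i].act(obs)
                positions[i] += action
        # Retrain only the agent that just received demonstrations.
        policies[demo_agent].fit(datasets[demo_agent])

    return policies, datasets
```

Calling `r2bc_sketch()` returns one policy and one demonstration dataset per agent. Each dataset contains only single-agent demonstrations, so nothing in the sketch requires an action in the joint multi-agent action space.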
Oct-22-2025
- Country:
- North America > United States (0.28)
- Genre:
- Research Report > New Finding (0.95)
- Industry:
- Energy (0.46)
- Technology: