An Efficient End-to-End Training Approach for Zero-Shot Human-AI Coordination

Neural Information Processing Systems 

The goal of zero-shot human-AI coordination is to develop an agent that can collaborate with humans without relying on human data. Prevailing two-stage population-based methods require a diverse population of mutually distinct policies to simulate diverse human behaviors. Having to train and maintain such a population severely limits the computational efficiency of these methods.