An Efficient End-to-End Training Approach for Zero-Shot Human-AI Coordination
Neural Information Processing Systems
The goal of zero-shot human-AI coordination is to develop an agent that can collaborate with humans without relying on human data. Prevailing two-stage population-based methods require a diverse population of mutually distinct policies to simulate diverse human behaviors. The need for such populations severely limits their computational efficiency. To address this issue, we propose E3T, an Efficient End-to-End Training approach for zero-shot human-AI coordination. E3T constructs the partner policy as a mixture of the ego policy and a random policy, making the partner both coordination-skilled and diverse.
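The mixture described above can be illustrated with a minimal sketch. The abstract does not give the mixing formula, so the following assumes a convex combination of the ego policy's action distribution with a uniform random policy; the weight `epsilon` and the function names `mixture_partner_policy` and `sample_partner_action` are hypothetical and chosen only for illustration.

```python
import random


def mixture_partner_policy(ego_probs, epsilon):
    """Blend the ego policy's action distribution with a uniform random policy.

    Assumption: the mixture is (1 - epsilon) * ego + epsilon * uniform.
    `epsilon` is a hypothetical mixing weight, not a value from the source.
    """
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must lie in [0, 1]")
    uniform = 1.0 / len(ego_probs)
    return [(1.0 - epsilon) * p + epsilon * uniform for p in ego_probs]


def sample_partner_action(ego_probs, epsilon, rng):
    """Sample one partner action index from the mixed distribution."""
    probs = mixture_partner_policy(ego_probs, epsilon)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    ego = [0.7, 0.2, 0.1, 0.0]  # illustrative ego-policy action probabilities
    print(mixture_partner_policy(ego, epsilon=0.5))
    print(sample_partner_action(ego, epsilon=0.5, rng=rng))
```

Under this assumption, `epsilon = 0` reduces the partner to pure self-play against the ego policy, while larger values give the partner more random, and therefore more varied, behavior without training a separate population.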
Oct-9-2024, 11:55:46 GMT