Quantifying Zero-shot Coordination Capability with Behavior Preferring Partners
Wang, Xihuai; Zhang, Shao; Zhang, Wenhao; Dong, Wentao; Chen, Jingxiao; Wen, Ying; Zhang, Weinan
arXiv.org Artificial Intelligence
Zero-shot coordination (ZSC) is a new challenge that focuses on generalizing learned coordination skills to unseen partners. Existing methods train the ego agent with partners drawn from pre-trained or evolving populations. The agent's ZSC capability is typically evaluated with a small set of evaluation partners, including humans and agents, and reported as mean returns. Current evaluation methods still fall short in two respects: constructing diverse evaluation partners and measuring ZSC capability comprehensively. In this paper, we aim to create a reliable, comprehensive, and efficient evaluation method for ZSC capability. We formally define the ideal 'diversity-complete' evaluation partners and propose best response (BR) diversity, the population diversity of the BRs to the partners, to approximate the ideal evaluation partners. We propose an evaluation workflow that includes the construction of 'diversity-complete' evaluation partners and a multidimensional metric, the Best Response Proximity (BR-Prox) metric. We re-evaluate strong ZSC methods in the Overcooked environment using the proposed evaluation workflow. Surprisingly, the results in some of the most used layouts fail to distinguish the performance of different ZSC methods. Moreover, the evaluated ZSC methods are unable to produce sufficiently diverse and high-performing training partners. Our proposed evaluation workflow calls for a change in how we efficiently evaluate ZSC methods, as a supplement to human evaluation. Zero-shot coordination (ZSC) is a new challenge in cooperative AI: training an agent, called the ego agent, to coordinate with unseen partners (Hu et al., 2020).
Oct-8-2023