Learning to Cooperate with Humans using Generative Agents

Neural Information Processing Systems 

Training agents that can coordinate zero-shot with humans is a key goal in multi-agent reinforcement learning (MARL). Current algorithms focus on training simulated human partner policies, which are then used to train a Cooperator agent. The simulated human is produced either through behavior cloning over a dataset of human cooperation behavior, or by using MARL to create a population of simulated agents. However, these approaches often struggle to produce a Cooperator that coordinates well with real humans, since the simulated partners fail to cover the diverse strategies and styles people employ in the real world. We show that learning a generative model of human partners can effectively address this issue.
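
To make the generative-model idea concrete, here is a minimal, hypothetical PyTorch sketch (all class and function names are illustrative, not the paper's): a VAE encodes a recorded partner trajectory into a latent "strategy" vector z, and a latent-conditioned policy decodes z into partner behavior. Sampling z from the prior then yields diverse simulated partners against which a Cooperator can be trained with standard RL.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TrajectoryEncoder(nn.Module):
    """Encodes a flattened (obs, action) trajectory into q(z | tau)."""
    def __init__(self, traj_dim: int, latent_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(traj_dim, 128), nn.ReLU())
        self.mu = nn.Linear(128, latent_dim)
        self.logvar = nn.Linear(128, latent_dim)

    def forward(self, traj):
        h = self.net(traj)
        return self.mu(h), self.logvar(h)

class LatentConditionedPolicy(nn.Module):
    """Partner policy pi(a | s, z): the latent z selects a behavioral style."""
    def __init__(self, obs_dim: int, latent_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + latent_dim, 128), nn.ReLU(),
            nn.Linear(128, n_actions),
        )

    def forward(self, obs, z):
        return self.net(torch.cat([obs, z], dim=-1))  # action logits

def vae_loss(logits, actions, mu, logvar, beta=1.0):
    """Behavior-cloning reconstruction term plus KL toward the N(0, I) prior."""
    recon = F.cross_entropy(logits, actions)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + beta * kl

# Toy usage with random data; shapes are illustrative only.
T, obs_dim, n_actions, latent_dim = 10, 8, 4, 2
obs = torch.randn(T, obs_dim)
actions = torch.randint(0, n_actions, (T,))
traj = torch.cat([obs, F.one_hot(actions, n_actions).float()], dim=-1).flatten()

enc = TrajectoryEncoder(traj.numel(), latent_dim)
pol = LatentConditionedPolicy(obs_dim, latent_dim, n_actions)

mu, logvar = enc(traj)
z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization trick
logits = pol(obs, z.expand(T, -1))
vae_loss(logits, actions, mu, logvar).backward()

# At Cooperator-training time, sample z ~ N(0, I) instead of encoding a
# trajectory, roll out the decoded partner, and train the Cooperator
# against it with standard RL.
z_new = torch.randn(latent_dim)  # a novel simulated partner style
```

The point of this design is that partner diversity comes from the latent prior rather than from a fixed population: each sampled z induces a different partner style, so the Cooperator trains against a continuum of strategies instead of a handful of cloned or self-play policies.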
