Reinforcement Learning to Play an Optimal Nash Equilibrium in Team Markov Games
Wang, Xiaofeng, Sandholm, Tuomas
Neural Information Processing Systems
Multiagent learning is a key problem in AI. In the presence of multiple Nash equilibria, even agents with non-conflicting interests may not be able to learn an optimal coordination policy. The problem is exacerbated if the agents do not know the game and independently receive noisy payoffs. So, multiagent reinforcement learning involves two interrelated problems: identifying the game and learning to play.
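The coordination problem described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm. It assumes a hypothetical single-state, two-agent team game with a known, noise-free common payoff matrix (the name `PAYOFF` and its values are invented for illustration). It enumerates the pure-strategy Nash equilibria and shows that one of them is suboptimal, which is where uncoordinated learners could get stuck.

```python
from itertools import product

# Hypothetical common-payoff matrix (an assumption for illustration):
# PAYOFF[a1][a2] is the reward that BOTH agents receive.
PAYOFF = [
    [10, 0],
    [0, 5],
]


def pure_nash_equilibria(payoff):
    """Return joint actions where neither agent gains by deviating alone."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    equilibria = []
    for a1, a2 in product(range(n_rows), range(n_cols)):
        value = payoff[a1][a2]
        # Agent 1 varies its row while agent 2's action stays fixed.
        best_for_1 = all(payoff[d][a2] <= value for d in range(n_rows))
        # Agent 2 varies its column while agent 1's action stays fixed.
        best_for_2 = all(payoff[a1][d] <= value for d in range(n_cols))
        if best_for_1 and best_for_2:
            equilibria.append((a1, a2))
    return equilibria


def optimal_equilibria(payoff):
    """Return the pure Nash equilibria with the highest common payoff."""
    equilibria = pure_nash_equilibria(payoff)
    best = max(payoff[a1][a2] for a1, a2 in equilibria)
    return [(a1, a2) for a1, a2 in equilibria if payoff[a1][a2] == best]


EQUILIBRIA = pure_nash_equilibria(PAYOFF)
OPTIMAL = optimal_equilibria(PAYOFF)
print("Pure Nash equilibria:", EQUILIBRIA)
print("Optimal equilibria:", OPTIMAL)
```

In this toy game both (0, 0) and (1, 1) are equilibria, but only (0, 0) is optimal. In the setting the abstract describes, the agents would additionally have to estimate the payoffs from noisy samples, which this sketch leaves out.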
Dec-31-2003