A Meta-Game Evaluation Framework for Deep Multiagent Reinforcement Learning
– arXiv.org Artificial Intelligence
Evaluating deep multiagent reinforcement learning (MARL) algorithms is complicated by stochasticity in training and sensitivity of agent performance to the behavior of other agents. We propose a meta-game evaluation framework for deep MARL, by framing each MARL algorithm as a meta-strategy, and repeatedly sampling normal-form empirical games over combinations of meta-strategies resulting from different random seeds. Each empirical game captures both self-play and cross-play factors across seeds. These empirical games provide …

In purely adversarial (i.e., two-player zero-sum) environments, distance to Nash equilibrium may be a sufficient metric [Brown et al., 2020; Schmid et al., 2023], as all equilibria are interchangeably optimal. More generally, where there are multiple equilibria or where we do not necessarily expect equilibrium behavior, the metrics for MARL performance may be less clear. In collaborative domains, global team return is the common objective [Foerster et al., 2018; Rashid et al., 2020]; however, complex learning dynamics may lead agents using the same MARL algorithm to equilibria of distinct machine conventions in different runs [Hu et al., 2020].
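The meta-game construction described in the abstract above lends itself to a short illustration. The following Python is a minimal sketch, not the authors' implementation: `train` and `evaluate` are hypothetical stand-ins for a MARL training run and a head-to-head rollout returning the first (row) policy's average return, and the sketch averages all seed pairings into a single empirical game, whereas the paper repeatedly resamples such games to build bootstrap distributions over game statistics.

```python
import itertools

def empirical_meta_game(algorithms, train, evaluate, n_seeds=5):
    """Estimate a normal-form empirical game over MARL algorithms.

    Each algorithm is treated as a meta-strategy whose realizations are
    the policies produced by independent random seeds; `train` and
    `evaluate` are caller-supplied callables (hypothetical here).
    """
    # One trained policy per (algorithm, seed) pair.
    policies = {
        alg: [train(alg, seed=s) for s in range(n_seeds)]
        for alg in algorithms
    }
    # Payoff for each ordered pair of meta-strategies, averaged over all
    # seed pairings; diagonal entries mix self-play (same seed) with
    # cross-play (different seeds) of the same algorithm.
    payoffs = {}
    for row_alg, col_alg in itertools.product(algorithms, repeat=2):
        returns = [
            evaluate(p_row, p_col)
            for p_row in policies[row_alg]
            for p_col in policies[col_alg]
        ]
        payoffs[(row_alg, col_alg)] = sum(returns) / len(returns)
    return payoffs
```

Resampling seed subsets and recomputing the matrix would then give a distribution over entries (and over any statistic of the resulting game) rather than the single point estimate computed here.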
Apr-30-2024