A Meta-Game Evaluation Framework for Deep Multiagent Reinforcement Learning

Zun Li, Michael P. Wellman

arXiv.org Artificial Intelligence 

Evaluating deep multiagent reinforcement learning (MARL) algorithms is complicated by stochasticity in training and sensitivity of agent performance to the behavior of other agents. We propose a meta-game evaluation framework for deep MARL, by framing each MARL algorithm as a meta-strategy, and repeatedly sampling normal-form empirical games over combinations of meta-strategies resulting from different random seeds. Each empirical game captures both self-play and cross-play factors across seeds. These empirical games provide

In purely adversarial (i.e., two-player zero-sum) environments, distance to Nash equilibrium may be a sufficient metric [Brown et al., 2020; Schmid et al., 2023], as all equilibria are interchangeably optimal. More generally, where there are multiple equilibria or where we do not necessarily expect equilibrium behavior, the metrics for MARL performance may be less clear. In collaborative domains, global team return is the common objective [Foerster et al., 2018; Rashid et al., 2020], however complex learning dynamics may lead agents using the same MARL algorithm to equilibria of distinct machine conventions in different runs [Hu et al., 2020].
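The sampling procedure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `simulate` callable is a hypothetical stand-in for running one episode between two trained agents, and each concrete strategy is an (algorithm, seed) pair so the resulting payoff table covers both self-play and cross-play profiles.

```python
import itertools

def build_empirical_game(meta_strategies, seeds_per_algo, simulate, episodes=10):
    """Estimate a two-player normal-form empirical game over meta-strategies.

    Each concrete strategy is an (algorithm, seed) pair; `simulate(s1, s2)`
    is a hypothetical stand-in that plays one episode and returns the
    (payoff to s1, payoff to s2) tuple.
    """
    # One trained agent per (MARL algorithm, random seed) combination.
    strategies = [(algo, seed)
                  for algo in meta_strategies
                  for seed in range(seeds_per_algo)]
    payoffs = {}
    # Fill the empirical payoff table, covering self-play (same algorithm
    # and seed) and cross-play (different seeds or algorithms) profiles.
    for s1, s2 in itertools.product(strategies, repeat=2):
        returns = [simulate(s1, s2) for _ in range(episodes)]
        payoffs[(s1, s2)] = sum(r[0] for r in returns) / episodes
    return strategies, payoffs
```

With two algorithms and two seeds each, this yields a 4x4 payoff table; averaging over repeated episodes reduces the variance introduced by environment stochasticity.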
