

 qtot




Neural Information Processing Systems

Table 1: Payoff matrix of the one-step multi-state non-monotonic cooperative matrix game and reconstructed results from corresponding baselines.



Weighted QMIX: Expanding Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

Neural Information Processing Systems

In this paradigm of centralised training for decentralised execution, QMIX [25] is a popular Q-learning algorithm with state-of-the-art performance on the StarCraft Multi-Agent Challenge [26]. QMIX represents the optimal joint action-value function using a monotonic mixing function of per-agent utilities.
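
The monotonicity idea can be illustrated with a small sketch. It is not the QMIX architecture itself: it assumes a single linear mixing layer whose weights are made non-negative with an absolute value, and the names `monotonic_mix`, `utilities`, `raw_weights`, and `bias` are hypothetical, with random placeholder values. The point it shows is that when the joint value is non-decreasing in every per-agent utility, each agent picking its own greedy action recovers the joint action that maximises the mixed value.

```python
import itertools
import random


def monotonic_mix(agent_qs, raw_weights, bias):
    """Mix per-agent utilities into one joint value.

    Taking abs() of the weights keeps the mixed value non-decreasing in
    every agent utility, which is the monotonicity constraint.
    """
    weights = [abs(w) for w in raw_weights]
    return sum(w * q for w, q in zip(weights, agent_qs)) + bias


random.seed(0)
n_agents, n_actions = 2, 3

# Hypothetical per-agent utilities: utilities[i][a] is agent i's value for action a.
utilities = [[random.gauss(0, 1) for _ in range(n_actions)] for _ in range(n_agents)]
raw_weights = [random.gauss(0, 1) for _ in range(n_agents)]
bias = 0.5

# Decentralised execution: each agent maximises its own utility.
greedy_joint = tuple(
    max(range(n_actions), key=lambda a: utilities[i][a]) for i in range(n_agents)
)

# Centralised check: brute-force search over all joint actions.
best_joint = max(
    itertools.product(range(n_actions), repeat=n_agents),
    key=lambda joint: monotonic_mix(
        [utilities[i][a] for i, a in enumerate(joint)], raw_weights, bias
    ),
)

print(greedy_joint == best_joint)
```

The same constraint is what limits expressiveness: a non-monotonic payoff matrix, such as the one described in the Table 1 caption, cannot be represented exactly by a mixer of this form.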