Non-monotonic Value Function Factorization for Deep Multi-Agent Reinforcement Learning
–arXiv.org Artificial Intelligence
In this paper, we propose actor-critic approaches that add an actor policy to QMIX [9]. Doing so can remove the monotonicity constraint of QMIX and yields a non-monotonic value function factorization of the joint action-value. We evaluate our actor-critic methods on StarCraft II micromanagement tasks and show that they achieve stronger performance on maps with heterogeneous agent types.
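For readers unfamiliar with the constraint mentioned above, the following is a minimal sketch of a QMIX-style monotonic mixer, not the method proposed in this paper. It assumes a two-layer mixer with ELU activation whose weights are forced non-negative with an absolute value, so the joint value can never decrease when any single agent's Q-value increases. The function name `monotonic_mix`, the layer sizes, and the fixed random weights are illustrative assumptions; in QMIX the weights are produced by hypernetworks conditioned on the global state.

```python
import math
import random


def elu(x):
    # ELU activation: identity for positive inputs, exp(x) - 1 otherwise.
    return x if x > 0 else math.exp(x) - 1.0


def monotonic_mix(agent_qs, w1, b1, w2, b2):
    """Mix per-agent Q-values into one joint value (hypothetical helper).

    Taking abs() of every mixing weight keeps the weights non-negative,
    which is what makes the joint value monotonic in each agent's Q-value.
    Biases are left unconstrained.
    """
    hidden = [
        elu(sum(abs(w) * q for w, q in zip(row, agent_qs)) + b)
        for row, b in zip(w1, b1)
    ]
    return sum(abs(w) * h for w, h in zip(w2, hidden)) + b2


# Assumption: fixed random weights stand in for hypernetwork outputs.
rng = random.Random(0)
n_agents, n_hidden = 3, 4
w1 = [[rng.uniform(-1, 1) for _ in range(n_agents)] for _ in range(n_hidden)]
b1 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
w2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
b2 = rng.uniform(-1, 1)

base = [0.2, -0.5, 1.0]
raised = [0.2, 0.3, 1.0]  # only the second agent's Q-value is increased

q_base = monotonic_mix(base, w1, b1, w2, b2)
q_raised = monotonic_mix(raised, w1, b1, w2, b2)
print(q_raised >= q_base)
```

Because of this constraint, such a mixer cannot represent joint values where one agent's best action depends on another agent choosing a lower-valued action, which is the limitation a non-monotonic factorization aims to lift.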
Apr-18-2021