Mutual-Information Regularized Multi-Agent Policy Iteration

Jiangxing Wang
School of Computer Science
Peking University

Neural Information Processing Systems 

Cooperative multi-agent reinforcement learning (MARL) has attracted wide research attention because it is a well-abstracted model of many real-world problems, such as traffic signal control (Wang et al., 2021), autonomous warehouses (Zhou et al., 2021), and even