Mutual-Information Regularized Multi-Agent Policy Iteration

Jiangxing Wang
School of Computer Science, Peking University