Multi-AgentReinforcementLearningis ASequenceModelingProblem

Neural Information Processing Systems 

Recently, such difficulty in multi-agent learning has been eased owing to the introduction ofcentralized training for decentralized execution(CTDE) [11, 45], which allows agents to access the global information andopponents' actions during thetraining phase.