Scalable Constrained Policy Optimization for Safe Multi-agent Reinforcement Learning

Neural Information Processing Systems 

A key challenge in bringing multi-agent reinforcement learning (MARL) techniques to real-world applications, such as autonomous driving and drone swarms, is controlling multiple agents safely and cooperatively to accomplish tasks.