Scalable Constrained Policy Optimization for Safe Multi-agent Reinforcement Learning
Neural Information Processing Systems
A key challenge in bringing multi-agent reinforcement learning (MARL) techniques into real-world applications, such as autonomous driving and drone swarms, is how to control multiple agents safely and cooperatively to accomplish tasks. Most existing safe MARL methods learn a centralized value function, introducing a global state to guide safe cooperation.
Mar-22-2026, 22:18:54 GMT