Continuous Soft Actor-Critic: An Off-Policy Learning Method Robust to Time Discretization
Neural Information Processing Systems
Many Deep Reinforcement Learning (DRL) algorithms are sensitive to time discretization, which reduces their performance in real-world scenarios. We propose Continuous Soft Actor-Critic, an off-policy actor-critic DRL algorithm formulated in continuous time and space that is robust to environment time discretization. We also extend the framework to multi-agent scenarios; the resulting Multi-Agent Reinforcement Learning (MARL) algorithm is suitable for both competitive and cooperative settings.
Jun-18-2026, 03:28:04 GMT
- Genre:
- Research Report
- Experimental Study (1.00)
- New Finding (0.93)
- Industry:
- Information Technology (0.92)
- Leisure & Entertainment > Games (0.92)
- Technology: