Reinforcement Learning with Adaptive Regularization for Safe Control of Critical Systems
Neural Information Processing Systems
Reinforcement Learning (RL) is a powerful method for controlling dynamic systems, but its learning mechanism can lead to unpredictable actions that undermine the safety of critical systems. Here, we propose RL with Adaptive Regularization (RL-AR), an algorithm that enables safe RL exploration by combining the RL policy with a policy regularizer that hard-codes the safety constraints. RL-AR performs policy combination via a "focus module," which determines the appropriate combination depending on the state: relying more on the safe policy regularizer for less-exploited states while allowing unbiased convergence for well-exploited states. In a series of critical control applications, we demonstrate that RL-AR not only ensures safety during training but also achieves returns competitive with those of model-free RL that disregards safety.
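To make the state-dependent policy combination concrete, here is a minimal sketch of the idea in Python. The names (`FocusModule`, the visit-count-based `weight`, the toy policies, and the `decay` parameter) are illustrative assumptions, not the paper's implementation; in RL-AR the focus module is learned rather than count-based.

```python
import numpy as np


class FocusModule:
    """Hypothetical state-dependent weighting (assumption, not the paper's method).

    The weight beta(s) starts near 1 for unvisited states, so the safe
    regularizer dominates, and decays toward 0 as a state is visited more
    often, handing control over to the RL policy.
    """

    def __init__(self, decay=0.1):
        self.visit_counts = {}
        self.decay = decay

    def weight(self, state_key):
        # More visits -> smaller weight on the regularizer, more trust in RL.
        n = self.visit_counts.get(state_key, 0)
        return float(np.exp(-self.decay * n))

    def update(self, state_key):
        self.visit_counts[state_key] = self.visit_counts.get(state_key, 0) + 1


def combined_action(state, rl_policy, safe_policy, focus, state_key_fn):
    """Blend the safe regularizer's action with the RL action via beta(s)."""
    key = state_key_fn(state)
    beta = focus.weight(key)          # beta near 1 for rarely visited states
    a_safe = safe_policy(state)
    a_rl = rl_policy(state)
    focus.update(key)
    return beta * a_safe + (1.0 - beta) * a_rl


if __name__ == "__main__":
    # Toy 1-D example with placeholder policies (assumptions for illustration).
    rng = np.random.default_rng(0)
    rl_policy = lambda s: rng.normal(loc=s, scale=0.5)     # exploratory RL action
    safe_policy = lambda s: np.clip(-0.5 * s, -1.0, 1.0)   # conservative feedback law
    focus = FocusModule(decay=0.2)
    key_fn = lambda s: round(float(s), 1)

    s = 0.8
    for t in range(5):
        a = combined_action(s, rl_policy, safe_policy, focus, key_fn)
        print(f"step {t}: combined action = {a:.3f}")
```

In this sketch, repeated visits to the same discretized state shift the blend from the conservative policy toward the RL policy, mirroring the abstract's description of relying on the regularizer for less-exploited states while allowing unbiased convergence elsewhere.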
May-28-2025, 07:46:51 GMT
- Country:
- Europe > United Kingdom (0.28)
- Genre:
- Research Report > Experimental Study (0.93)
- Industry:
- Energy (0.95)
- Health & Medicine > Therapeutic Area
- Endocrinology > Diabetes (1.00)