Reviews: Constrained Cross-Entropy Method for Safe Reinforcement Learning

May-26-2025, 05:38:51 GMT–Neural Information Processing Systems

This paper studies constrained optimal control, where the goal is to produce a policy that maximizes an objective function subject to a constraint. The authors provide great motivation for this setting, explaining why the constraint cannot simply be included as a large negative reward. They detail challenges in solving this problem, especially if the initial policy does not satisfy the constraint. They also note a clever extension of their method, where they use the constraint to define the objective, by setting the constraint to indicate whether the task is solved. Their algorithm builds upon CEM: at each iteration, if there are no feasible policies, they maximize the constraint function for the policies with the largest objective; otherwise, they maximize the objective function for feasible policies.

constrained cross-entropy method, constraint, safe reinforcement learning, (7 more...)

Neural Information Processing Systems

May-26-2025, 05:38:51 GMT

Conferences Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.40)