Evolving Constrained Reinforcement Learning Policy

Hu, Chengpeng, Pei, Jiyuan, Liu, Jialin, Yao, Xin

Apr-18-2023–arXiv.org Artificial Intelligence

Evolutionary algorithms have been used to evolve a population of actors that generate diverse experiences for training reinforcement learning agents, which helps to address the temporal credit assignment problem and improves exploration efficiency. However, when this approach is adapted to constrained problems, balancing the trade-off between reward and constraint violation is hard. In this paper, we propose a novel evolutionary constrained reinforcement learning (ECRL) algorithm, which adaptively balances reward and constraint violation with stochastic ranking and, at the same time, restricts the policy's behaviour by maintaining a set of Lagrange relaxation coefficients with a constraint buffer. Extensive experiments on robotic control benchmarks show that ECRL achieves outstanding performance compared to state-of-the-art algorithms. Ablation analysis shows the benefits of introducing stochastic ranking and the constraint buffer.
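The abstract does not spell out how ECRL applies stochastic ranking, so the following is only a minimal sketch of the generic stochastic ranking procedure from the constrained-optimization literature, adapted here to maximize reward. The function name `stochastic_rank`, its parameters, and the default `p_f = 0.45` are assumptions made for illustration and are not taken from the paper.

```python
import random
from typing import List, Optional, Sequence


def stochastic_rank(
    rewards: Sequence[float],
    violations: Sequence[float],
    p_f: float = 0.45,  # assumed default; probability of comparing by reward
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Return individual indices ordered from best to worst (hypothetical helper).

    Adjacent pairs are compared by reward when both are feasible, or with
    probability p_f otherwise; in the remaining cases they are compared by
    constraint violation.
    """
    if len(rewards) != len(violations):
        raise ValueError("rewards and violations must have the same length")
    rng = rng or random.Random()
    order = list(range(len(rewards)))
    n = len(order)
    for _ in range(n):  # at most n bubble-sort-like sweeps
        swapped = False
        for j in range(n - 1):
            a, b = order[j], order[j + 1]
            both_feasible = violations[a] <= 0 and violations[b] <= 0
            if both_feasible or rng.random() < p_f:
                # Compare by reward: higher reward ranks first.
                if rewards[a] < rewards[b]:
                    order[j], order[j + 1] = b, a
                    swapped = True
            else:
                # Compare by violation: lower violation ranks first.
                if violations[a] > violations[b]:
                    order[j], order[j + 1] = b, a
                    swapped = True
        if not swapped:
            break  # ranking is stable, stop early
    return order
```

With `p_f = 0` infeasible individuals are ordered purely by violation, and with `p_f = 1` the ranking ignores constraints entirely; intermediate values trade the two off at random, which is the balancing behaviour the abstract refers to.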

constraint, machine learning, reinforcement learning

Country:
- Asia > China > Guangdong Province > Shenzhen (0.04)

Genre:
- Research Report (0.50)

Industry:
- Leisure & Entertainment > Games (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Machine Learning > Reinforcement Learning (1.00)
