Maintaining Plasticity in Reinforcement Learning: A Cost-Aware Framework for Aerial Robot Control in Non-stationary Environments
Karasahin, Ali Tahir, Wu, Ziniu, Kocer, Basaran Bahadir
arXiv.org Artificial Intelligence
Reinforcement learning (RL) has demonstrated the ability to maintain policy plasticity throughout short-term training in aerial robot control. However, these policies have been shown to lose plasticity when training is extended to long-term learning in non-stationary environments. For example, a standard proximal policy optimization (PPO) policy is observed to collapse in long-term training settings, leading to significant control performance degradation. To address this problem, this work proposes a cost-aware framework that uses a retrospective cost mechanism (RECOM) to balance rewards and losses during RL training in a non-stationary environment. Using a cost gradient relation between rewards and losses, our framework dynamically updates the learning rate to actively train the control policy in a disturbed wind environment. Our experimental results show that our framework learned a policy for the hovering task without policy collapse in variable wind conditions, achieving 11.29% fewer dormant units than L2 regularization with PPO.
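The abstract does not give RECOM's exact formulation, so the following is only a minimal illustrative sketch of the general idea it describes: a learning-rate schedule driven by a retrospective cost built from recent rewards and losses. The function name, the placeholder cost definition (loss minus reward), the windowed gradient estimate, and the exponential scaling are all assumptions for illustration, not the paper's method.

```python
import numpy as np

def recom_learning_rate(base_lr, rewards, losses, window=10,
                        lr_min=1e-5, lr_max=1e-3):
    """Hypothetical retrospective-cost learning-rate update (sketch only).

    Scales the learning rate by the recent trend of a cost that
    balances training losses against collected rewards.
    """
    # Retrospective cost over the last `window` updates (illustrative
    # choice): high loss relative to reward implies a higher cost.
    r = np.asarray(rewards[-window:], dtype=float)
    l = np.asarray(losses[-window:], dtype=float)
    cost = l - r  # placeholder cost; the paper's exact form is not given here

    # Finite-difference estimate of the cost gradient over the window.
    grad = float(np.gradient(cost).mean()) if len(cost) > 1 else 0.0

    # If the cost is rising (policy degrading), shrink the step size;
    # if it is falling, allow a larger step. Clip to a safe range.
    lr = base_lr * np.exp(-grad)
    return float(np.clip(lr, lr_min, lr_max))
```

In a PPO training loop, such a function would be called once per update with the running histories of episode rewards and optimizer losses, and the returned value fed to the optimizer before the next gradient step.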
Mar-10-2025