Locally Constrained Policy Optimization for Online Reinforcement Learning in Non-Stationary Input-Driven Environments

Hamadanian, Pouya, Nasr-Esfahany, Arash, Sen, Siddartha, Schwarzkopf, Malte, Alizadeh, Mohammad

Feb-4-2023–arXiv.org Artificial Intelligence

We study online reinforcement learning (RL) in non-stationary input-driven environments, where a time-varying exogenous input process affects the environment dynamics. Online RL is challenging in such environments because of catastrophic forgetting (CF): the agent tends to forget prior knowledge as it trains on new experiences. Prior approaches to mitigating this issue either assume task labels, which are often unavailable in practice, or use off-policy methods, which can suffer from instability and poor performance. We present Locally Constrained Policy Optimization (LCPO), an on-policy RL approach that combats CF by anchoring policy outputs on old experiences while optimizing the return on current experiences. To perform this anchoring, LCPO locally constrains policy optimization using samples from experiences that lie outside of the current input distribution. We evaluate LCPO in two gym and computer systems environments with a variety of synthetic and real input traces, and find that it outperforms state-of-the-art on-policy and off-policy RL methods in the online setting, while achieving results on par with an offline agent pre-trained on the whole input trace.
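
The anchoring idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the constraint is enforced, so the linear-softmax policy, the KL threshold `kl_limit`, the backtracking line search, and every function name below are assumptions chosen for illustration. The sketch also assumes the caller has already selected `anchor_states` from old experiences outside the current input distribution.

```python
import numpy as np

def softmax(logits):
    # Numerically stable row-wise softmax.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def policy(theta, states):
    # Hypothetical linear-softmax policy; theta has shape (state_dim, n_actions).
    return softmax(states @ theta)

def mean_kl(p_old, p_new, eps=1e-12):
    # Mean KL(p_old || p_new) over a batch of action distributions.
    return float(np.mean(np.sum(p_old * (np.log(p_old + eps) - np.log(p_new + eps)), axis=1)))

def policy_gradient(theta, states, actions, advantages):
    # Gradient of mean(advantage * log pi(a|s)) for the linear-softmax policy.
    probs = policy(theta, states)
    one_hot = np.eye(theta.shape[1])[actions]
    # d log pi(a|s) / d logits = one_hot(a) - pi(.|s)
    grad_logits = (one_hot - probs) * advantages[:, None]
    return states.T @ grad_logits / len(states)

def anchored_update(theta, current_batch, anchor_states,
                    step=1.0, kl_limit=0.01, max_backtracks=20):
    # Improve the return on current experiences, but accept a step only if
    # the policy outputs on the anchor (old) states stay within kl_limit.
    states, actions, advantages = current_batch
    grad = policy_gradient(theta, states, actions, advantages)
    anchor_probs = policy(theta, anchor_states)
    for _ in range(max_backtracks):
        candidate = theta + step * grad
        if mean_kl(anchor_probs, policy(candidate, anchor_states)) <= kl_limit:
            return candidate
        step *= 0.5  # shrink the step until the anchor constraint holds
    return theta  # no acceptable step found; keep the old parameters
```

A call such as `anchored_update(theta, (states, actions, advantages), anchor_states)` returns parameters whose outputs on the anchor states differ from the previous policy by at most `kl_limit` in mean KL divergence. The line search is a deliberately simple stand-in for whatever constrained optimizer an actual implementation would use.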

artificial intelligence, machine learning, reinforcement learning

Country:
- North America > United States
  - Rhode Island > Providence County
    - Providence (0.04)
  - New York
    - New York County > New York City (0.14)
    - Richmond County > New York City (0.04)
    - Queens County > New York City (0.04)
    - Kings County > New York City (0.04)
    - Bronx County > New York City (0.04)
  - Massachusetts > Middlesex County
    - Cambridge (0.28)
- Asia > Middle East
  - Jordan (0.04)

Genre:
- Research Report (0.83)
- Instructional Material > Online (0.61)

Industry:
- Education > Educational Setting (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Machine Learning
    - Reinforcement Learning (1.00)
    - Neural Networks > Deep Learning (0.46)
