Enhancing Multi-Objective Reinforcement Learning with Concept Drift
Webber, Frederick Charles (United States Air Force Research Laboratory) | Peterson, Gilbert (Air Force Institute of Technology)
Reinforcement learning (RL) is a machine learning technique that enables an agent to learn while interacting with its environment. Agents in non-stationary environments face the additional problem of handling concept drift, a partially-observable change that modifies the environment without notification. This causes two problems: agents with a decaying exploration rate fail to adapt, while agents capable of adapting may overfit to noise and overwrite previously learned knowledge. These issues are known as the plasticity-stability dilemma and catastrophic forgetting, respectively. Agents in such environments must take steps to mitigate both problems. This work contributes an algorithm that combines a concept drift classifier with multi-objective reinforcement learning (MORL) to produce an unsupervised technique for learning in non-stationary environments, especially in the face of partially-observable changes. The algorithm manages the plasticity-stability dilemma by strategically adjusting learning rates and mitigates catastrophic forgetting by systematically storing knowledge and recalling it when it recognizes recurring situations. Results demonstrate that agents using this algorithm outperform agents using an approach that ignores non-stationarity.
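The abstract does not include code; the Python sketch below illustrates, under assumptions, the general pattern it describes: a drift detector that triggers a temporary learning-rate boost (plasticity-stability) and a library of stored value tables keyed by context (mitigating catastrophic forgetting). All names and parameters (DriftDetector, DriftAwareQLearner, window, threshold, boost_lr) are hypothetical, and a single-objective tabular Q-learner stands in for the authors' MORL agent.

import random
from collections import defaultdict

class DriftDetector:
    """Flags drift when the mean reward shifts between two adjacent windows.

    The window size and threshold are illustrative, not values from the paper.
    """
    def __init__(self, window=50, threshold=0.5):
        self.window = window
        self.threshold = threshold
        self.rewards = []

    def update(self, reward):
        self.rewards.append(reward)
        if len(self.rewards) < 2 * self.window:
            return False
        old = sum(self.rewards[-2 * self.window:-self.window]) / self.window
        new = sum(self.rewards[-self.window:]) / self.window
        return abs(new - old) > self.threshold

class DriftAwareQLearner:
    def __init__(self, actions, base_lr=0.1, boost_lr=0.5, gamma=0.95):
        self.actions = actions
        self.base_lr, self.boost_lr, self.gamma = base_lr, boost_lr, gamma
        self.lr = base_lr                 # raised on drift, decayed back
        self.q = defaultdict(float)       # active Q-table
        self.library = {}                 # stored Q-tables, keyed by context id
        self.context = 0
        self.detector = DriftDetector()

    def act(self, state, epsilon=0.1):
        if random.random() < epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def learn(self, state, action, reward, next_state):
        if self.detector.update(reward):
            self._on_drift()
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_error = reward + self.gamma * best_next - self.q[(state, action)]
        self.q[(state, action)] += self.lr * td_error
        # Decay the learning rate back toward its base value: high
        # plasticity right after a change, stability once settled.
        self.lr = max(self.base_lr, self.lr * 0.99)

    def _on_drift(self):
        # Store the current Q-table before adapting, so earlier knowledge
        # is not overwritten (mitigating catastrophic forgetting). A real
        # recognizer would match the new dynamics against stored tables
        # and recall the best fit; here contexts are simply numbered.
        self.library[self.context] = dict(self.q)
        self.context += 1
        self.q = defaultdict(float, self.library.get(self.context, {}))
        self.lr = self.boost_lr          # boost plasticity after the change
        self.detector.rewards.clear()    # avoid re-triggering on the same shift

The key design point mirrored from the abstract is that adaptation speed and memory are handled by separate mechanisms: the learning-rate schedule addresses when to learn quickly, while the table library addresses what to preserve and recall.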