novelty-seeking agent
Reviews: Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents
Two heuristic mechanisms from neuroevolution research have been imported into the recently proposed evolution strategy for deep reinforcement learning. One is Novelty Search (NS), which biases the search toward more exploration: it tries to explore previously unvisited areas in the space of behaviors, not in the space of policy parameters. The other is maintaining multiple populations in a single run. The authors propose three variations of the evolution strategy combining these mechanisms.
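The core NS idea summarized above — scoring perturbations by how novel the resulting *behavior* is, relative to an archive of past behaviors — can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation; the behavior characterization `eval_bc`, the hyperparameters, and the helper names are all my own assumptions.

```python
import numpy as np

def novelty(bc, archive, k=10):
    """Mean Euclidean distance from a behavior characterization `bc`
    to its k nearest neighbors in an archive of past behaviors."""
    dists = np.linalg.norm(np.asarray(archive) - bc, axis=1)
    return np.sort(dists)[: min(k, len(dists))].mean()

def ns_es_step(theta, archive, eval_bc, sigma=0.05, lr=0.01, n=100, k=10):
    """One NS-ES-style update (sketch): perturb theta, score each
    perturbation by the novelty of the behavior it produces, and move
    theta in the direction of novelty-weighted perturbations."""
    eps = np.random.randn(n, theta.size)
    scores = np.array(
        [novelty(eval_bc(theta + sigma * e), archive, k) for e in eps]
    )
    # Normalize scores so the update is invariant to their scale.
    ranks = (scores - scores.mean()) / (scores.std() + 1e-8)
    return theta + lr / (n * sigma) * (ranks @ eps)
```

Note that novelty is computed in behavior space (whatever `eval_bc` returns after a rollout), which is what distinguishes NS from simply spreading out in parameter space.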
Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents
Conti, Edoardo, Madhavan, Vashisht, Such, Felipe Petroski, Lehman, Joel, Stanley, Kenneth, Clune, Jeff
Evolution strategies (ES) are a family of black-box optimization algorithms able to train deep neural networks roughly as well as Q-learning and policy gradient methods on challenging deep reinforcement learning (RL) problems, but are much faster (e.g. hours vs. days) because of their higher parallelization capability. However, many RL problems require directed exploration because they have reward functions that are sparse or deceptive (i.e. contain local optima), and it is unknown how to encourage such exploration with ES. Here we show that algorithms that have been invented to promote directed exploration in small-scale evolved neural networks via populations of exploring agents, specifically novelty search (NS) and quality diversity (QD) algorithms, can be hybridized with ES to improve its performance on sparse or deceptive deep RL tasks, while retaining scalability. Our experiments confirm that the resultant new algorithms, NS-ES and two QD algorithms, NSR-ES and NSRA-ES, avoid local optima encountered by ES to achieve higher performance on Atari and simulated robots learning to walk around a deceptive trap. This paper thus introduces a family of fast, scalable algorithms for reinforcement learning that are capable of directed exploration.
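The QD variants mentioned in the abstract combine the reward signal with the novelty signal rather than optimizing novelty alone. A rough sketch of that blending, under my own assumptions (rank-normalizing each signal separately and averaging with a weight `w`, which NSRA-ES would adapt during training):

```python
import numpy as np

def blended_scores(rewards, novelties, w=0.5):
    """NSR-ES-style scoring (sketch): normalize reward and novelty
    separately so neither dominates, then mix them with weight w.
    In the NSRA-ES variant described in the paper, w is adapted online."""
    def rank_norm(x):
        x = np.asarray(x, dtype=float)
        return (x - x.mean()) / (x.std() + 1e-8)
    return w * rank_norm(rewards) + (1.0 - w) * rank_norm(novelties)
```

With `w=1.0` this reduces to reward-only ES scoring, and with `w=0.0` to pure novelty search, which makes the family relationship among ES, NS-ES, and NSR-ES easy to see.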