Hypothesis Network Planned Exploration for Rapid Meta-Reinforcement Learning Adaptation

Jacobson, Maxwell Joseph, Xue, Yexiang

arXiv.org Artificial Intelligence 

Meta Reinforcement Learning (Meta RL) trains agents that adapt to fast-changing environments and tasks. Current strategies often lose adaption efficiency due to the passive nature of model exploration, causing delayed understanding of new transition dynamics. This results in particularly fast-evolving tasks being impossible to solve. We propose a novel approach, Hypothesis Network Planned Exploration (HyPE), that integrates an active and planned exploration process via the hypothesis network to optimize adaptation speed. HyPE uses a generative hypothesis network to form potential models of state transition dynamics, then eliminates incorrect models through strategically devised experiments. Evaluated on a symbolic version of the Alchemy game, HyPE outpaces baseline methods in adaptation speed and model accuracy, validating its potential in enhancing reinforcement learning adaptation in rapidly evolving settings.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found