Towards an Information Theoretic Framework of Context-Based Offline Meta-Reinforcement Learning
Lanqing Li, Hai Zhang, Xinyu Zhang, Shatong Zhu, Junqiao Zhao, Pheng-Ann Heng
–arXiv.org Artificial Intelligence
As a marriage between offline RL and meta-RL, the advent of offline meta-reinforcement learning (OMRL) has shown great promise in enabling RL agents to multi-task and quickly adapt while acquiring knowledge safely. Among these, context-based OMRL (COMRL), a popular paradigm, aims to learn a universal policy conditioned on effective task representations. In this work, by examining several key milestones in the field of COMRL, we propose to integrate these seemingly independent methodologies into a unified information theoretic framework. Most importantly, we show that the pre-existing COMRL algorithms are essentially optimizing the same mutual information objective between the task variable $\boldsymbol{M}$ and its latent representation $\boldsymbol{Z}$ by implementing various approximate bounds. Based on this theoretical insight and the information bottleneck principle, we arrive at a novel algorithm dubbed UNICORN, which exhibits remarkable generalization across a broad spectrum of RL benchmarks, context shift scenarios, data qualities and deep learning architectures, attaining the new state-of-the-art. We believe that our framework could open up avenues for new optimality bounds and COMRL algorithms.
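The abstract's central claim is that existing COMRL methods all optimize (bounds of) the mutual information between the task variable $\boldsymbol{M}$ and its latent representation $\boldsymbol{Z}$, combined here with the information bottleneck principle. As an illustrative sketch only, not the paper's exact formulation, an information-bottleneck-style objective of this kind could be written as

$$\max_{q(\boldsymbol{Z} \mid \boldsymbol{X})} \; I(\boldsymbol{Z}; \boldsymbol{M}) \;-\; \beta \, I(\boldsymbol{Z}; \boldsymbol{X}),$$

where $\boldsymbol{X}$ denotes the context data from which $\boldsymbol{Z}$ is encoded and $\beta$ trades off task-relevant information against compression; both symbols are notation assumed for illustration and do not appear in the abstract itself.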
Feb-4-2024