Hyper-Meta Reinforcement Learning with Sparse Reward

Hua, Yun, Wang, Xiangfeng, Jin, Bo, Li, Wenhao, Yan, Junchi, He, Xiaofeng, Zha, Hongyuan

Feb-11-2020, arXiv.org Artificial Intelligence

Despite their success, existing meta reinforcement learning methods still have difficulty learning a meta policy effectively for RL problems with sparse reward. To address this, we develop a novel meta reinforcement learning framework, Hyper-Meta RL (HMRL), for sparse reward RL problems. It consists of three modules: meta state embedding, meta reward shaping, and meta policy learning. The cross-environment meta state embedding module constructs a common meta state space to adapt to different environments. The environment-specific meta reward shaping module, which operates on the meta state, effectively extends the original sparse reward trajectory through cross-environmental knowledge complementarity. As a consequence, the meta policy achieves better generalization and efficiency with the shaped meta reward. Experiments with sparse reward show the superiority of HMRL in both transferability and policy learning efficiency.
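
The abstract names the modules but gives no formulas, so the following is only a minimal sketch of how a shared meta state space and a shaped reward could fit together. Everything in it is an assumption for illustration: the function names, the random linear embeddings, the norm-based potential, and the use of potential-based shaping (a standard technique, swapped in here because the abstract does not state HMRL's actual shaping rule). In the paper these components are presumably learned rather than fixed.

```python
import random

def make_embedding(state_dim, meta_dim, seed):
    """Hypothetical per-environment linear map into a shared meta state space."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(state_dim)]
            for _ in range(meta_dim)]

def embed(weights, state):
    """Project a raw environment state to a meta state (matrix-vector product)."""
    return [sum(w * x for w, x in zip(row, state)) for row in weights]

def potential(meta_state):
    """Assumed potential over meta states (placeholder: negative L2 norm)."""
    return -(sum(v * v for v in meta_state) ** 0.5)

def shaped_reward(reward, meta_s, meta_s_next, gamma=0.99):
    """Potential-based shaping: r + gamma * phi(s') - phi(s)."""
    return reward + gamma * potential(meta_s_next) - potential(meta_s)

# Two environments with different raw state sizes share one 3-dim meta state space.
emb_a = make_embedding(state_dim=4, meta_dim=3, seed=0)
emb_b = make_embedding(state_dim=6, meta_dim=3, seed=1)

# A transition in environment A whose original (sparse) reward is zero.
s, s_next = [1.0, 0.0, 0.0, 0.0], [0.5, 0.0, 0.0, 0.0]
m, m_next = embed(emb_a, s), embed(emb_a, s_next)
dense_r = shaped_reward(0.0, m, m_next)
```

Because both environments map into the same meta state space, a single potential (and a single policy) can be defined over meta states and reused across them; the shaped reward gives a learning signal on transitions where the original reward is zero.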

meta reward, meta state, sparse reward


Country:
- North America > United States (0.04)
- Asia > China
  - Shanghai > Shanghai (0.05)

Genre:
- Research Report (0.64)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
