Hybrid Policy Optimization from Imperfect Demonstrations

Hanlin Yang (Sun Yat-sen University), Chao Yu

Neural Information Processing Systems 

Exploration is one of the main challenges in Reinforcement Learning (RL), especially in environments with sparse rewards.
