Hybrid Policy Optimization from Imperfect Demonstrations
Hanlin Yang, Sun Yat-sen University; Chao Yu
Neural Information Processing Systems
Exploration is one of the main challenges in Reinforcement Learning (RL), especially in environments with sparse rewards.
Oct-8-2025, 03:08:39 GMT