Monte Carlo Augmented Actor-Critic for Sparse Reward Deep Reinforcement Learning from Suboptimal Demonstrations
Neural Information Processing Systems
This is particularly challenging for high-dimensional control tasks, in which there may be a large number of factors that influence the agent's objective.
Feb-7-2026, 11:46:33 GMT