Monte Carlo Augmented Actor-Critic for Sparse Reward Deep Reinforcement Learning from Suboptimal Demonstrations

Neural Information Processing Systems 

This is particularly challenging for high-dimensional control tasks, in which there may be a large number of factors that influence the agent's objective.

Similar Docs  Excel Report  more

TitleSimilaritySource
None found