Self-Paced Deep Reinforcement Learning

Neural Information Processing Systems 

In contrast, we propose to generate the curriculum based on a principled inference view of RL. Our approach generates the curriculum from two quantities: the value function of the agent and the KL divergence to a target distribution of tasks.
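
To make the interplay of these two quantities concrete, here is a minimal sketch under simplifying assumptions that are not taken from the paper: a small discrete set of tasks, a fixed value estimate per task, and a single trade-off weight `alpha`. It picks the task distribution p that maximizes the expected value minus `alpha` times KL(p || target), which for a discrete task set has the closed form p_i proportional to target_i * exp(V_i / alpha). The function names, the example numbers, and the discrete setting are all hypothetical and serve only as an illustration.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def curriculum_distribution(values, target, alpha):
    """Task distribution maximizing E_p[V] - alpha * KL(p || target).

    Assumes every entry of `target` is strictly positive and alpha > 0.
    The maximizer is p_i proportional to target_i * exp(values_i / alpha).
    """
    logits = [math.log(m) + v / alpha for v, m in zip(values, target)]
    top = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(l - top) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical numbers: the agent currently does well on task 0,
# while the target distribution puts most of its mass on task 2.
values = [1.0, 0.2, -0.5]
target = [0.1, 0.3, 0.6]

for alpha in (0.1, 1.0, 100.0):
    p = curriculum_distribution(values, target, alpha)
    print(alpha, [round(x, 3) for x in p], round(kl_divergence(p, target), 4))
```

With a small `alpha` the sampled tasks concentrate where the agent's value is highest, and with a large `alpha` the distribution moves toward the target. How such a weight would be scheduled during training is left open in this sketch.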
