Diversify & Conquer: Outcome-directed Curriculum RL via Out-of-Distribution Disagreement

Neural Information Processing Systems 

D2C requires only a few examples of desired outcomes and works in any environment, regardless of its geometry or the distribution of the desired outcome examples.
