Breadcrumbs to the Goal: Goal-Conditioned Exploration from Human-in-the-Loop Feedback

Marcel Torne, Max Balsells, Zihan Wang

Neural Information Processing Systems 

This procedure can leverage noisy, asynchronous human feedback to learn policies with no hand-crafted reward design or exploration bonuses.
