ReplacingRewardswithExamples: Example-Based PolicySearchviaRecursiveClassification

Neural Information Processing Systems 

Forexample,theusercouldsolvethe task using actions unavailable to the agent (e.g., the user may have two arms, but a robotic agent mayhaveonlyone) ortheusercould findsuccess examples bysearching theinternet.

Similar Docs  Excel Report  more

TitleSimilaritySource
None found