Grounded Reinforcement Learning: Learning to Win the Game under Human Commands

Neural Information Processing Systems 

From the RL perspective, it is extremely challenging to derive a precise reward function for human preferences, since the commands are abstract and the valid behaviors are highly complicated and multi-modal.
