Robot Policy Learning with Temporal Optimal Transport Reward

Neural Information Processing Systems 

Reward specification is one of the trickiest problems in reinforcement learning, and in practice it usually requires tedious hand engineering.
