Robot Policy Learning with Temporal Optimal Transport Reward
–Neural Information Processing Systems
Reward specification is one of the most tricky problems in Reinforcement Learning, which usually requires tedious hand engineering in practice.
Neural Information Processing Systems
Feb-18-2026, 09:27:44 GMT
- Country:
- North America > Canada > Quebec > Montreal (0.04)
- Genre:
- Research Report > Experimental Study (0.93)
- Industry:
- Education (0.46)
- Technology: