Video Prediction Models as Rewards for Reinforcement Learning
–Neural Information Processing Systems
Specifying reward signals that allow agents to learn complex behaviors is a long-standing challenge in reinforcement learning.A promising approach is to extract preferences for behaviors from unlabeled videos, which are widely available on the internet.
Neural Information Processing Systems
Dec-26-2025, 22:19:24 GMT
- Technology: