Reviews: Playing hard exploration games by watching YouTube

Oct-7-2024, 08:16:25 GMT–Neural Information Processing Systems

While supervision usually has to be intentionally provided by a human, the authors instead use YouTube videos as a form of supervision. They first align videos to a shared representation, using two concepts that do not require any labeling: i) predicting the temporal distance between frames in the same video, and ii) predicting the temporal distance between a video frame and an audio frame of the same video. Subsequently, they use the embedding on a novel video to generate checkpoints, which serve as intermediate rewards for an RL agent. Experiments show state-of-the-art performance among learning-from-demonstration approaches.

Strength:
- The paper is clearly written, with logical steps between the sections, and good motivations.
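
To make the checkpoint idea from the summary concrete, here is a minimal sketch of how embeddings of a demonstration video might be turned into intermediate rewards. It is not the authors' implementation: the names `make_checkpoints` and `CheckpointReward`, the cosine-similarity test, the fixed sampling interval, and the threshold and bonus values are all assumptions chosen for illustration, and the embeddings are taken as given (plain lists of floats) rather than produced by a learned network.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def make_checkpoints(demo_embeddings, interval):
    """Keep every `interval`-th demonstration embedding, skipping the first frame.

    Assumption: checkpoints are sampled at a fixed frame interval.
    """
    return demo_embeddings[interval::interval]


class CheckpointReward:
    """Pays a bonus each time the agent's embedding matches the next checkpoint.

    Assumptions: checkpoints must be reached in order, and a match means
    cosine similarity at or above `threshold`.
    """

    def __init__(self, checkpoints, threshold=0.9, bonus=1.0):
        self.checkpoints = checkpoints
        self.threshold = threshold
        self.bonus = bonus
        self.next_index = 0  # index of the checkpoint the agent must reach next

    def __call__(self, agent_embedding):
        if self.next_index >= len(self.checkpoints):
            return 0.0  # all checkpoints already collected
        target = self.checkpoints[self.next_index]
        if cosine_similarity(agent_embedding, target) >= self.threshold:
            self.next_index += 1
            return self.bonus
        return 0.0


# Toy usage with hand-made 2-D "embeddings" of a five-frame demonstration.
demo = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [-1.0, 0.0]]
checkpoints = make_checkpoints(demo, interval=2)
reward_fn = CheckpointReward(checkpoints)
rewards = [reward_fn(e) for e in ([1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [-1.0, 0.0])]
```

In this toy setup the agent is rewarded only on the steps where its embedding first lines up with the next checkpoint, which gives a sparse-reward task a denser learning signal without any hand-labeled supervision.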




Industry:
- Leisure & Entertainment > Games (0.54)

Technology:
- Information Technology
  - Communications > Social Media (1.00)
  - Artificial Intelligence > Machine Learning (0.78)