Semantic Visual Navigation by Watching YouTube Videos

Oct-9-2024, 20:58:55 GMT–Neural Information Processing Systems

Semantic cues and statistical regularities in real-world environment layouts can improve efficiency for navigation in novel environments. This paper learns and leverages such semantic cues for navigating to objects of interest in novel environments, by simply watching YouTube videos. This is challenging because YouTube videos don't come with labels for actions or goals, and may not even showcase optimal behavior. Our method tackles these challenges through the use of Q-learning on pseudo-labeled transition quadruples (image, action, next image, reward). We show that such off-policy Q-learning from passive data is able to learn meaningful semantic cues for navigation.
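
The Q-learning update over transition quadruples mentioned in the abstract can be sketched in tabular form. This is a minimal illustration under stated assumptions, not the authors' implementation: short string names stand in for images, the action indices and rewards are hypothetical pseudo-labels, and the function name `q_learning_from_quadruples` and its hyperparameters are invented for this example.

```python
from collections import defaultdict


def q_learning_from_quadruples(quadruples, n_actions, gamma=0.9, lr=0.5, epochs=200):
    """Tabular off-policy Q-learning over a fixed set of transitions.

    Each quadruple is (state, action, next_state, reward). No environment
    interaction happens here; the data is replayed passively.
    """
    # Unseen states default to all-zero Q-values (assumption of this sketch).
    q = defaultdict(lambda: [0.0] * n_actions)
    for _ in range(epochs):
        for state, action, next_state, reward in quadruples:
            # Bootstrapped target uses the best action at the next state,
            # regardless of what the video actually did next (off-policy).
            target = reward + gamma * max(q[next_state])
            q[state][action] += lr * (target - q[state][action])
    return q


# Hypothetical pseudo-labeled data: string names stand in for video frames.
quadruples = [
    ("hall", 0, "kitchen", 0.0),
    ("hall", 1, "bedroom", 0.0),
    ("kitchen", 0, "fridge", 1.0),  # reward when the goal object is reached
    ("bedroom", 0, "hall", 0.0),
]

q = q_learning_from_quadruples(quadruples, n_actions=2)
best_action_in_hall = max(range(2), key=lambda a: q["hall"][a])
```

In this toy data the learned values prefer the action leading from the hall toward the kitchen, even though the demonstrations also include a detour through the bedroom. This is the sense in which Q-learning can extract useful cues from behavior that is not optimal.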

novel environment, semantic visual navigation, YouTube video

Technology:
- Information Technology
  - Communications > Social Media (1.00)
  - Artificial Intelligence > Machine Learning (1.00)