Does computer vision matter for action?

Zhou, Brady, Krähenbühl, Philipp, Koltun, Vladlen

May-30-2019–arXiv.org Artificial Intelligence

Computer vision produces representations of scene content. Much computer vision research is predicated on the assumption that these intermediate representations are useful for action. Recent work at the intersection of machine learning and robotics calls this assumption into question by training sensorimotor systems directly for the task at hand, from pixels to actions, with no explicit intermediate representations. Thus the central question of our work: Does computer vision matter for action? We probe this question and its offshoots via immersive simulation, which allows us to conduct controlled reproducible experiments at scale. We instrument immersive three-dimensional environments to simulate challenges such as urban driving, off-road trail traversal, and battle. Our main finding is that computer vision does matter. Models equipped with intermediate representations train faster, achieve higher task performance, and generalize better to previously unseen environments. A video that summarizes the work and illustrates the results can be found at https://youtu.be/4MfWa2yZ0Jc

agent, artificial intelligence, representation, (16 more...)

arXiv.org Artificial Intelligence

May-30-2019

arXiv.org PDF

Add feedback

Country:
- North America > United States (0.28)

Genre:
- Research Report > Experimental Study (0.66)

Industry:
- Leisure & Entertainment > Games > Computer Games (0.95)

Technology:
- Information Technology > Artificial Intelligence > Vision (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found