Towards General Purpose Vision Systems

Tanmay Gupta, Amita Kamath, Aniruddha Kembhavi, Derek Hoiem

Apr-1-2021, arXiv.org Artificial Intelligence

A special-purpose learning system assumes knowledge of admissible tasks at design time. Adapting such a system to unforeseen tasks requires architecture manipulation such as adding an output head for each new task or dataset. In this work, we propose a task-agnostic vision-language system that accepts an image and a natural language task description and outputs bounding boxes, confidences, and text. The system supports a wide range of vision tasks such as classification, localization, question answering, captioning, and more. We evaluate the system's ability to learn multiple skills simultaneously, to perform tasks with novel skill-concept combinations, and to learn new skills efficiently and without forgetting.
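
The input/output contract described in the abstract can be illustrated with a small sketch. This is not the authors' code: the names `TaskOutput`, `StubSystem`, and `run_tasks` are hypothetical, the box format is an assumption, and the stub returns fixed placeholder values rather than running a model. It only shows that one call signature (image plus task description in; boxes, confidences, and text out) can serve every task without a per-task output head.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max), normalized to [0, 1].
Box = Tuple[float, float, float, float]
# Assumed image format for this sketch: a 2-D grid of pixel intensities.
Image = List[List[float]]


@dataclass
class TaskOutput:
    """The three output types named in the abstract."""
    boxes: List[Box]
    confidences: List[float]  # one score per box
    text: str


class StubSystem:
    """Hypothetical placeholder, not a trained model.

    It ignores the image content and returns a single full-image box
    with a fixed score, plus a text string that echoes the task.
    """

    def __call__(self, image: Image, task: str) -> TaskOutput:
        return TaskOutput(
            boxes=[(0.0, 0.0, 1.0, 1.0)],
            confidences=[0.5],
            text=f"(placeholder answer for: {task})",
        )


def run_tasks(system, image: Image, tasks: List[str]) -> Dict[str, TaskOutput]:
    """Run several tasks through the same interface, one call per task."""
    return {task: system(image, task) for task in tasks}
```

Under this sketch, a new task is specified only by a new task string, for example `run_tasks(StubSystem(), image, ["What is in the image?", "Locate the dog."])`. A caller would read `text` for question answering or captioning and `boxes` with `confidences` for localization.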
