A survey on VQA: Datasets and Approaches

May-2-2021–arXiv.org Artificial Intelligence

Visual question answering (VQA) is a task that combines techniques from computer vision and natural language processing. It requires models to answer a text-based question using the information contained in a visual input. In recent years, the field of VQA research has expanded. Work that examines the reasoning ability of VQA models, and work on VQA over scientific diagrams, has received growing attention. Meanwhile, more multimodal feature fusion mechanisms have been proposed. This paper reviews and analyzes existing datasets, metrics, and models proposed for the VQA task.
