New techniques in the field of Visual Question Answering part2(Machine Learning)

Jan-20-2023, 20:20:44 GMT–#artificialintelligence

Abstract: We present a new pre-training method, Multimodal Inverse Cloze Task, for Knowledge-based Visual Question Answering about named Entities (KVQAE). KVQAE is a recently introduced task that consists in answering questions about named entities grounded in a visual context using a Knowledge Base. Therefore, the interaction between the modalities is paramount to retrieve information and must be captured with complex fusion models. As these models require a lot of training data, we design this pre-training task from existing work in textual Question Answering. It consists in considering a sentence as a pseudo-question and its context as a pseudo-relevant passage and is extended by considering images near texts in multimodal documents.

machine learning, natural language, question answering, (11 more...)

#artificialintelligence

Jan-20-2023, 20:20:44 GMT

News Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning (1.00)
  - Representation & Reasoning > Expert Systems (0.59)
  - Natural Language
    - Question Answering (0.89)
    - Text Processing (0.68)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found