AITopics | Question Answering

Collaborating Authors

Question Answering

"Questions are asked and answered every day. Question answering (QA) technology aims to deliver the same facility online. It goes further than the more familiar search based on keywords (as in Google, Yahoo, and other search engines), in attempting to recognize what a question expresses and to respond with an actual answer. This simplifies things for users in two ways. First, questions do not often translate into a simple list of keywords. ...Second, QA takes responsibility for providing answers, rather than a searchable list of links to potentially relevant documents (web pages), highlighted by snippets of text that show how the query matched the documents."
– from Bonnie Webber & Nick Webb. Question Answering. In The Handbook of Computational Linguistics and Natural Language Processing. Alexander Clark, Chris Fox, Shalom Lappin (Eds.). Wiley, 2010.

News Overviews Instructional Materials AI-Alerts Classics

Learning Conditioned Graph Structures for Interpretable Visual Question Answering

Norcliffe-Brown, Will, Vafeias, Stathis, Parisot, Sarah

Neural Information Processing SystemsFeb-14-2020, 20:13:10 GMT

Visual Question answering is a challenging problem requiring a combination of concepts from Computer Vision and Natural Language Processing. Most existing approaches use a two streams strategy, computing image and question features that are consequently merged using a variety of techniques. Nonetheless, very few rely on higher level image representations, which can capture semantic and spatial relationships. In this paper, we propose a novel graph-based approach for Visual Question Answering. Our method combines a graph learner module, which learns a question specific graph representation of the input image, with the recent concept of graph convolutions, aiming to learn image representations that capture question specific interactions.

graph learner module, learning conditioned graph structure, representation, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.91)

Add feedback

Learning to Specialize with Knowledge Distillation for Visual Question Answering

Mun, Jonghwan, Lee, Kimin, Shin, Jinwoo, Han, Bohyung

Neural Information Processing SystemsFeb-14-2020, 20:11:49 GMT

Visual Question Answering (VQA) is a notoriously challenging problem because it involves various heterogeneous tasks defined by questions within a unified framework. Learning specialized models for individual types of tasks is intuitively attracting but surprisingly difficult; it is not straightforward to outperform naive independent ensemble approach. We present a principled algorithm to learn specialized models with knowledge distillation under a multiple choice learning (MCL) framework, where training examples are assigned dynamically to a subset of models for updating network parameters. The assigned and non-assigned models are learned to predict ground-truth answers and imitate their own base models before specialization, respectively. Our approach alleviates the limitation of data deficiency in existing MCL frameworks, and allows each model to learn its own specialized expertise without forgetting general knowledge.

image classification, knowledge distillation, learning

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.63)

Add feedback

Visual Question Answering with Question Representation Update (QRU)

Li, Ruiyu, Jia, Jiaya

Neural Information Processing SystemsFeb-14-2020, 16:26:17 GMT

Our method aims at reasoning over natural language questions and visual images. Given a natural language question about an image, our model updates the question representation iteratively by selecting image regions relevant to the query and learns to give the correct answer. Our model contains several reasoning layers, exploiting complex visual relations in the visual question answering (VQA) task. The proposed network is end-to-end trainable through back-propagation, where its weights are initialized using pre-trained convolutional neural network (CNN) and gated recurrent unit (GRU). Our method is evaluated on challenging datasets of COCO-QA and VQA and yields state-of-the-art performance.

natural language question, qru, question representation update

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.68)

Add feedback

High-Order Attention Models for Visual Question Answering

Schwartz, Idan, Schwing, Alexander, Hazan, Tamir

Neural Information Processing SystemsFeb-14-2020, 13:41:59 GMT

The quest for algorithms that enable cognitive abilities is an important part of machine learning. A common trait in many recently investigated cognitive-like tasks is that they take into account different data modalities, such as visual and textual input. In this paper we propose a novel and generally applicable form of attention mechanism that learns high-order correlations between various data modalities. We show that high-order correlations effectively direct the appropriate attention to the relevant elements in the different data modalities that are required to solve the joint task. We demonstrate the effectiveness of our high-order attention mechanism on the task of visual question answering (VQA), where we achieve state-of-the-art performance on the standard VQA dataset.

data modality, different data modality, high-order attention model, (1 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.67)
Information Technology > Artificial Intelligence > Machine Learning (0.54)

Add feedback

Exploring Models and Data for Image Question Answering

Ren, Mengye, Kiros, Ryan, Zemel, Richard

Neural Information Processing SystemsFeb-14-2020, 12:44:00 GMT

This work aims to address the problem of image-based question-answering (QA) with new models and datasets. In our work, we propose to use neural networks and visual semantic embeddings, without intermediate stages such as object detection and image segmentation, to predict answers to simple questions about images. Our model performs 1.8 times better than the only published results on an existing image QA dataset. We also present a question generation algorithm that converts image descriptions, which are widely available, into QA form. We used this algorithm to produce an order-of-magnitude larger dataset, with more evenly distributed answers.

algorithm, dataset, exploring model and data

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)

Add feedback

Dialog-to-Action: Conversational Question Answering Over a Large-Scale Knowledge Base

Guo, Daya, Tang, Duyu, Duan, Nan, Zhou, Ming, Yin, Jian

Neural Information Processing SystemsFeb-14-2020, 11:56:36 GMT

We present an approach to map utterances in conversation to logical forms, which will be executed on a large-scale knowledge base. To handle enormous ellipsis phenomena in conversation, we introduce dialog memory management to manipulate historical entities, predicates, and logical forms when inferring the logical form of current utterances. Dialog memory management is embodied in a generative model, in which a logical form is interpreted in a top-down manner following a small and flexible grammar. We learn the model from denotations without explicit annotation of logical forms, and evaluate it on a large-scale dataset consisting of 200K dialogs over 12.8M entities. Results verify the benefits of modeling dialog memory, and show that our semantic parsing-based approach outperforms a memory network based encoder-decoder model by a huge margin.

dialog-to-action, large-scale knowledge base, logical form, (2 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.66)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.66)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.40)

Add feedback

Out of the Box: Reasoning with Graph Convolution Nets for Factual Visual Question Answering

Narasimhan, Medhini, Lazebnik, Svetlana, Schwing, Alexander

Neural Information Processing SystemsFeb-14-2020, 11:12:19 GMT

Accurately answering a question about a given image requires combining observations with general knowledge. While this is effortless for humans, reasoning with general knowledge remains an algorithmic challenge. To advance research in this direction a novel fact-based' visual question answering (FVQA) task has been introduced recently along with a large set of curated facts which link two entities, i.e., two possible answers, via a relation. Given a question-image pair, deep network techniques have been employed to successively reduce the large set of facts until one of the two entities of the final remaining fact is predicted as the answer. We observe that a successive process which considers one fact at a time to form a local decision is sub-optimal.

general knowledge, graph convolution net, reasoning

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.65)

Add feedback

Overcoming Language Priors in Visual Question Answering with Adversarial Regularization

Ramakrishnan, Sainandan, Agrawal, Aishwarya, Lee, Stefan

Neural Information Processing SystemsFeb-14-2020, 08:13:50 GMT

Modern Visual Question Answering (VQA) models have been shown to rely heavily on superficial correlations between question and answer words learned during training -- \eg overwhelmingly reporting the type of room as kitchen or the sport being played as tennis, irrespective of the image. Most alarmingly, this shortcoming is often not well reflected during evaluation because the same strong priors exist in test distributions; however, a VQA system that fails to ground questions in image content would likely perform poorly in real-world settings. In this work, we present a novel regularization scheme for VQA that reduces this effect. We introduce a question-only model that takes as input the question encoding from the VQA model and must leverage language biases in order to succeed. We then pose training as an adversarial game between the VQA model and this question-only adversary -- discouraging the VQA model from capturing language biases in its question encoding.Further, we leverage this question-only model to estimate the mutual information between the image and answer given the question, which we maximize explicitly to encourage visual grounding.

adversarial regularization, question-only model, vqa model, (1 more...)

Neural Information Processing Systems

Industry: Leisure & Entertainment (0.61)

Technology: Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.64)

Add feedback

A Multi-World Approach to Question Answering about Real-World Scenes based on Uncertain Input

Malinowski, Mateusz, Fritz, Mario

Neural Information Processing SystemsFeb-14-2020, 08:13:36 GMT

We propose a method for automatically answering questions about images by bringing together recent advances from natural language processing and computer vision. We combine discrete reasoning with uncertain predictions by a multi-world approach that represents uncertainty about the perceived world in a bayesian framework. Our approach can handle human questions of high complexity about realistic scenes and replies with range of answer like counts, object classes, instances and lists of them. The system is directly trained from question-answer pairs. We establish a first benchmark for this task that can be seen as a modern attempt at a visual turing test.

multi-world approach, real-world scene, uncertain input

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.40)

Add feedback

Hierarchical Question-Image Co-Attention for Visual Question Answering

Lu, Jiasen, Yang, Jianwei, Batra, Dhruv, Parikh, Devi

Neural Information Processing SystemsFeb-14-2020, 05:26:31 GMT

A number of recent works have proposed attention models for Visual Question Answering (VQA) that generate spatial maps highlighting image regions relevant to answering the question. In this paper, we argue that in addition to modeling "where to look" or visual attention, it is equally important to model "what words to listen to" or question attention. We present a novel co-attention model for VQA that jointly reasons about image and question attention. Our model improves the state-of-the-art on the VQA dataset from 60.3% to 60.5%, and from 61.6% to 63.3% on the COCO-QA dataset. By using ResNet, the performance is further improved to 62.1% for VQA and 65.4% for COCO-QA.

dataset, hierarchical question-image co-attention, question attention

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.66)
Information Technology > Artificial Intelligence > Machine Learning (0.50)

Add feedback