AITopics | question-only adversary

Neural Information Processing Systems http://nips.cc/

question-only adversary, question-only model, vqa model, (14 more...)

Neural Information Processing Systems

Country: North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.72)

Add feedback

Overcoming Language Priors in Visual Question Answering with Adversarial Regularization

Sainandan Ramakrishnan, Aishwarya Agrawal, Stefan Lee

Neural Information Processing SystemsNov-20-2025, 17:03:01 GMT

Modern Visual Question Answering (VQA) models have been shown to rely heavily on superficial correlations between question and answer words learned during training - e.g .

machine learning, natural language, question answering, (18 more...)

Neural Information Processing Systems

Country: North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.72)

Add feedback

Reviews: Overcoming Language Priors in Visual Question Answering with Adversarial Regularization

Neural Information Processing SystemsOct-7-2024, 11:47:50 GMT

This paper studies the problem of handling the langauge/text pariors in the task visual question answering. The great performance achieved by many state-of-the-art VQA systems are accomplished by heavily learning a better question encoding to better capture the correlations between the questions and answers, but ignore the image information. So the problem is important to the VQA research community. In general, the paper is well-written and easy to follow. And some concerns and sugggestions can be found as the following: 1) The major concern is the basic intuition of the question-only adversary: The question encoding q_i from the question encoder is not necessarily the same bias that lead the VQA model f to ignore the visual content. Since f can be a deep neutral network, for example, deep RNN or deep RNN-CNN to leverage both the question embedding and visual embedding, thus the non-linearity in f would make the question embedding as a image-aware represention to generate the answer distribution.

adversarial regularization, information, question-only model, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.62)

Add feedback

Overcoming Language Priors in Visual Question Answering with Adversarial Regularization

Ramakrishnan, Sainandan, Agrawal, Aishwarya, Lee, Stefan

Neural Information Processing SystemsDec-31-2018

Modern Visual Question Answering (VQA) models have been shown to rely heavily on superficial correlations between question and answer words learned during training -- \eg overwhelmingly reporting the type of room as kitchen or the sport being played as tennis, irrespective of the image. Most alarmingly, this shortcoming is often not well reflected during evaluation because the same strong priors exist in test distributions; however, a VQA system that fails to ground questions in image content would likely perform poorly in real-world settings. In this work, we present a novel regularization scheme for VQA that reduces this effect. We introduce a question-only model that takes as input the question encoding from the VQA model and must leverage language biases in order to succeed. We then pose training as an adversarial game between the VQA model and this question-only adversary -- discouraging the VQA model from capturing language biases in its question encoding.Further, we leverage this question-only model to estimate the mutual information between the image and answer given the question, which we maximize explicitly to encourage visual grounding. Our approach is a model agnostic training procedure and simple to implement. We show empirically that it can improve performance significantly on a bias-sensitive split of the VQA dataset for multiple base models -- achieving state-of-the-art on this task. Further, on standard VQA tasks, our approach shows significantly less drop in accuracy compared to existing bias-reducing VQA models.

machine learning, natural language, question answering, (18 more...)

Neural Information Processing Systems

Country: North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Sports (0.48)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.72)

Add feedback

Overcoming Language Priors in Visual Question Answering with Adversarial Regularization

Ramakrishnan, Sainandan, Agrawal, Aishwarya, Lee, Stefan

Neural Information Processing SystemsDec-31-2018

Modern Visual Question Answering (VQA) models have been shown to rely heavily on superficial correlations between question and answer words learned during training -- \eg overwhelmingly reporting the type of room as kitchen or the sport being played as tennis, irrespective of the image. Most alarmingly, this shortcoming is often not well reflected during evaluation because the same strong priors exist in test distributions; however, a VQA system that fails to ground questions in image content would likely perform poorly in real-world settings. In this work, we present a novel regularization scheme for VQA that reduces this effect. We introduce a question-only model that takes as input the question encoding from the VQA model and must leverage language biases in order to succeed. We then pose training as an adversarial game between the VQA model and this question-only adversary -- discouraging the VQA model from capturing language biases in its question encoding.Further, we leverage this question-only model to estimate the mutual information between the image and answer given the question, which we maximize explicitly to encourage visual grounding. Our approach is a model agnostic training procedure and simple to implement. We show empirically that it can improve performance significantly on a bias-sensitive split of the VQA dataset for multiple base models -- achieving state-of-the-art on this task. Further, on standard VQA tasks, our approach shows significantly less drop in accuracy compared to existing bias-reducing VQA models.

machine learning, natural language, question answering, (18 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Sports (0.48)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.72)

Add feedback

Filters

Collaborating Authors

question-only adversary

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Overcoming Language Priors in Visual Question Answering with Adversarial Regularization

Overcoming Language Priors in Visual Question Answering with Adversarial Regularization

Reviews: Overcoming Language Priors in Visual Question Answering with Adversarial Regularization

Overcoming Language Priors in Visual Question Answering with Adversarial Regularization

Overcoming Language Priors in Visual Question Answering with Adversarial Regularization