AITopics | self-critical reasoning

Self-Critical Reasoning for Robust Visual Question Answering

Neural Information Processing Systems, Dec-25-2025, 05:22:47 GMT

Visual Question Answering (VQA) deep-learning systems tend to capture superficial statistical correlations in the training data because of strong language priors and fail to generalize to test data with a significantly different question-answer (QA) distribution. To address this issue, we introduce a self-critical training objective that ensures that visual explanations of correct answers match the most influential image regions more than those of other competitive answer candidates. The influential regions are either determined from human visual/textual explanations or automatically from just significant words in the question and answer. We evaluate our approach on the VQA generalization task using the VQA-CP dataset, achieving a new state of the art, i.e., 49.5% using textual explanations and 48.5% using automatically

electronic proceedings, name change, self-critical reasoning

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.62)

Reviews: Self-Critical Reasoning for Robust Visual Question Answering

Neural Information Processing Systems, Jan-22-2025, 18:56:48 GMT

Originality: The proposed work is inspired by an existing work, HINT (Selvaraju et al., arXiv 2019), which also proposes a novel training objective to align the model's gradient-based importance for various object proposals in the image with the regions identified as important by humans. This paper improves upon HINT in three ways: 1) instead of training the model to align its gradient-based importance with regions identified as important by humans, the paper trains the model to strengthen its importance for the most influential region, i.e., the proposal deemed most important by the model's gradient-based importance among the set of regions identified as most important by humans; 2) in addition to using visual regions identified as important by humans, the paper also uses textual explanations provided by humans and training QA pairs to identify important image regions; 3) the paper adds another term to the objective that criticizes incorrect predicted answers for being more sensitive to the influential region than correct answers.

Quality: The paper does a good job of evaluating the proposed approach on both the VQA-CP and VQA datasets. The evaluation of the ablations of the proposed approach and of the false sensitivity rate is also useful.

Clarity: The paper is clear for the most part except for the following: 1. Currently, in order to understand how the gradients from the proposed training objectives affect the model's parameters, one needs to read the HINT paper.
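The penalty described in the review (incorrect answers being more sensitive to the influential region than the correct answer) can be illustrated with a toy hinge-style term. This is only a sketch under stated assumptions, not the paper's exact formulation: the function name `self_critical_penalty`, the plain-list representation of sensitivities, and the unweighted sum over competitors are all hypothetical choices made for illustration.

```python
def self_critical_penalty(sensitivities, correct_idx, competitor_idxs):
    """Toy hinge-style penalty (illustrative assumption, not the paper's loss).

    sensitivities[i] is a hypothetical scalar measuring how sensitive answer
    candidate i is to the most influential image region.
    The penalty is positive only when a competing (incorrect) answer is more
    sensitive to that region than the correct answer.
    """
    s_correct = sensitivities[correct_idx]
    return sum(
        max(0.0, sensitivities[i] - s_correct)  # zero when the correct answer dominates
        for i in competitor_idxs
    )


# Hypothetical sensitivities for three answer candidates; index 0 is correct.
sens = [0.2, 0.5, 0.1]
penalty = self_critical_penalty(sens, correct_idx=0, competitor_idxs=[1, 2])

# If the correct answer is already the most sensitive, nothing is penalized.
no_penalty = self_critical_penalty([0.9, 0.5, 0.1], correct_idx=0, competitor_idxs=[1, 2])
```

In this toy setup only candidate 1 contributes to the penalty, because it is the only competitor whose sensitivity exceeds that of the correct answer.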

influential region, self-critical reasoning, training objective

Technology: Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.74)

Reviews: Self-Critical Reasoning for Robust Visual Question Answering

Neural Information Processing Systems, Jan-22-2025, 18:56:38 GMT

All reviewers recommended the submission for acceptance. Reviewers found the author response insightful, and it helped clarify many of their concerns. The approach itself introduces an interesting take on bias reduction for VQA that proves effective across a range of experimental settings.

self-critical reasoning

Technology: Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.40)

Self-Critical Reasoning for Robust Visual Question Answering

Wu, Jialin, Mooney, Raymond

Neural Information Processing Systems, Mar-19-2020, 00:03:24 GMT

Visual Question Answering (VQA) deep-learning systems tend to capture superficial statistical correlations in the training data because of strong language priors and fail to generalize to test data with a significantly different question-answer (QA) distribution. To address this issue, we introduce a self-critical training objective that ensures that visual explanations of correct answers match the most influential image regions more than those of other competitive answer candidates. The influential regions are either determined from human visual/textual explanations or automatically from just significant words in the question and answer. We evaluate our approach on the VQA generalization task using the VQA-CP dataset, achieving a new state of the art, i.e., 49.5% using textual explanations and 48.5% using automatically

Papers published at the Neural Information Processing Systems Conference.

self-critical reasoning, textual explanation

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.67)
