Goto

Collaborating Authors

 question representation update


Visual Question Answering with Question Representation Update (QRU)

Neural Information Processing Systems

Our method aims at reasoning over natural language questions and visual images. Given a natural language question about an image, our model updates the question representation iteratively by selecting image regions relevant to the query and learns to give the correct answer. Our model contains several reasoning layers, exploiting complex visual relations in the visual question answering (VQA) task. The proposed network is end-to-end trainable through back-propagation, where its weights are initialized using pre-trained convolutional neural network (CNN) and gated recurrent unit (GRU). Our method is evaluated on challenging datasets of COCO-QA and VQA and yields state-of-the-art performance.



Reviews: Visual Question Answering with Question Representation Update (QRU)

Neural Information Processing Systems

Strength: The technical contributions are a clever and simple extension/combination of existing ideas such as "Neural Reasoner" [B. Show, attend and tell: 307 Neural image caption generation with visual attention. The paper is well-written and easy to follow, especially the architecture of the model and the explanations for it are modular and simple (image understanding layer, question encoding layer, reasoning layer, and answering layer). Haven't yet encountered a VQA system that changes the question representation based on image. This novelty adds strength to this paper.


Visual Question Answering with Question Representation Update (QRU)

Neural Information Processing Systems

Our method aims at reasoning over natural language questions and visual images. Given a natural language question about an image, our model updates the question representation iteratively by selecting image regions relevant to the query and learns to give the correct answer. Our model contains several reasoning layers, exploiting complex visual relations in the visual question answering (VQA) task. The proposed network is end-to-end trainable through back-propagation, where its weights are initialized using pre-trained convolutional neural network (CNN) and gated recurrent unit (GRU). Our method is evaluated on challenging datasets of COCO-QA [19] and VQA [2] and yields state-of-the-art performance.


Visual Question Answering with Question Representation Update (QRU)

Li, Ruiyu, Jia, Jiaya

Neural Information Processing Systems

Our method aims at reasoning over natural language questions and visual images. Given a natural language question about an image, our model updates the question representation iteratively by selecting image regions relevant to the query and learns to give the correct answer. Our model contains several reasoning layers, exploiting complex visual relations in the visual question answering (VQA) task. The proposed network is end-to-end trainable through back-propagation, where its weights are initialized using pre-trained convolutional neural network (CNN) and gated recurrent unit (GRU). Our method is evaluated on challenging datasets of COCO-QA and VQA and yields state-of-the-art performance.


Visual Question Answering with Question Representation Update (QRU)

Li, Ruiyu, Jia, Jiaya

Neural Information Processing Systems

Our method aims at reasoning over natural language questions and visual images. Given a natural language question about an image, our model updates the question representation iteratively by selecting image regions relevant to the query and learns to give the correct answer. Our model contains several reasoning layers, exploiting complex visual relations in the visual question answering (VQA) task. The proposed network is end-to-end trainable through back-propagation, where its weights are initialized using pre-trained convolutional neural network (CNN) and gated recurrent unit (GRU). Our method is evaluated on challenging datasets of COCO-QA and VQA and yields state-of-the-art performance.