Visual Reference Resolution using Attention Memory for Visual Dialog
Neural Information Processing Systems
Visual dialog is the task of answering a series of inter-dependent questions about an input image, and it often requires resolving visual references among the questions. This problem differs from visual question answering (VQA), which relies on spatial attention ({\em a.k.a.
Nov-21-2025, 15:27:29 GMT