Review for NeurIPS paper: Dialog without Dialog Data: Learning Visual Dialog Agents from VQA Data
–Neural Information Processing Systems
Weaknesses: The main problem with the paper is the game design. In visual dialogue, i.e., the GuessWhich game [2], Q-Bot does not have access to the image. It has to build up a visual representation from the caption and the dialogue. That is why having a caption is important for the GuessWhich game (L69). In the proposed game, by contrast, Q-Bot has constant access to the images, so it only needs to ask questions that distinguish one image from another.
Feb-7-2025, 14:42:38 GMT