Submission 180: Author Response

Neural Information Processing Systems 

We thank the reviewers for their thoughtful comments. Reviewers have described our work as "extremely important in that it provides a reality check for

Reviewers' comments have been paraphrased for brevity.

R3: It looks like the random image regularizer hurts in-domain performance.

R3: Do other VQA datasets (e.g., GQA, VCR) have the same problem?

R2: Do other datasets for OOD evaluation have similar problems like VQA-CP?