Overcoming Language Priors in Visual Question Answering with Adversarial Regularization

Sainandan Ramakrishnan, Aishwarya Agrawal, Stefan Lee

Neural Information Processing Systems 

Modern Visual Question Answering (VQA) models have been shown to rely heavily on superficial correlations between question and answer words learned during training - e.g .