Overcoming Language Priors in Visual Question Answering via Distinguishing Superficially Similar Instances
