RUBi: Reducing Unimodal Biases for Visual Question Answering

Remi Cadene, Corentin Dancette, Hedi Ben-younes, Matthieu Cord, Devi Parikh

Mar-18-2020, 20:45:33 GMT–Neural Information Processing Systems

Visual Question Answering (VQA) is the task of answering questions about an image. Some VQA models exploit unimodal biases, such as statistical regularities in the questions alone, to produce the correct answer without using the image information. As a result, their performance drops sharply when they are evaluated on data outside their training set distribution. This critical issue makes them unsuitable for real-world settings. We propose RUBi, a new learning strategy to reduce biases in any VQA model.

RUBi, unimodal bias, VQA model

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Question Answering (0.64)
  - Machine Learning (0.46)