RUBi: Reducing Unimodal Biases for Visual Question Answering
Cadene, Remi; Dancette, Corentin; Ben-younes, Hedi; Cord, Matthieu; Parikh, Devi
Neural Information Processing Systems
Visual Question Answering (VQA) is the task of answering questions about an image. VQA models often exploit unimodal biases to give the correct answer without using the image information. As a result, their performance drops sharply when they are evaluated on data outside the training distribution. This critical issue makes them unsuitable for real-world settings. We propose RUBi, a new learning strategy to reduce biases in any VQA model.
Mar-18-2020, 20:45:33 GMT