RUBi: Reducing Unimodal Biases for Visual Question Answering