A Broader Impact

such shortcomings by improving the model's grounding on the vision and instruction input, and

Neural Information Processing Systems 

Towards VQA models that can read.