Rephrase, Augment, Reason: Visual Grounding of Questions for Vision-Language Models