Evaluating Visual Reasoning through Grounded Language Understanding