On the Evaluation of Vision-and-Language Navigation Instructions