Language-guided Navigation via Cross-Modal Grounding and Alternate Adversarial Learning