Robust Navigation with Cross-Modal Fusion and Knowledge Transfer