Language Representation in Multilingual BERT and its applications to improve Cross-lingual Generalization