A Call for More Rigor in Unsupervised Cross-lingual Learning

Artetxe, Mikel; Ruder, Sebastian; Yogatama, Dani; Labaka, Gorka; Agirre, Eneko

arXiv.org Machine Learning 

In natural language processing, the main promise of multilingual learning is to bridge the digital language divide, to enable access to information and technology for the world's 6,900 languages (Ruder et al., 2019). For the purpose of this paper, we define "multilingual learning" as learning a common model for two or more languages from raw text, without any downstream task labels. Common use cases include translation as well as pretraining multilingual representations. We will use the term interchangeably with "cross-lingual learning".

[...] work implicitly includes monolingual and cross-lingual signals that constitute a departure from the pure setting. We review existing training signals as well as other signals that may be of interest for future study (§4). We then discuss methodological issues in UCL (e.g., validation, hyperparameter tuning) and propose best evaluation practices (§5). Finally, we provide a unified outlook of established research areas (cross-lingual word embeddings, deep multilingual models and unsupervised machine translation) in UCL (§6), [...]