Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond

Artetxe, Mikel, Schwenk, Holger

arXiv.org Artificial Intelligence 

An increasingly popular approach to alleviate this issue is to first learn general language representations on unlabeled data, which are then integrated in task-specific downstream systems. This approach was first popularized by word embeddings (Mikolov et al., 2013b; This work was performed during an internship at Facebook AI Research. Pennington et al., 2014), but has recently been superseded by sentence-level representations (Peters et al., 2018; Devlin et al., 2019). Nevertheless, all these works learn a separate model for each language and are thus unable to leverage information across different languages, greatly limiting their potential performance for low-resource languages. In this work, we are interested in universal language agnostic sentence embeddings, that is, vector representations of sentences that are general with respect to two dimensions: the input language and the NLP task.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found