Learning Multilingual Word Embeddings in Latent Metric Space: A Geometric Approach

Jawanpuria, Pratik, Balgovind, Arjun, Kunchukuttan, Anoop, Mishra, Bamdev

arXiv.org Artificial Intelligence 

We propose a novel geometric approach for learning bilingual mappings given monolingual embeddings and a bilingual dictionary. Our approach decouples learning the transformation from the source language to the target language into (a) learning rotations for language-specific embeddings to align them to a common space, and (b) learning a similarity metric in the common space to model similarities between the embeddings. We model the bilingual mapping problem as an optimization problem on smooth Riemannian manifolds. We show that our approach outperforms previous approaches on the bilingual lexicon induction and cross-lingual word similarity tasks. We also generalize our framework to represent multiple languages in a common latent space. In particular, the latent space representations for several languages are learned jointly, given bilingual dictionaries for multiple language pairs. We illustrate the effectiveness of joint learning for multiple languages in zero-shot word translation setting.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found