Unsupervised Alignment of Distributional Word Embeddings

Diallo, Aissatou, Fürnkranz, Johannes

arXiv.org Artificial Intelligence 

Cross-domain alignment plays a key role in tasks ranging from image-text retrieval to machine translation. The main objective is to associate related entities across different domains. Recently, purely unsupervised methods operating on monolingual embeddings have been used successfully to infer a bilingual lexicon without any cross-lingual supervision. However, current state-of-the-art methods focus only on point vectors, although distributional embeddings have been shown to capture richer semantic information when representing words. This paper investigates a novel stochastic optimization approach for aligning distributional word embeddings. Our method builds upon techniques from optimal transport to solve the cross-domain matching problem in a principled manner. We evaluate our method on unsupervised word translation by aligning word embeddings trained on monolingual data. We present empirical evidence for the validity of our approach on the bilingual lexicon induction task across several language pairs.
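
To make the matching idea concrete, the following is a minimal sketch of optimal-transport matching between two sets of distributional embeddings. It is not the method of the paper, whose details are not given in the abstract. It assumes that each word is a Gaussian with diagonal covariance, that both vocabularies already live in the same coordinate system (so no rotation has to be learned), and that entropic regularization with Sinkhorn iterations is an acceptable stand-in for the transport solver. The names `w2_squared_diag` and `sinkhorn`, the regularization value, and the toy data are all hypothetical.

```python
import numpy as np

def w2_squared_diag(mu_x, sd_x, mu_y, sd_y):
    """Pairwise squared 2-Wasserstein distance between diagonal Gaussians.

    For diagonal covariances this reduces to the squared distance between
    the means plus the squared distance between the standard deviations.
    """
    d_mu = ((mu_x[:, None, :] - mu_y[None, :, :]) ** 2).sum(-1)
    d_sd = ((sd_x[:, None, :] - sd_y[None, :, :]) ** 2).sum(-1)
    return d_mu + d_sd

def sinkhorn(cost, reg=0.01, n_iter=500):
    """Entropic optimal transport plan between two uniform distributions."""
    n, m = cost.shape
    a = np.full(n, 1.0 / n)          # uniform mass on source words
    b = np.full(m, 1.0 / m)          # uniform mass on target words
    K = np.exp(-cost / reg)          # Gibbs kernel
    u = np.ones(n)
    for _ in range(n_iter):
        v = b / (K.T @ u)            # rescale to match column marginals
        u = a / (K @ v)              # rescale to match row marginals
    return u[:, None] * K * v[None, :]

# Toy data (assumption): the "target language" is a shuffled copy of the source.
rng = np.random.default_rng(0)
n, d = 20, 5
mu_src = rng.normal(size=(n, d))
sd_src = rng.uniform(0.5, 1.5, size=(n, d))
perm = rng.permutation(n)
mu_tgt, sd_tgt = mu_src[perm], sd_src[perm]

cost = w2_squared_diag(mu_src, sd_src, mu_tgt, sd_tgt)
cost = cost / cost.max()             # normalize so that reg has a stable scale
plan = sinkhorn(cost)

matches = plan.argmax(axis=1)        # most likely target index per source word
expected = np.argsort(perm)          # inverse of the shuffle
```

In a real bilingual setting the two embedding spaces are not aligned in advance, so a transport step like this would have to be combined with learning a mapping between the spaces. The sketch only illustrates how a transport plan turns a distance between word distributions into a soft lexicon.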
