Deconstructing word embedding algorithms
Kenyon-Dean, Kian; Newell, Edward; Cheung, Jackie Chi Kit
arXiv.org Artificial Intelligence
In general topology, an embedding is understood as an injective structure preserving map, f: X → Y, between two mathematical structures X and Y. A word embedding algorithm (f) learns an inner-product space (Y) to preserve a linguistic structure within a reference corpus of text, D (X), based on a vocabulary, V. The structure in D is analyzed in terms of the relationships between words induced by their co-appearances, according to a certain definition of context. In such an analysis, each word figures dually: (1) as a focal element

The advent of efficient uncontextualized word embedding algorithms (e.g., Word2vec (Mikolov et al., 2013) and GloVe (Pennington et al., 2014)) marked a historical breakthrough in NLP. Countless researchers employed word embeddings in new models to improve results on a multitude of NLP problems. In this work, we provide a retrospective analysis of these groundbreaking models of the past, which simultaneously offers theoretical insights for how future models can be developed and understood.
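To make the notion of structure "induced by co-appearances" concrete, the following is a minimal sketch of how co-appearance counts might be gathered from a corpus D. It assumes one particular definition of context, a symmetric window of fixed size within a sentence, which is an illustrative choice and not necessarily the one the authors use. The function name `cooccurrence_counts`, the `window` parameter, and the toy corpus are all hypothetical.

```python
from collections import Counter


def cooccurrence_counts(corpus, window=2):
    """Count ordered (word, neighbor) pairs within a symmetric window.

    `corpus` is an iterable of tokenized sentences (lists of strings).
    The symmetric-window definition of context is an assumption made
    for this sketch.
    """
    counts = Counter()
    for sentence in corpus:
        for i, word in enumerate(sentence):
            lo = max(0, i - window)
            hi = min(len(sentence), i + window + 1)
            for j in range(lo, hi):
                if j != i:  # a token is not its own neighbor
                    counts[(word, sentence[j])] += 1
    return counts


# Toy corpus, invented for illustration only.
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
counts = cooccurrence_counts(corpus, window=1)
```

A table of this kind is one possible stand-in for the structure in D that an embedding algorithm would then try to preserve through inner products; other context definitions (for example, weighting by distance) would change the counts.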
Nov-12-2020