A non-NLP application of Word2Vec – Towards Data Science – Medium
The above is exactly what Word2Vec seeks to do: it tries to determine the meaning of a word by analyzing its neighboring words (also called its context). The algorithm comes in two flavors: CBOW and Skip-Gram. Given a set of sentences (also called a corpus), the model loops over the words of each sentence and either uses the current word to predict its neighbors (its context), in which case the method is called "Skip-Gram", or uses each of these contexts to predict the current word, in which case the method is called "Continuous Bag Of Words" (CBOW). The limit on the number of words in each context is determined by a parameter called "window size". If we choose the Skip-Gram method, for example, Word2Vec uses a shallow neural network, i.e. a neural network with only one hidden layer, to learn the word embeddings. The network first initializes its weights randomly, then iteratively adapts them during training to minimize the error it makes when using words to predict their contexts.
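To make the difference between the two flavors concrete, here is a minimal sketch in plain Python of how the training examples could be built from a tokenized sentence. It covers only the pairing step, not the neural network. The function names `skipgram_pairs` and `cbow_pairs` and the toy corpus are hypothetical, and the sketch assumes that "window size" counts the words taken on each side of the current word, which is one common convention.

```python
def context_indices(length, i, window_size):
    """Indices of the neighbors of position i, at most window_size on each side.

    Assumption: the window is symmetric and counted per side.
    """
    start = max(0, i - window_size)
    end = min(length, i + window_size + 1)
    return [j for j in range(start, end) if j != i]


def skipgram_pairs(sentence, window_size):
    """Skip-Gram: one (current_word, context_word) pair per neighbor."""
    pairs = []
    for i, word in enumerate(sentence):
        for j in context_indices(len(sentence), i, window_size):
            pairs.append((word, sentence[j]))
    return pairs


def cbow_pairs(sentence, window_size):
    """CBOW: one (context_words, current_word) pair per position."""
    pairs = []
    for i, word in enumerate(sentence):
        context = [sentence[j] for j in context_indices(len(sentence), i, window_size)]
        pairs.append((context, word))
    return pairs


# Toy corpus made up for illustration: a list of tokenized sentences.
corpus = [["the", "cat", "sat", "down"]]

for sentence in corpus:
    print(skipgram_pairs(sentence, window_size=1))
    print(cbow_pairs(sentence, window_size=1))
```

In the Skip-Gram case the first element of each pair is the network's input and the second is the target to predict. In the CBOW case the roles are reversed: the whole context is the input and the current word is the target.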
Aug-1-2017, 03:24:15 GMT