On the Emergence of Linear Analogies in Word Embeddings

Neural Information Processing Systems 

Models such as Word2Vec and GloVe construct word embeddings based on the co-occurrence probability $P(i,j)$ of words $i$ and $j$ in text corpora.
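
As a concrete illustration of the quantity $P(i,j)$, the following minimal sketch estimates co-occurrence probabilities from a toy corpus. It rests on assumptions not stated in the text: co-occurrence is counted within a symmetric context window (the `window` parameter), and counts are normalized by the total number of counted pairs. The function name `cooccurrence_probability` and the toy corpus are hypothetical. Word2Vec and GloVe each use their own weighting and sampling schemes, so this is an illustration of the idea rather than either model's actual preprocessing.

```python
from collections import Counter


def cooccurrence_probability(tokens, window=2):
    """Estimate P(i, j) as normalized pair counts within a symmetric window.

    Assumption: `window` is the maximum distance between two co-occurring
    tokens; the source text does not specify how co-occurrence is defined.
    """
    counts = Counter()
    for pos, word in enumerate(tokens):
        # Pair each token with its right-hand neighbours and record both
        # orders, so the resulting table is symmetric: P(i, j) == P(j, i).
        for other in tokens[pos + 1 : pos + 1 + window]:
            counts[(word, other)] += 1
            counts[(other, word)] += 1
    total = sum(counts.values())
    # Normalize so that the probabilities over all ordered pairs sum to 1.
    return {pair: count / total for pair, count in counts.items()}


# Hypothetical toy corpus, used only for illustration.
corpus = "the king rules the land and the queen rules the land".split()
P = cooccurrence_probability(corpus, window=2)
```

Under these assumptions `P[("king", "rules")]` equals `P[("rules", "king")]`, and the values of `P` sum to one, so `P` can be read as a joint distribution over ordered word pairs.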