Has anyone tried to mine all the types of analogies possible using word embeddings (word2vec)? • /r/MachineLearning


We know of a few types of word analogies, like "France capital Paris" and "US currency dollar", but has anyone tried to search for all the analogies that can be derived from word2vec? They would have to find modifiers that have multiple matches, i.e. "word1 modifier word2". One possible algorithm: cluster all the difference vectors (word1 - word2, over all word pairs) and select the pairs that lie close to the centers of dense clusters. Even if we don't find all the modifiers this way, we could infer more by combining with ontologies/WordNet. If we found all the types of analogy, we could build a large test dataset to benchmark how well various word embeddings represent analogies.
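The clustering idea above can be sketched roughly like this. This is a toy illustration, not a real experiment: the "embeddings" are synthetic vectors with two relations (capital-of, currency-of) baked in by construction, and DBSCAN is just one clustering choice an implementation might make; all names (`emb`, offsets, etc.) are made up for the example.

```python
# Sketch: mine candidate analogy relations by clustering difference
# vectors between word pairs. Toy synthetic embeddings stand in for
# real word2vec vectors.
import numpy as np
from itertools import combinations
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)

# Toy vocabulary with two relations planted by construction:
# capital = country + capital_offset, currency = country + currency_offset.
base = {w: rng.normal(size=8) for w in
        ["france", "japan", "us", "germany", "noise1", "noise2"]}
capital_offset = rng.normal(size=8)
currency_offset = rng.normal(size=8)
emb = dict(base)
emb.update({
    "paris":  base["france"]  + capital_offset,
    "tokyo":  base["japan"]   + capital_offset,
    "berlin": base["germany"] + capital_offset,
    "dollar": base["us"]      + currency_offset,
    "yen":    base["japan"]   + currency_offset,
})

# All pairwise difference vectors emb[a] - emb[b].
words = list(emb)
pairs = [(a, b) for a, b in combinations(words, 2)]
diffs = np.array([emb[a] - emb[b] for a, b in pairs])

# Dense clusters of difference vectors suggest a shared relation
# ("modifier"); -1 is DBSCAN's noise label.
labels = DBSCAN(eps=1.0, min_samples=2).fit_predict(diffs)
clusters = {lab: [pairs[i] for i in np.flatnonzero(labels == lab)]
            for lab in set(labels) - {-1}}
for lab, members in clusters.items():
    print(lab, members)
```

With real embeddings the offsets are only approximate, so the clusters would be fuzzier and `eps` would need tuning; the pair count is also quadratic in vocabulary size, so a real run would need sampling or approximate nearest-neighbor tricks.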
