Distributed representation of multi-sense words: A loss-driven approach

Apr-14-2019–arXiv.org Artificial Intelligence

Word2Vec's Skip-Gram model is the current state-of-the-art approach for estimating the distributed representation of words. However, it assumes a single vector per word, which is not well suited to representing words that have multiple senses. This work presents LDMI, a new model for estimating distributional representations of words. LDMI relies on the idea that, if a word carries multiple senses, then having a different representation for each of its senses should lead to a lower loss associated with predicting its co-occurring words than using a single vector representation for all the senses. After identifying the multi-sense words, LDMI clusters the occurrences of these words to assign a sense to each occurrence. Experiments on the contextual word similarity task show that LDMI leads to better performance than competing approaches.
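
The loss-driven intuition in the abstract can be illustrated with a toy sketch. This is not the LDMI implementation: the abstract does not specify the clustering algorithm or the exact loss, so the code below assumes plain k-means over context vectors and a simplified, positive-only log-sigmoid loss (no negative sampling). The data and the names `contexts`, `kmeans`, and `mean_loss` are hypothetical.

```python
import numpy as np

def unit(v):
    """Scale vectors to unit length so the loss comparison is not a norm effect."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def kmeans(points, k, iters=10):
    """Plain k-means with deterministic farthest-point initialisation (an assumption)."""
    centroids = [points[0]]
    for _ in range(1, k):
        # Distance of every point to its nearest already-chosen centroid.
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centroids], axis=0)
        centroids.append(points[d.argmax()])
    centroids = np.array(centroids, dtype=float)
    for _ in range(iters):
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = points[labels == j].mean(axis=0)
    return labels, centroids

def mean_loss(word_vectors, context_vectors):
    """Mean of -log(sigmoid(w . c)), the positive term of a skip-gram-style loss."""
    dots = np.sum(word_vectors * context_vectors, axis=1)
    return float(np.mean(np.log1p(np.exp(-dots))))

# Hypothetical context vectors for six occurrences of one ambiguous word:
# the first three resemble one sense, the last three another.
contexts = np.array([
    [1.0, 0.1], [0.9, 0.0], [1.1, -0.1],
    [0.0, 1.0], [0.1, 0.9], [-0.1, 1.1],
])

# Single-vector baseline: one representation shared by all occurrences.
single_vector = unit(contexts.mean(axis=0))
single_loss = mean_loss(single_vector, contexts)

# Multi-sense variant: cluster the occurrences, one representation per cluster.
labels, centroids = kmeans(contexts, k=2)
sense_vectors = unit(centroids)
multi_loss = mean_loss(sense_vectors[labels], contexts)

print("sense per occurrence:", labels.tolist())
print("single-vector loss:", single_loss)
print("multi-sense loss:  ", multi_loss)
```

On this toy data the per-sense vectors align better with their contexts, so the multi-sense loss comes out lower than the single-vector loss. A drop of this kind is the sort of signal the abstract describes for deciding that a word carries multiple senses.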

artificial intelligence, machine learning, natural language

Country:
- North America > United States (0.14)

Genre:
- Research Report > Promising Solution (0.34)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Text Processing (0.94)
  - Representation & Reasoning (0.93)
  - Machine Learning
    - Statistical Learning (1.00)
    - Neural Networks (0.67)
