Learning Semantic Similarity

Jaz Kandola, Nello Cristianini, John S. Shawe-Taylor

Neural Information Processing Systems 

The standard representation of text documents as bags of words suffers from well-known limitations, mostly due to its inability to exploit semantic similarity between terms. Attempts to incorporate some notion of term similarity include latent semantic indexing [8], the use of semantic networks [9], and probabilistic methods [5]. In this paper we propose two methods for inferring such similarity from a corpus. The first defines word similarity in terms of document similarity and vice versa, giving rise to a system of equations whose equilibrium point we use to obtain a semantic similarity measure. The second models semantic relations by means of a diffusion process on a graph defined by lexicon and co-occurrence information.
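
To make the diffusion idea concrete, here is a minimal sketch of one possible reading, not the formulation used in the paper. It assumes a term co-occurrence matrix G = X Xᵀ built from a toy term-by-document matrix X, and sums a truncated, geometrically decayed series of powers of G. The function names, the `decay` and `steps` parameters, and the toy corpus are all hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def diffusion_term_similarity(term_doc, decay=0.1, steps=5):
    """Return sum_{t=0..steps} decay**t * G**t, where G = X X^T.

    Assumption: G counts how often two terms share a document, and longer
    co-occurrence paths contribute with geometrically decaying weight.
    For the untruncated series to converge, decay would have to be smaller
    than 1 / (largest eigenvalue of G); the truncated sum is always finite.
    """
    g = matmul(term_doc, transpose(term_doc))
    n = len(g)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    similarity = [row[:] for row in identity]   # t = 0 term
    power = identity                            # holds decay**t * G**t
    for _ in range(steps):
        power = [[decay * v for v in row] for row in matmul(power, g)]
        similarity = [[s + p for s, p in zip(s_row, p_row)]
                      for s_row, p_row in zip(similarity, power)]
    return similarity


# Hypothetical toy corpus: rows are terms, columns are documents.
terms = ["car", "automobile", "engine", "banana"]
term_doc = [
    [1, 0, 0],  # "car" appears in document 0
    [0, 1, 0],  # "automobile" appears in document 1
    [1, 1, 0],  # "engine" appears in documents 0 and 1
    [0, 0, 1],  # "banana" appears in document 2
]

similarity = diffusion_term_similarity(term_doc)
car, automobile, banana = 0, 1, 3
print(similarity[car][automobile] > 0)   # related through "engine"
print(similarity[car][banana] == 0)      # no co-occurrence path
```

In this toy corpus "car" and "automobile" never share a document, so a plain bag-of-words comparison treats them as unrelated; the two-step path through "engine" gives them a nonzero similarity, while "banana" stays disconnected.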
