Learning Syntactic Patterns for Automatic Hypernym Discovery
Snow, Rion; Jurafsky, Daniel; Ng, Andrew Y.
Neural Information Processing Systems
Semantic taxonomies such as WordNet provide a rich source of knowledge for natural language processing applications, but are expensive to build, maintain, and extend. Motivated by the problem of automatically constructing and extending such taxonomies, we present a new algorithm for automatically learning hypernym (is-a) relations from text. Our method generalizes earlier work that relied on small numbers of handcrafted regular expression patterns to identify hypernym pairs. Using "dependency path" features extracted from parse trees, we introduce a general-purpose formalization and generalization of these patterns. Given a training set of text containing known hypernym pairs, our algorithm automatically extracts useful dependency paths and applies them to new corpora to identify novel pairs. On our evaluation task (determining whether two nouns in a news article participate in a hypernym relationship), our automatically extracted database of hypernyms attains both higher precision and higher recall than WordNet.
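To make the notion of a "dependency path" feature concrete, here is a minimal sketch of extracting the path between two nouns from a dependency parse. It is an illustration under stated assumptions, not the authors' implementation: the function name `dependency_path`, the edge format, the relation labels, and the hand-written toy parse are all hypothetical, and the abstract does not specify which parser, path encoding, or classifier the paper uses.

```python
from collections import deque


def dependency_path(edges, start, end):
    """Return the shortest dependency path from token `start` to token `end`.

    `edges` is a list of (head, relation, dependent) triples over token
    indices. A step is written 'relation>' when it goes from head to
    dependent and '<relation' when it goes from dependent to head.
    Returns None if the two tokens are not connected.
    """
    # Build an undirected adjacency map that remembers edge direction.
    neighbors = {}
    for head, rel, dep in edges:
        neighbors.setdefault(head, []).append((dep, rel + ">"))
        neighbors.setdefault(dep, []).append((head, "<" + rel))

    # Breadth-first search yields a shortest path in number of edges.
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == end:
            return tuple(path)
        for nxt, step in neighbors.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [step]))
    return None


# Hypothetical hand-written parse of "authors such as Shakespeare".
# Token indices: 0=authors, 1=such, 2=as, 3=Shakespeare.
# The relation labels are illustrative, not the output of a real parser.
toy_edges = [
    (0, "prep", 2),
    (2, "mwe", 1),
    (2, "pobj", 3),
]

# Path from the candidate hypernym ("authors") to the candidate hyponym.
toy_path = dependency_path(toy_edges, 0, 3)
```

In a pipeline of the kind the abstract describes, paths like `toy_path` would be collected for noun pairs already known to be hypernyms and then used as features for recognizing new pairs. How those features are weighted or classified is not stated here.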
Dec-31-2005
- Country:
- North America
- Canada > Alberta (0.14)
- United States > California
- Santa Clara County (0.15)
- Genre:
- Research Report (0.32)