Knowledge graph embedding represents the entities and relations of a knowledge graph as low-dimensional, continuous vectors, thus making knowledge graphs compatible with machine learning models. Although a variety of embedding models exist, most concentrate solely on the fact triples, while the supplementary textual descriptions of entities and relations have not been fully exploited. To this end, this paper proposes the semantic space projection (SSP) model, which jointly learns from symbolic triples and textual descriptions. Our model builds interaction between the two information sources, employing textual descriptions to discover semantic relevance and provide precise semantic embeddings. Extensive experiments show that our method achieves substantial improvements over baselines on the tasks of knowledge graph completion and entity classification.
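The projection idea in this abstract can be sketched as follows. This is only our reading of the high-level description, assuming a TransE-style translation error e = h + r - t projected onto a semantic hyperplane whose normal is composed from the two entities' description embeddings; the function name, the composition by vector addition, and the weight `lam` are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def ssp_style_score(h, r, t, s_head, s_tail, lam=0.2):
    """Illustrative semantic-projection score (a sketch, not the exact SSP model).

    e = h + r - t is the usual translation error; s is a unit vector
    composed from the head/tail description embeddings.
    """
    e = h + r - t
    s = s_head + s_tail
    s = s / np.linalg.norm(s)            # unit normal derived from textual semantics
    in_plane = e - np.dot(s, e) * s      # projection of e onto the hyperplane normal to s
    # Penalize both the in-plane error (weighted) and the overall error;
    # a perfect triple (h + r == t) scores 0, worse triples score lower.
    return -lam * np.dot(in_plane, in_plane) - np.dot(e, e)
```

A triple that satisfies the translation exactly scores 0 regardless of the descriptions; mismatched triples are pushed below 0.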
With the rise of social media, learning from informal text has become increasingly important. We present a novel semantic lexicon induction approach that is able to learn new vocabulary from social media. Our method is robust to the idiosyncrasies of informal and open-domain text corpora. Unlike previous work, it does not impose restrictions on the lexical features of candidate terms (e.g., by restricting entries to nouns or noun phrases), while still being able to accurately learn multiword phrases of variable length. Starting with a few seed terms for a semantic category, our method first explores the context around the seed terms in a corpus and identifies context patterns that are relevant to the category. These patterns are used to extract candidate terms, i.e., multiword segments that are further analyzed to ensure meaningful term boundary segmentation. We show that our approach learns high-quality semantic lexicons from informally written social media text from Twitter, achieving accuracy as high as 92% among the top 100 learned category members.
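The seed-then-pattern bootstrapping loop described above can be sketched as follows. This is a minimal single-word illustration under simplifying assumptions (context patterns are just left/right neighbor tokens); the paper's actual method additionally handles multiword segments and term-boundary analysis, which this sketch omits.

```python
from collections import Counter

def induce_lexicon(corpus, seeds, iterations=3, top_patterns_k=5, top_terms_k=5):
    """Bootstrapping sketch of pattern-based lexicon induction.

    corpus: list of token lists; seeds: initial category members.
    Alternates between (1) scoring context patterns around known members
    and (2) extracting new candidate terms that occur with the best patterns.
    """
    lexicon = set(seeds)
    for _ in range(iterations):
        # Step 1: count how often each context pattern (direction, neighbor
        # token) co-occurs with a known category member.
        pattern_counts = Counter()
        for tokens in corpus:
            for i, tok in enumerate(tokens):
                if tok in lexicon:
                    if i > 0:
                        pattern_counts[("L", tokens[i - 1])] += 1
                    if i + 1 < len(tokens):
                        pattern_counts[("R", tokens[i + 1])] += 1
        best_patterns = {p for p, _ in pattern_counts.most_common(top_patterns_k)}
        # Step 2: extract candidate terms appearing with the best patterns.
        candidate_counts = Counter()
        for tokens in corpus:
            for i, tok in enumerate(tokens):
                if tok in lexicon:
                    continue
                if i > 0 and ("L", tokens[i - 1]) in best_patterns:
                    candidate_counts[tok] += 1
                if i + 1 < len(tokens) and ("R", tokens[i + 1]) in best_patterns:
                    candidate_counts[tok] += 1
        lexicon |= {t for t, _ in candidate_counts.most_common(top_terms_k)}
    return lexicon
```

Given a seed like "paris" and sentences sharing its contexts ("we visited ___ yesterday"), the loop generalizes the context into a pattern and harvests the other terms that fill it.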
This paper describes a system, Wolfie (WOrd Learning From Interpreted Examples), that acquires a semantic lexicon from a corpus of sentences paired with semantic representations. The lexicon learned consists of words paired with meaning representations. Wolfie is part of an integrated system that learns to parse novel sentences into semantic representations, such as logical database queries. Experimental results are presented demonstrating Wolfie's ability to learn useful lexicons for a database interface in four different natural languages. The lexicons learned by Wolfie are compared to those acquired by a similar system developed by Siskind (1996).
The purpose of information extraction (IE) systems is to extract domain-specific information from natural language text. IE systems typically rely on two domain-specific resources: a dictionary of extraction patterns and a semantic lexicon. The extraction patterns may be constructed by hand or generated automatically using one of several techniques. Most systems that generate extraction patterns automatically use special training resources, such as texts annotated with domain-specific tags (e.g., AutoSlog (Riloff 1993; 1996a), CRYSTAL (Soderland et al. 1995), RAPIER (Califf 1998), SRV (Freitag 1998), WHISK (Soderland 1999)) or manually defined keywords, frames, or object recognizers (e.g., PALKA (Kim & Moldovan 1993) and LIEP (Huffman 1996)). AutoSlog-TS (Riloff 1996b) takes a different approach by using a preclassified training corpus in which texts only need to be labeled as relevant or irrelevant to the domain. Copyright ©1999, American Association for Artificial Intelligence (www.aaai.org).
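The relevant/irrelevant preclassification that AutoSlog-TS exploits lends itself to a simple pattern-ranking sketch: prefer patterns that fire often and fire mostly in relevant texts. The scoring function below (relevance rate times log2 of frequency) follows our reading of Riloff (1996b); the function name, the `min_freq` cutoff, and the example patterns are illustrative assumptions.

```python
import math

def rank_patterns(rel_counts, total_counts, min_freq=2):
    """Rank candidate extraction patterns from a corpus whose texts are
    labeled only as relevant or irrelevant (AutoSlog-TS-style sketch).

    rel_counts: pattern -> firings in relevant texts.
    total_counts: pattern -> firings in all texts.
    """
    ranked = []
    for pattern, total in total_counts.items():
        if total < min_freq:
            continue  # too rare to judge reliably
        rate = rel_counts.get(pattern, 0) / total   # relevance rate
        ranked.append((rate * math.log2(total), pattern))
    ranked.sort(reverse=True)                       # best-scoring patterns first
    return [pattern for _, pattern in ranked]
```

A pattern like "\<subj\> was kidnapped" that fires mostly in relevant (e.g., terrorism-domain) texts outranks a generic pattern like "\<subj\> said" of equal frequency.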
Every natural language processing (NLP) system has a requirement for lexical information. While there has been considerable progress in developing efficient lexical representations of morphological (Koskenniemi, 1983) and syntactic (Hellwig, 1980; Sleator and Temperley, 1991) information, attempts at constructing a wide-coverage lexicon of semantic information have met with considerable difficulty. First, it is very difficult to devise a general yet powerful semantic representation scheme; meanings are hard to pin down. Second, even if such a scheme exists, it is not easy to create such representations. (The help of Tony Molloy, Redmond O'Brien and Gemma Rysa is gratefully acknowledged.)