Semantic Networks
Bilingual Distributed Word Representations from Document-Aligned Comparable Data
Vulić, Ivan, Moens, Marie-Francine
We propose a new model for learning bilingual word representations from non-parallel document-aligned data. Following the recent advances in word representation learning, our model learns dense real-valued word vectors, that is, bilingual word embeddings (BWEs). Unlike prior work on inducing BWEs, which relied heavily on parallel sentence-aligned corpora and/or readily available translation resources such as dictionaries, the article reveals that BWEs may be learned solely on the basis of document-aligned comparable data, without any additional lexical resources or syntactic information. We present a comparison of our approach with previous state-of-the-art models for learning bilingual word representations from comparable data that rely on the framework of multilingual probabilistic topic modeling (MuPTM), as well as with distributional local context-counting models. We demonstrate the utility of the induced BWEs in two semantic tasks: (1) bilingual lexicon extraction, and (2) suggesting word translations in context for polysemous words. Our simple yet effective BWE-based models significantly outperform the MuPTM-based and context-counting representation models from comparable data as well as prior BWE-based models, and achieve the best reported results on both tasks for all three tested language pairs.
Holographic Embeddings of Knowledge Graphs
Nickel, Maximilian, Rosasco, Lorenzo, Poggio, Tomaso
Learning embeddings of entities and relations is an efficient and versatile method to perform machine learning on relational data such as knowledge graphs. In this work, we propose holographic embeddings (HolE) to learn compositional vector space representations of entire knowledge graphs. The proposed method is related to holographic models of associative memory in that it employs circular correlation to create compositional representations. By using correlation as the compositional operator HolE can capture rich interactions but simultaneously remains efficient to compute, easy to train, and scalable to very large datasets. In extensive experiments we show that holographic embeddings are able to outperform state-of-the-art methods for link prediction in knowledge graphs and relational learning benchmark datasets.
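The circular correlation mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the score of a triple is the sigmoid of the dot product between the relation vector and the correlation of the two entity vectors; the function names and the toy vectors are hypothetical, and the direct O(d^2) loop is used for readability (an FFT-based version would normally be preferred for large d).

```python
import math


def circular_correlation(a, b):
    """Return c with c[k] = sum_i a[i] * b[(i + k) % d]."""
    d = len(a)
    if len(b) != d:
        raise ValueError("vectors must have the same dimension")
    return [sum(a[i] * b[(i + k) % d] for i in range(d)) for k in range(d)]


def hole_score(e_subject, relation, e_object):
    """Assumed scoring form: sigmoid(relation . (e_subject corr e_object))."""
    corr = circular_correlation(e_subject, e_object)
    x = sum(r * c for r, c in zip(relation, corr))
    return 1.0 / (1.0 + math.exp(-x))


# Toy 3-dimensional vectors (illustrative values only).
forward = circular_correlation([1, 2, 3], [1, 0, 0])
backward = circular_correlation([1, 0, 0], [1, 2, 3])
```

Because correlation is not commutative, `forward` and `backward` differ, which is what lets a model of this kind treat (subject, object) and (object, subject) differently while keeping the composed vector at the same dimension d as its inputs.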
A Review of Relational Machine Learning for Knowledge Graphs
Nickel, Maximilian, Murphy, Kevin, Tresp, Volker, Gabrilovich, Evgeniy
In this paper, we provide a review of how such statistical models can be "trained" on large knowledge graphs, and then used to predict new facts about the world (which is equivalent to predicting new edges in the graph). In particular, we discuss two fundamentally different kinds of statistical relational models, both of which can scale to massive datasets. The first is based on latent feature models such as tensor factorization and multiway neural networks. The second is based on mining observable patterns in the graph. We also show how to combine these latent and observable models to get improved modeling power at decreased computational cost. Finally, we discuss how such statistical models of graphs can be combined with text-based information extraction methods for automatically constructing knowledge graphs from the Web. To this end, we also discuss Google's Knowledge Vault project as an example of such combination.
Traversing Knowledge Graphs in Vector Space
Guu, Kelvin, Miller, John, Liang, Percy
Path queries on a knowledge graph can be used to answer compositional questions such as "What languages are spoken by people living in Lisbon?". However, knowledge graphs often have missing facts (edges), which disrupts path queries. Recent models for knowledge base completion impute missing facts by embedding knowledge graphs in vector spaces. We show that these models can be recursively applied to answer path queries, but that they suffer from cascading errors. This motivates a new "compositional" training objective, which dramatically improves all models' ability to answer path queries, in some cases more than doubling accuracy. On a standard knowledge base completion task, we also demonstrate that compositional training acts as a novel form of structural regularization, reliably improving performance across all base models (reducing errors by up to 43%) and achieving new state-of-the-art results.
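The idea of recursively applying an embedding model along a path can be sketched as follows. This is a minimal illustration that assumes a translation-style base model, in which each relation is a vector added to the current position; the entity names, relation names, and two-dimensional vectors are all hypothetical and are not taken from the paper.

```python
def traverse(start, relations):
    """Follow a path by adding each relation vector in turn."""
    vec = list(start)
    for rel in relations:
        vec = [v + r for v, r in zip(vec, rel)]
    return vec


def score(query_vec, candidate):
    """Higher is better: negative squared Euclidean distance."""
    return -sum((q - c) ** 2 for q, c in zip(query_vec, candidate))


def answer(start, relations, candidates):
    """Return the name of the candidate entity closest to the path endpoint."""
    query_vec = traverse(start, relations)
    return max(candidates, key=lambda name: score(query_vec, candidates[name]))


# Hypothetical toy embeddings for the Lisbon example.
lisbon = [0.0, 0.0]
lives_in_inverse = [1.0, 0.0]   # city -> people living there
speaks = [0.0, 1.0]             # person -> language
languages = {"portuguese": [1.0, 1.0], "french": [3.0, 1.0]}

best = answer(lisbon, [lives_in_inverse, speaks], languages)
```

In this toy setup `best` is "portuguese". With learned embeddings each hop adds some error, so the endpoint drifts further from the true answer as the path grows, which is the cascading-error problem the abstract describes.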
AutoExtend: Extending Word Embeddings to Embeddings for Synsets and Lexemes
Rothe, Sascha, Schütze, Hinrich
We present AutoExtend, a system to learn embeddings for synsets and lexemes. It is flexible in that it can take any word embeddings as input and does not need an additional training corpus. The synset/lexeme embeddings obtained live in the same vector space as the word embeddings. A sparse tensor formalization guarantees efficiency and parallelizability. We use WordNet as a lexical resource, but AutoExtend can be easily applied to other resources like Freebase. AutoExtend achieves state-of-the-art performance on word similarity and word sense disambiguation tasks.
Enriching Word Embeddings Using Knowledge Graph for Semantic Tagging in Conversational Dialog Systems
Celikyilmaz, Asli (Microsoft) | Hakkani-Tur, Dilek (Microsoft Research) | Pasupat, Panupong (Stanford University) | Sarikaya, Ruhi (Microsoft)
Unsupervised word embeddings provide rich linguistic and conceptual information about words. However, they may provide weak information about domain-specific semantic relations for certain tasks such as semantic parsing of natural language queries, where such information about words can be valuable. To encode prior knowledge about semantic word relations, we present a new method: we extend the neural-network-based lexical word embedding objective function (Mikolov et al., 2013) by incorporating information about the relationships between entities that we extract from knowledge bases. Our model can jointly learn lexical word representations from free text enriched by the relational word embeddings from relational data (e.g., Freebase) for each type of entity relation. We empirically show on the task of semantic tagging of natural language queries that our enriched embeddings can provide information about not only short-range syntactic dependencies but also long-range semantic dependencies between words. Using the enriched embeddings, we obtain an average 2% improvement in F-score compared to the previous baselines.
Learning Distributed Word Representations for Natural Logic Reasoning
Bowman, Samuel R. (Stanford University) | Potts, Christopher (Stanford University) | Manning, Christopher D. (Stanford University)
Natural logic offers a powerful relational conception of meaning that is a natural counterpart to distributed semantic representations, which have proven valuable in a wide range of sophisticated language tasks. However, it remains an open question whether it is possible to train distributed representations to support the rich, diverse logical reasoning captured by natural logic. We address this question using two neural network-based models for learning embeddings: plain neural networks and neural tensor networks. Our experiments evaluate the models' ability to learn the basic algebra of natural logic relations from simulated data and from the WordNet noun graph. The overall positive results are promising for the future of learned distributed representations in the applied modeling of logical semantics.
Sense-Aware Semantic Analysis: A Multi-Prototype Word Representation Model Using Wikipedia
Wu, Zhaohui (The Pennsylvania State University) | Giles, C. Lee (The Pennsylvania State University)
Human languages are naturally ambiguous, which makes it difficult to automatically understand the semantics of text. Most vector space models (VSM) treat all occurrences of a word as the same and build a single vector to represent the meaning of a word, which fails to capture any ambiguity. We present sense-aware semantic analysis (SaSA), a multi-prototype VSM for word representation based on Wikipedia, which could account for homonymy and polysemy. The "sense-specific" prototypes of a word are produced by clustering Wikipedia pages based on both local and global contexts of the word in Wikipedia. Experimental evaluations on semantic relatedness for both isolated words and words in sentential contexts and word sense induction demonstrate its effectiveness.
Learning Entity and Relation Embeddings for Knowledge Graph Completion
Lin, Yankai (Tsinghua University) | Liu, Zhiyuan (Tsinghua University) | Sun, Maosong (Tsinghua University) | Liu, Yang (Samsung Research and Development Institute of China) | Zhu, Xuan (Samsung Research and Development Institute of China)
Knowledge graph completion aims to perform link prediction between entities. In this paper, we consider the approach of knowledge graph embeddings. Recently, models such as TransE and TransH build entity and relation embeddings by regarding a relation as a translation from head entity to tail entity. We note that these models simply put both entities and relations within the same semantic space. In fact, an entity may have multiple aspects and various relations may focus on different aspects of entities, which makes a common space insufficient for modeling. In this paper, we propose TransR to build entity and relation embeddings in separate entity and relation spaces. Afterwards, we learn embeddings by first projecting entities from the entity space to the corresponding relation space and then building translations between projected entities. In experiments, we evaluate our models on three tasks: link prediction, triple classification, and relational fact extraction. Experimental results show significant and consistent improvements compared to state-of-the-art baselines including TransE and TransH.
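The project-then-translate step described above can be sketched in a few lines. This is a minimal illustration that assumes each relation has its own projection matrix (stored here as one row per relation-space dimension) and that a triple is scored by the squared distance between the translated head and the tail in relation space, with lower values meaning a more plausible triple. The function names and toy values are hypothetical.

```python
def project(matrix, entity):
    """Map an entity vector into a relation space (matrix-vector product)."""
    return [sum(m * e for m, e in zip(row, entity)) for row in matrix]


def transr_score(head, relation, tail, matrix):
    """Squared distance ||h_r + r - t_r||^2 in the relation space."""
    h_r = project(matrix, head)
    t_r = project(matrix, tail)
    return sum((h + r - t) ** 2 for h, r, t in zip(h_r, relation, t_r))


# Toy setup: 3-dimensional entity space, 2-dimensional relation space.
# The matrix keeps the first two entity dimensions and ignores the third,
# so this relation only "looks at" one aspect of each entity.
matrix = [[1, 0, 0],
          [0, 1, 0]]
head = [1, 2, 5]
tail = [2, 2, 9]
relation = [1, 0]

plausible = transr_score(head, relation, tail, matrix)       # exact match
implausible = transr_score(head, relation, [0, 0, 0], matrix)
```

Here `head` and `tail` differ widely in their third coordinate, yet the triple still scores as a perfect match because the projection discards that coordinate. This is the motivation given in the abstract: different relations can focus on different aspects of the same entity.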