The University of Liverpool
Using k-Way Co-Occurrences for Learning Word Embeddings
Bollegala, Danushka (The University of Liverpool) | Yoshida, Yuichi (National Institute of Informatics) | Kawarabayashi, Ken-ichi (National Institute of Informatics)
Co-occurrences between two words provide useful insights into the semantics of those words. Consequently, numerous prior works on word embedding learning have used co-occurrences between two words as the training signal for learning word embeddings. However, in natural language texts it is common for multiple words to be related and to co-occur in the same context. We extend the notion of co-occurrences to cover k (≥2)-way co-occurrences among a set of k words. Specifically, we prove a theoretical relationship between the joint probability of k (≥2) words and the sum of ℓ2 norms of their embeddings. Next, we propose a learning objective motivated by our theoretical result that utilises k-way co-occurrences for learning word embeddings. Our experimental results show that the derived theoretical relationship does indeed hold empirically, and that, despite data sparsity, for some smaller values of k (≤5), k-way embeddings perform comparably to, or better than, 2-way embeddings in a range of tasks.
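To make the described objective concrete, here is a minimal sketch of counting k-way co-occurrences and scoring a weighted least-squares loss of the kind the abstract suggests. The helper names (count_kway, kway_loss), the GloVe-style weighting, and the exact functional form tying log counts to the sum of ℓ2 norms are assumptions for illustration, not the paper's reference implementation.

```python
from collections import Counter
from itertools import combinations

import numpy as np

def count_kway(corpus, k=3, window=5):
    """Count k-way co-occurrences: every k-subset of distinct words
    falling inside the same sliding context window."""
    counts = Counter()
    for sentence in corpus:                      # corpus: list of token lists
        for start in range(len(sentence)):
            context = set(sentence[start:start + window])
            for group in combinations(sorted(context), k):
                counts[group] += 1
    return counts

def kway_loss(emb, bias, counts, x_max=100, alpha=0.75):
    """Weighted squared error relating the log count of a k-set to the
    sum of l2 norms of its member embeddings, following the relationship
    stated in the abstract (constants and weighting are illustrative)."""
    loss = 0.0
    for group, c in counts.items():
        f = min(1.0, (c / x_max) ** alpha)       # GloVe-style weighting (assumed)
        pred = sum(np.linalg.norm(emb[w]) for w in group) + sum(bias[w] for w in group)
        loss += f * (pred - np.log(c)) ** 2
    return loss
```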
Joint Word Representation Learning Using a Corpus and a Semantic Lexicon
Bollegala, Danushka (The University of Liverpool) | Alsuhaibani, Mohammed (The University of Liverpool) | Maehara, Takanori (Shizuoka University) | Kawarabayashi, Ken-ichi (National Institute of Informatics)
Methods for learning word representations using large text corpora have received much attention lately due to their impressive performance in numerous natural language processing (NLP) tasks such as semantic similarity measurement and word analogy detection. Despite their success, these data-driven word representation learning methods do not consider the rich semantic relational structure between words in a co-occurring context. On the other hand, much manual effort has already gone into the construction of semantic lexicons such as WordNet, which represent the meanings of words by defining the various relationships that exist among the words in a language. We consider the question: can we improve the word representations learnt from a corpus by integrating the knowledge from semantic lexicons? For this purpose, we propose a joint word representation learning method that simultaneously predicts the co-occurrences of two words in a sentence, subject to the relational constraints given by the semantic lexicon. We use relations that exist between words in the lexicon to regularize the word representations learnt from the corpus. Our proposed method statistically significantly outperforms previously proposed methods for incorporating semantic lexicons into word representations on several benchmark datasets for semantic similarity and word analogy.
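A minimal sketch of such a joint objective, assuming a GloVe-style corpus term plus a lexicon regulariser; the function name joint_loss, the hyperparameter lam, and the squared-distance penalty are illustrative choices, not the authors' published code.

```python
import numpy as np

def joint_loss(W, b, cooc, lexicon_pairs, lam=0.1, x_max=100, alpha=0.75):
    """Corpus term: predict log co-occurrence counts from dot products.
    Lexicon term: pull embeddings of lexicon-related word pairs together."""
    loss = 0.0
    for (i, j), c in cooc.items():               # cooc: {(word_id, word_id): count}
        f = min(1.0, (c / x_max) ** alpha)       # GloVe-style weighting (assumed)
        loss += f * (W[i] @ W[j] + b[i] + b[j] - np.log(c)) ** 2
    for i, j in lexicon_pairs:                   # e.g. WordNet synonym pairs
        loss += lam * np.sum((W[i] - W[j]) ** 2) # relational regulariser
    return loss
```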
Embedding Semantic Relations into Word Representations
Bollegala, Danushka (The University of Liverpool) | Maehara, Takanori (Shizuoka University) | Kawarabayashi, Ken-ichi (National Institute of Informatics and JST ERATO Kawarabayashi Large Graph Project)
Learning representations for semantic relations is important for various tasks such as analogy detection, relational search, and relation classification. Although there have been several proposals for learning representations for individual words, learning word representations that explicitly capture the semantic relations between words remains underdeveloped. We propose an unsupervised method for learning vector representations for words such that the learnt representations are sensitive to the semantic relations that exist between two words. First, we extract lexical patterns from the co-occurrence contexts of two words in a corpus to represent the semantic relations that exist between those two words. Second, we represent a lexical pattern as the weighted sum of the representations of the words that co-occur with that lexical pattern. Third, we train a binary classifier to detect relationally similar versus non-similar lexical pattern pairs. The proposed method is unsupervised in the sense that the lexical pattern pairs we use as training data are automatically sampled from a corpus, without requiring any manual intervention. Our proposed method statistically significantly outperforms the current state-of-the-art word representations on three benchmark datasets for proportional analogy detection, demonstrating its ability to accurately capture the semantic relations among words.
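The second and third steps can be sketched as follows; the helper pattern_vector and the pairing features (elementwise product and absolute difference) are assumptions made for illustration, not the paper's exact construction.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def pattern_vector(word_weights, emb):
    """Step two: represent a lexical pattern as the weighted sum of the
    embeddings of the words that co-occur with it."""
    return sum(w * emb[word] for word, w in word_weights.items())

def train_pair_classifier(pattern_pairs, labels):
    """Step three: a binary classifier over pairs of pattern vectors
    (1 = relationally similar, 0 = non-similar). The product/difference
    pair encoding is a common choice, assumed here for illustration."""
    X = [np.concatenate([p * q, np.abs(p - q)]) for p, q in pattern_pairs]
    return LogisticRegression(max_iter=1000).fit(X, labels)
```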
Learning Word Representations from Relational Graphs
Bollegala, Danushka (The University of Liverpool) | Maehara, Takanori (National Institute of Informatics) | Yoshida, Yuichi (National Institute of Informatics) | Kawarabayashi, Ken-ichi (National Institute of Informatics)
If we already know a particular concept such as pets, we can describe a new concept such as dogs by stating the semantic relations that the new concept shares with the existing concepts, such as dogs belongs-to pets. Alternatively, we could describe a novel concept by listing all the attributes it shares with existing concepts. In our example, we can describe the concept dog by listing attributes such as mammal, carnivorous, and domestic animal that it shares with another concept such as the cat. Therefore, both attributes and relations can be considered as alternative descriptors of the same knowledge. This close connection between attributes and relations can be seen in knowledge representation schemes such as predicate logic. We learn word representations by considering the semantic relations between words. Specifically, given as input a relational graph, a directed labelled weighted graph where vertices represent words and edges represent numerous semantic relations that exist between the corresponding words, we consider the problem of learning a vector representation for each vertex (word) in the graph and a matrix representation for each label type (pattern). The learnt word representations are evaluated for their accuracy by using them to solve semantic word analogy questions on a benchmark dataset. Our task of learning word attributes using relations between words is challenging for several reasons. First, there can be multiple semantic relations between two words.
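A hedged sketch of the stated learning problem: one vector per vertex (word) and one matrix per label type (pattern), fit so that a bilinear score reproduces the edge weights of the relational graph. The squared-error objective and SGD updates are illustrative choices, not the paper's published method.

```python
import numpy as np

def fit(edges, n_words, n_labels, d=50, lr=0.01, epochs=10, seed=0):
    """edges: iterable of (i, l, j, weight) for a directed, labelled,
    weighted edge i --l--> j in the relational graph."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.1, size=(n_words, d))      # one vector per vertex (word)
    R = rng.normal(scale=0.1, size=(n_labels, d, d))  # one matrix per label (pattern)
    for _ in range(epochs):
        for i, l, j, weight in edges:
            err = W[i] @ R[l] @ W[j] - weight          # bilinear score vs. edge weight
            gi = err * (R[l] @ W[j])                   # gradient w.r.t. W[i]
            gj = err * (R[l].T @ W[i])                 # gradient w.r.t. W[j]
            gR = err * np.outer(W[i], W[j])            # gradient w.r.t. R[l]
            W[i] -= lr * gi
            W[j] -= lr * gj
            R[l] -= lr * gR
    return W, R
```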