Deep Learning
Learning Context-Sensitive Word Embeddings with Neural Tensor Skip-Gram Model
Liu, Pengfei (Fudan University) | Qiu, Xipeng (Fudan University) | Huang, Xuanjing (Fudan University)
Distributed word representations have a rising interest in NLP community. Most of existing models assume only one vector for each individual word, which ignores polysemy and thus degrades their effectiveness for downstream tasks. To address this problem, some recent work adopts multi-prototype models to learn multiple embeddings per word type. In this paper, we distinguish the different senses of each word by their latent topics. We present a general architecture to learn the word and topic embeddings efficiently, which is an extension to the Skip-Gram model and can model the interaction between words and topics simultaneously. The experiments on the word similarity and text classification tasks show our model outperforms state-of-the-art methods.
Learning Context-Sensitive Word Embeddings with Neural Tensor Skip-Gram Model
Liu, Pengfei (Fudan University) | Qiu, Xipeng (Fudan University) | Huang, Xuanjing (Fudan University)
Distributed word representations have a rising interest in NLP community. Most of existing models assume only one vector for each individual word, which ignores polysemy and thus degrades their effectiveness for downstream tasks. To address this problem, some recent work adopts multi-prototype models to learn multiple embeddings per word type. In this paper, we distinguish the different senses of each word by their latent topics. We present a general architecture to learn the word and topic embeddings efficiently, which is an extension to the Skip-Gram model and can model the interaction between words and topics simultaneously. The experiments on the word similarity and text classification tasks show our model outperforms state-of-the-art methods.
A Hybrid Neural Model for Type Classification of Entity Mentions
Dong, Li (Beihang University) | Wei, Furu (Microsoft Research) | Sun, Hong (Microsoft Corporation) | Zhou, Ming (Microsoft Research) | Xu, Ke (Beihang University)
The semantic class (i.e., type) of an entity plays a vital role in many natural language processing tasks, such as question answering. However, most of existing type classification systems extensively rely on hand-crafted features. This paper introduces a hybrid neural model which classifies entity mentions to a wide-coverage set of 22 types derived from DBpedia. It consists of two parts. The mention model uses recurrent neural networks to recursively obtain the vector representation of an entity mention from the words it contains. The context model, on the other hand, employs multilayer perceptrons to obtain the hidden representation for contextual information of a mention. Representations obtained by the two parts are used together to predict the type distribution. Using automatically generated data, these two parts are jointly learned. Experimental studies illustrate that the proposed approach outperforms baseline methods. Moreover, when type information provided by our method is used in a question answering system, we observe a 14.7% relative improvement for the top-1 accuracy of answers.
Supervised Representation Learning: Transfer Learning with Deep Autoencoders
Zhuang, Fuzhen (Chinese Academy of Sciences) | Cheng, Xiaohu (Chinese Academy of Sciences) | Luo, Ping (Chinese Academy of Sciences) | Pan, Sinno Jialin (Nanyang Technological University) | He, Qing (Chinese Academy of Sciences)
Transfer learning has attracted a lot of attention in the past decade. One crucial research issue in transfer learning is how to find a good representation for instances of different domains such that the divergence between domains can be reduced with the new representation. Recently, deep learning has been proposed to learn more robust or higher-level features for transfer learning. However, to the best of our knowledge, most of the previous approaches neither minimize the difference between domains explicitly nor encode label information in learning the representation. In this paper, we propose a supervised representation learning method based on deep autoencoders for transfer learning. The proposed deep autoencoder consists of two encoding layers: an embedding layer and a label encoding layer. In the embedding layer, the distance in distributions of the embedded instances between the source and target domains is minimized in terms of KL-Divergence. In the label encoding layer, label information of the source domain is encoded using a softmax regression model. Extensive experiments conducted on three real-world image datasets demonstrate the effectiveness of our proposed method compared with several state-of-the-art baseline methods.
Learning Context-Sensitive Word Embeddings with Neural Tensor Skip-Gram Model
Liu, Pengfei (Fudan University) | Qiu, Xipeng (Fudan University) | Huang, Xuanjing (Fudan University)
Distributed word representations have a rising interest in NLP community. Most of existing models assume only one vector for each individual word, which ignores polysemy and thus degrades their effectiveness for downstream tasks. To address this problem, some recent work adopts multi-prototype models to learn multiple embeddings per word type. In this paper, we distinguish the different senses of each word by their latent topics. We present a general architecture to learn the word and topic embeddings efficiently, which is an extension to the Skip-Gram model and can model the interaction between words and topics simultaneously. The experiments on the word similarity and text classification tasks show our model outperforms state-of-the-art methods.
Learning Context-Sensitive Word Embeddings with Neural Tensor Skip-Gram Model
Liu, Pengfei (Fudan University) | Qiu, Xipeng (Fudan University) | Huang, Xuanjing (Fudan University)
Distributed word representations have a rising interest in NLP community. Most of existing models assume only one vector for each individual word, which ignores polysemy and thus degrades their effectiveness for downstream tasks. To address this problem, some recent work adopts multi-prototype models to learn multiple embeddings per word type. In this paper, we distinguish the different senses of each word by their latent topics. We present a general architecture to learn the word and topic embeddings efficiently, which is an extension to the Skip-Gram model and can model the interaction between words and topics simultaneously. The experiments on the word similarity and text classification tasks show our model outperforms state-of-the-art methods.
Learning Context-Sensitive Word Embeddings with Neural Tensor Skip-Gram Model
Liu, Pengfei (Fudan University) | Qiu, Xipeng (Fudan University) | Huang, Xuanjing (Fudan University)
Distributed word representations have a rising interest in NLP community. Most of existing models assume only one vector for each individual word, which ignores polysemy and thus degrades their effectiveness for downstream tasks. To address this problem, some recent work adopts multi-prototype models to learn multiple embeddings per word type. In this paper, we distinguish the different senses of each word by their latent topics. We present a general architecture to learn the word and topic embeddings efficiently, which is an extension to the Skip-Gram model and can model the interaction between words and topics simultaneously. The experiments on the word similarity and text classification tasks show our model outperforms state-of-the-art methods.
Learning Context-Sensitive Word Embeddings with Neural Tensor Skip-Gram Model
Liu, Pengfei (Fudan University) | Qiu, Xipeng (Fudan University) | Huang, Xuanjing (Fudan University)
Distributed word representations have a rising interest in NLP community. Most of existing models assume only one vector for each individual word, which ignores polysemy and thus degrades their effectiveness for downstream tasks. To address this problem, some recent work adopts multi-prototype models to learn multiple embeddings per word type. In this paper, we distinguish the different senses of each word by their latent topics. We present a general architecture to learn the word and topic embeddings efficiently, which is an extension to the Skip-Gram model and can model the interaction between words and topics simultaneously. The experiments on the word similarity and text classification tasks show our model outperforms state-of-the-art methods.
Learning Context-Sensitive Word Embeddings with Neural Tensor Skip-Gram Model
Liu, Pengfei (Fudan University) | Qiu, Xipeng (Fudan University) | Huang, Xuanjing (Fudan University)
Distributed word representations have a rising interest in NLP community. Most of existing models assume only one vector for each individual word, which ignores polysemy and thus degrades their effectiveness for downstream tasks. To address this problem, some recent work adopts multi-prototype models to learn multiple embeddings per word type. In this paper, we distinguish the different senses of each word by their latent topics. We present a general architecture to learn the word and topic embeddings efficiently, which is an extension to the Skip-Gram model and can model the interaction between words and topics simultaneously. The experiments on the word similarity and text classification tasks show our model outperforms state-of-the-art methods.
Learning Context-Sensitive Word Embeddings with Neural Tensor Skip-Gram Model
Liu, Pengfei (Fudan University) | Qiu, Xipeng (Fudan University) | Huang, Xuanjing (Fudan University)
Distributed word representations have a rising interest in NLP community. Most of existing models assume only one vector for each individual word, which ignores polysemy and thus degrades their effectiveness for downstream tasks. To address this problem, some recent work adopts multi-prototype models to learn multiple embeddings per word type. In this paper, we distinguish the different senses of each word by their latent topics. We present a general architecture to learn the word and topic embeddings efficiently, which is an extension to the Skip-Gram model and can model the interaction between words and topics simultaneously. The experiments on the word similarity and text classification tasks show our model outperforms state-of-the-art methods.