On Conceptual Labeling of a Bag of Words
Sun, Xiangyan (Fudan University) | Xiao, Yanghua (Fudan University) | Wang, Haixun (Google Research) | Wang, Wei (Fudan University)
In natural language processing and information retrieval, the bag of words representation is used to implicitly represent the meaning of the text. Implicit semantics, however, are insufficient in supporting text-based or natural-language-based interfaces, which are adopted by an increasing number of applications. Indeed, in applications ranging from automatic ontology construction to question answering, explicit representation of semantics is starting to play a more prominent role. In this paper, we introduce the task of conceptual labeling (CL), which aims at generating a minimum set of conceptual labels that best summarize a bag of words. We draw the labels from a data-driven semantic network that contains millions of highly connected concepts. The semantic network provides meaning to the concepts, and in turn, it provides meaning to the bag of words through the conceptual labels we generate. To achieve our goal, we use an information-theoretic approach to trade off the semantic coverage of a bag of words against the minimality of the output labels. Specifically, we use Minimum Description Length (MDL) as the criterion in selecting the best concepts. Our extensive experimental results demonstrate the effectiveness of our approach in representing the explicit semantics of a bag of words.
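The coverage-versus-minimality trade-off in this abstract can be illustrated with a toy MDL calculation. The sketch below is not the authors' algorithm: the greedy search, the concept names, the probabilities, and the fixed costs (`concept_cost`, `raw_cost`) are all assumptions made for illustration. Each chosen label costs some bits to name, each word is encoded through its cheapest chosen label (or at a fixed raw cost if no label covers it), and a label is added only while the total description length shrinks.

```python
import math

def description_length(words, labels, p_word_given_concept, concept_cost, raw_cost):
    """Total bits: cost of naming each label plus cost of encoding each word."""
    total = sum(concept_cost[c] for c in labels)
    for w in words:
        # Cheapest encoding of w through any chosen label, else the raw cost.
        costs = [-math.log2(p_word_given_concept[c][w])
                 for c in labels if w in p_word_given_concept[c]]
        total += min(costs + [raw_cost])
    return total

def greedy_labels(words, p_word_given_concept, concept_cost, raw_cost):
    """Add one label at a time while the description length keeps decreasing."""
    labels = set()
    best = description_length(words, labels, p_word_given_concept, concept_cost, raw_cost)
    while True:
        candidates = [
            (description_length(words, labels | {c}, p_word_given_concept,
                                concept_cost, raw_cost), c)
            for c in sorted(concept_cost) if c not in labels
        ]
        if not candidates:
            break
        dl, concept = min(candidates)
        if dl >= best:  # no remaining label pays for itself
            break
        labels.add(concept)
        best = dl
    return labels, best

# Hypothetical toy data: P(word | concept) and a flat 3-bit cost per label.
words = ["china", "india", "apple", "pear"]
p_word_given_concept = {
    "country": {"china": 0.5, "india": 0.5},
    "fruit": {"apple": 0.5, "pear": 0.5},
    "thing": {w: 1 / 64 for w in words},  # covers everything, but vaguely
}
concept_cost = {"country": 3.0, "fruit": 3.0, "thing": 3.0}

labels, bits = greedy_labels(words, p_word_given_concept, concept_cost, raw_cost=8.0)
print(sorted(labels), bits)
```

In this toy setting the overly general label `thing` is never selected, because encoding words through it is more expensive than paying for two specific labels. Greedy search is only a heuristic and can miss the label set with the smallest description length.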
Towards Addressing the Winograd Schema Challenge — Building and Using a Semantic Parser and a Knowledge Hunting Module
Sharma, Arpit (Arizona State University) | Vo, Nguyen H (Arizona State University) | Aditya, Somak (Arizona State University) | Baral, Chitta (Arizona State University)
Because of concerns about the Turing test's ability to correctly evaluate whether a system exhibits human-like intelligence, the Winograd Schema Challenge (WSC) has been proposed as an alternative. A Winograd Schema consists of a sentence and a question. The answers to the questions are intuitive for humans but are designed to be difficult for machines, as they require various forms of commonsense knowledge about the sentence. In this paper we demonstrate our progress towards addressing the WSC. We present an approach that identifies the knowledge needed to answer a challenge question, hunts down that knowledge from text repositories, and then reasons with it to come up with the answer. In the process we develop a semantic parser (www.kparser.org). We show that our approach works well with respect to a subset of Winograd schemas.
An Active Learning Approach to Coreference Resolution
Sachan, Mrinmaya (Carnegie Mellon University) | Hovy, Eduard (Carnegie Mellon University) | Xing, Eric P. (Carnegie Mellon University)
In this paper, we define the problem of coreference resolution in text as one of clustering with pairwise constraints where human experts are asked to provide pairwise constraints (pairwise judgments of coreferentiality) to guide the clustering process. Positing that these pairwise judgments are easy to obtain from humans given the right context, we show that with a significantly lower number of pairwise judgments and less feature-engineering effort, we can achieve competitive coreference performance. Further, we describe an active learning strategy that minimizes the overall number of such pairwise judgments needed by asking the most informative questions to human experts at each step of coreference resolution. We evaluate this hypothesis and our algorithms on both entity and event coreference tasks and on two languages.
Convolutional Neural Tensor Network Architecture for Community-Based Question Answering
Qiu, Xipeng (Fudan University) | Huang, Xuanjing (Fudan University)
Retrieving similar questions is very important in community-based question answering. A major challenge is the lexical gap in sentence matching. In this paper, we propose a convolutional neural tensor network architecture to encode the sentences in semantic space and model their interactions with a tensor layer. Our model integrates sentence modeling and semantic matching into a single model, which can not only capture the useful information with convolutional and pooling layers, but also learn the matching metrics between the question and its answer. In addition, our model is a general architecture, with no need for additional knowledge such as lexical or syntactic analysis. The experimental results show that our method outperforms the other methods on two matching tasks.
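To make the idea of a tensor layer concrete, the sketch below scores a pair of already-encoded sentence vectors with a bilinear tensor form of the kind commonly used in neural tensor networks. It is an assumption that the paper's layer takes exactly this shape; the abstract only says that interactions are modeled "with a tensor layer". The convolutional encoder is omitted, and the parameter names (`tensor`, `V`, `b`, `u`) are hypothetical.

```python
import math

def bilinear(q, M, a):
    """Compute q^T M a for plain Python lists."""
    return sum(q[i] * M[i][j] * a[j]
               for i in range(len(q)) for j in range(len(a)))

def tensor_layer_score(q, a, tensor, V, b, u):
    """Score = u . tanh(q^T M_k a + V_k . [q; a] + b_k), over tensor slices k."""
    qa = q + a  # concatenation of the two sentence vectors
    hidden = []
    for M_k, V_k, b_k in zip(tensor, V, b):
        linear = sum(v * x for v, x in zip(V_k, qa))
        hidden.append(math.tanh(bilinear(q, M_k, a) + linear + b_k))
    return sum(u_k * h_k for u_k, h_k in zip(u, hidden))

# Toy example with 2-dimensional sentence vectors and a single tensor slice.
q = [1.0, 0.0]
a = [0.0, 1.0]
tensor = [[[0.0, 1.0], [0.0, 0.0]]]  # one 2x2 slice linking q[0] with a[1]
V = [[0.0, 0.0, 0.0, 0.0]]
b = [0.0]
u = [1.0]
score = tensor_layer_score(q, a, tensor, V, b, u)
print(score)
```

Each slice of the tensor gives the layer one learned way of multiplying question and answer dimensions together, which a plain concatenation followed by a linear layer cannot express.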
Integrating Importance, Non-Redundancy and Coherence in Graph-Based Extractive Summarization
Parveen, Daraksha (Heidelberg Institute for Theoretical Studies) | Strube, Michael (Heidelberg Institute for Theoretical Studies)
We propose a graph-based method for extractive single-document summarization which considers importance, non-redundancy and local coherence simultaneously. We represent input documents by means of a bipartite graph consisting of sentence and entity nodes. We rank sentences on the basis of importance by applying a graph-based ranking algorithm to this graph and ensure non-redundancy and local coherence of the summary by means of an optimization step. Our graph-based method is applied to scientific articles from the journal PLOS Medicine. We use human judgements to evaluate the coherence of our summaries. We compare ROUGE scores and human judgements for coherence of different systems on scientific articles. Our method performs considerably better than other systems on this data. Also, our graph-based summarization technique achieves state-of-the-art results on DUC 2002 data. Incorporating our local coherence measure always achieves the best results.
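As an illustration of ranking sentences on a bipartite sentence-entity graph, the sketch below runs a HITS-style mutual reinforcement: a sentence is important if it mentions important entities, and an entity is important if it occurs in important sentences. The abstract does not name the ranking algorithm, so the use of HITS here is an assumption, and the optimization step for non-redundancy and coherence is not shown. The function name and toy document are hypothetical.

```python
import math

def rank_sentences(sentence_entities, iterations=50):
    """Rank sentences by HITS-style scores on a sentence-entity bipartite graph."""
    entities = {e for ents in sentence_entities.values() for e in ents}
    sent_score = {s: 1.0 for s in sentence_entities}
    for _ in range(iterations):
        # Entity score: sum of the scores of the sentences that mention it.
        ent_score = {
            e: sum(sent_score[s] for s, ents in sentence_entities.items() if e in ents)
            for e in entities
        }
        norm = math.sqrt(sum(v * v for v in ent_score.values())) or 1.0
        ent_score = {e: v / norm for e, v in ent_score.items()}
        # Sentence score: sum of the scores of the entities it mentions.
        sent_score = {s: sum(ent_score[e] for e in ents)
                      for s, ents in sentence_entities.items()}
        norm = math.sqrt(sum(v * v for v in sent_score.values())) or 1.0
        sent_score = {s: v / norm for s, v in sent_score.items()}
    return sorted(sent_score, key=sent_score.get, reverse=True)

# Toy document: each sentence id maps to the set of entities it mentions.
doc = {
    "s1": {"drug", "trial", "patients"},
    "s2": {"drug", "trial"},
    "s3": {"weather"},
}
ranking = rank_sentences(doc)
print(ranking)
```

Sentence `s3` shares no entity with the rest of the document, so its score decays across iterations and it ranks last.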
Automated Rule Selection for Aspect Extraction in Opinion Mining
Liu, Qian (Southeast University) | Gao, Zhiqiang (Southeast University) | Liu, Bing (University of Illinois at Chicago) | Zhang, Yuanlin (Texas Tech University)
Aspect extraction aims to extract fine-grained opinion targets from opinion texts. Recent work has shown that the syntactical approach, which employs rules about grammar dependency relations between opinion words and aspects, performs quite well. This approach is highly desirable in practice because it is unsupervised and domain independent. However, the rules need to be carefully selected and tuned manually so as not to produce too many errors. Although it is easy to evaluate the accuracy of each rule automatically, it is not easy to select a set of rules that produces the best overall result due to the overlapping coverage of the rules. In this paper, we propose a novel method to select an effective set of rules. To our knowledge, this is the first work that selects rules automatically. Our experimental results show that the proposed method can select a subset of a given rule set to achieve significantly better results than the full rule set and the existing state-of-the-art CRF-based supervised method.
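The difficulty described here, that rules with overlapping coverage cannot be judged in isolation, can be seen in a small example. The sketch below uses greedy forward selection by F1 on a labeled sample; this is a stand-in chosen for illustration, since the abstract does not describe the selection algorithm. The rule names, extracted terms, and gold set are hypothetical.

```python
def f1(extracted, gold):
    """F1 of a set of extracted aspects against the gold aspects."""
    true_positives = len(extracted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(extracted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)

def select_rules(rules, gold):
    """Greedily add the rule that most improves F1 of the combined output."""
    selected, covered, best = [], set(), 0.0
    while True:
        candidates = [(f1(covered | rules[r], gold), r)
                      for r in sorted(rules) if r not in selected]
        if not candidates:
            break
        score, rule = max(candidates)
        if score <= best:  # adding any further rule would not help
            break
        selected.append(rule)
        covered |= rules[rule]
        best = score
    return selected, best

# Hypothetical output of three dependency rules on a small labeled sample.
gold = {"battery", "screen", "price"}
rules = {
    "r1": {"battery", "screen"},
    "r2": {"screen", "price", "thing"},
    "r3": {"it", "thing", "battery"},
}
selected, score = select_rules(rules, gold)
full_score = f1(set().union(*rules.values()), gold)
print(selected, score, full_score)
```

In this toy data the selected subset scores higher than the full rule set, because the third rule adds only noise once the first two are in place.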
Learning Context-Sensitive Word Embeddings with Neural Tensor Skip-Gram Model
Liu, Pengfei (Fudan University) | Qiu, Xipeng (Fudan University) | Huang, Xuanjing (Fudan University)
Distributed word representations are attracting rising interest in the NLP community. Most existing models assume only one vector for each individual word, which ignores polysemy and thus degrades their effectiveness for downstream tasks. To address this problem, some recent work adopts multi-prototype models to learn multiple embeddings per word type. In this paper, we distinguish the different senses of each word by their latent topics. We present a general architecture to learn the word and topic embeddings efficiently, which is an extension to the Skip-Gram model and can model the interaction between words and topics simultaneously. The experiments on the word similarity and text classification tasks show our model outperforms state-of-the-art methods.
Incorporating Domain and Sentiment Supervision in Representation Learning for Domain Adaptation
Liu, Biao (Tsinghua University) | Huang, Minlie (Tsinghua University) | Sun, Jiashen (Samsung Research and Development Institute) | Zhu, Xuan (Samsung Research and Development Institute)
Domain adaptation aims at learning robust classifiers across domains using labeled data from a source domain. Representation learning methods, which project the original features to a new feature space, have been proven to be quite effective for this task. However, these unsupervised methods neglect the domain information of the input and are not specialized for the classification task. In this work, we address two key factors to guide the representation learning process for domain adaptation of sentiment classification: one is domain supervision, which enforces the learned representation to better predict the domain of an input, and the other is sentiment supervision, which utilizes the source domain sentiment labels to learn sentiment-favorable representations. Experimental results show that these two factors significantly improve the proposed models as expected.
Reader-Aware Multi-Document Summarization via Sparse Coding
Li, Piji (The Chinese University of Hong Kong) | Bing, Lidong (Carnegie Mellon University) | Lam, Wai (The Chinese University of Hong Kong) | Li, Hang (Huawei Technologies) | Liao, Yi (The Chinese University of Hong Kong)
We propose a new MDS paradigm called reader-aware multi-document summarization (RA-MDS). Specifically, a set of reader comments associated with the news reports are also collected. The generated summaries from the reports for the event should be salient according to not only the reports but also the reader comments. To tackle this RA-MDS problem, we propose a sparse-coding-based method that is able to calculate the salience of the text units by jointly considering news reports and reader comments. Another reader-aware characteristic of our framework is to improve linguistic quality via entity rewriting. The rewriting consideration is jointly assessed together with other summarization requirements under a unified optimization model. To support the generation of compressive summaries via optimization, we explore a finer syntactic unit, namely, noun/verb phrase. In this work, we also generate a data set for conducting RA-MDS. Extensive experiments on this data set and some classical data sets demonstrate the effectiveness of our proposed approach.
Word-Error Correction of Continuous Speech Recognition Based on Normalized Relevance Distance
Fusayasu, Yohei (Kobe University) | Tanaka, Katsuyuki (Kobe University) | Takiguchi, Tetsuya (Kobe University) | Ariki, Yasuo (Kobe University)
In spite of the recent advancements being made in speech recognition, recognition errors are unavoidable in continuous speech recognition. In this paper, we focus on a word-error correction system for continuous speech recognition using confusion networks. Conventional N-gram correction is widely used; however, its performance degrades because the N-gram approach cannot measure information between words that are far apart. In order to improve the performance of the N-gram model, we employ Normalized Relevance Distance (NRD) as a measure for semantic similarity between words. NRD can identify not only co-occurrence but also the correlation of importance of the terms in documents. Even if the words are located far from each other, NRD can estimate the semantic similarity between the words. The effectiveness of our method was evaluated in continuous speech recognition tasks for multiple test speakers. Experimental results show that our error-correction method is the most effective approach as compared to the methods using other features.