Collaborating Authors

Convolutional Neural Network Architectures for Matching Natural Language Sentences

Neural Information Processing Systems

Semantic matching is of central importance to many natural language tasks \cite{bordes2014semantic,RetrievalQA}. A successful matching algorithm needs to adequately model the internal structures of language objects and the interaction between them. As a step toward this goal, we propose convolutional neural network models for matching two sentences, by adapting the convolutional strategy in vision and speech. The proposed models not only nicely represent the hierarchical structures of sentences with their layer-by-layer composition and pooling, but also capture the rich matching patterns at different levels. Our models are rather generic, requiring no prior knowledge on language, and can hence be applied to matching tasks of different nature and in different languages. The empirical study on a variety of matching tasks demonstrates the efficacy of the proposed model on a variety of matching tasks and its superiority to competitor models.

Text Matching as Image Recognition

AAAI Conferences

Matching two texts is a fundamental problem in many natural language processing tasks. An effective way is to extract meaningful matching patterns from words, phrases, and sentences to produce the matching score. Inspired by the success of convolutional neural network in image recognition, where neurons can capture many complicated patterns based on the extracted elementary visual patterns such as oriented edges and corners, we propose to model text matching as the problem of image recognition. Firstly, a matching matrix whose entries represent the similarities between words is constructed and viewed as an image. Then a convolutional neural network is utilized to capture rich matching patterns in a layer-by-layer way. We show that by resembling the compositional hierarchies of patterns in image recognition, our model can successfully identify salient signals such as n-gram and n-term matchings. Experimental results demonstrate its superiority against the baselines.

Multi-Task Medical Concept Normalization Using Multi-View Convolutional Neural Network

AAAI Conferences

Medical concept normalization is a critical problem in biomedical research and clinical applications. In this paper, we focus on normalizing diagnostic and procedure names in Chinese discharge summaries to standard entities, which is formulated as a semantic matching problem. However, non-standard Chinese expressions, short-text normalization and heterogeneity of tasks pose critical challenges in our problem. This paper presents a general framework which introduces a tensor generator and a novel multi-view convolutional neural network (CNN) with multi-task shared structure to tackle the two tasks simultaneously. We propose that the key to address non-standard expressions and short-text problem is to incorporate a matching tensor with multiple granularities. Then multi-view CNN is adopted to extract semantic matching patterns and learn to synthesize them from different views. Finally, multi-task shared structure allows the model to exploit medical correlations between disease and procedure names to better perform disambiguation tasks. Comprehensive experimental analysis indicates our model outperforms existing baselines which demonstrates the effectiveness of our model.

A Deep Architecture for Semantic Matching with Multiple Positional Sentence Representations

AAAI Conferences

Matching natural language sentences is central for many applications such as information retrieval and question answering. Existing deep models rely on a single sentence representation or multiple granularity representations for matching. However, such methods cannot well capture the contextualized local information in the matching process. To tackle this problem, we present a new deep architecture to match two sentences with multiple positional sentence representations. Specifically, each positional sentence representation is a sentence representation at this position, generated by a bidirectional long short term memory (Bi-LSTM). The matching score is finally produced by aggregating interactions between these different positional sentence representations, through k-Max pooling and a multi-layer perceptron. Our model has several advantages: (1) By using Bi-LSTM, rich context of the whole sentence is leveraged to capture the contextualized local information in each positional sentence representation; (2) By matching with multiple positional sentence representations, it is flexible to aggregate different important contextualized local information in a sentence to support the matching; (3) Experiments on different tasks such as question answering and sentence completion demonstrate the superiority of our model.

Adaptive Convolutional Filter Generation for Natural Language Understanding Machine Learning

Convolutional neural networks (CNNs) have recently emerged as a popular building block for natural language processing (NLP). Despite their success, most existing CNN models employed in NLP are not expressive enough, in the sense that all input sentences share the same learned (and static) set of filters. Motivated by this problem, we propose an adaptive convolutional filter generation framework for natural language understanding, by leveraging a meta network to generate input-aware filters. We further generalize our framework to model question-answer sentence pairs and propose an adaptive question answering (AdaQA) model; a novel two-way feature abstraction mechanism is introduced to encapsulate co-dependent sentence representations. We investigate the effectiveness of our framework on document categorization and answer sentence-selection tasks, achieving state-of-the-art performance on several benchmark datasets.