empirical methods in natural language processing

Structural block driven - enhanced convolutional neural representation for relation extraction Artificial Intelligence

In this paper, we propose a novel lightweight relation extraction approach based on structural block driven convolutional neural learning. Specifically, we detect the essential sequential tokens associated with entities through dependency analysis, which we call a structural block, and encode only that block with block-wise and inter-block-wise representations, utilizing multi-scale CNNs. This serves to 1) eliminate the noise from irrelevant parts of a sentence and 2) enhance the relevant block representation with both block-wise and inter-block-wise semantically enriched representations. Our method has the advantage of being independent of long sentence context, since we only encode the sequential tokens within a block boundary. Experiments on two datasets, SemEval2010 and KBP37, demonstrate the significant advantages of our method. In particular, we achieve new state-of-the-art performance on the KBP37 dataset and performance comparable to the state of the art on the SemEval2010 dataset.

Neural Language Models as Domain-Specific Knowledge Bases


The fundamental challenge of natural language processing (NLP) is resolution of the ambiguity that is present in the meaning of and intent carried by natural language. To resolve ambiguity within a text, algorithms use knowledge from the context within which the text appears. For example, the presence of the sentence "I visited the zoo." before the sentence "I saw a bat" can be used to conclude that bat represents an animal and not a wooden club. While in many situations neighboring text is sufficient for reducing ambiguity, typically it is not sufficient when dealing with text from specialized domains. Processing domain-specific text requires an understanding of a large number of domain-specific concepts and processes that NLP algorithms cannot glean from neighboring text alone.
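The zoo/bat example can be made concrete with a toy word-overlap disambiguator in the spirit of the simplified Lesk algorithm. This is only an illustrative sketch, not a method taken from the paper: the sense inventory `SENSES`, its glosses, the stopword list, and the function names are all hypothetical and were written for this example.

```python
# Toy sense inventory; the glosses are invented for illustration only.
SENSES = {
    "bat": {
        "animal": "nocturnal flying mammal often kept in a zoo or found in a cave",
        "club": "wooden club used to hit the ball in baseball or cricket",
    }
}

# Small hand-picked stopword list (an assumption, not a standard resource).
STOPWORDS = {"i", "a", "the", "in", "or", "to", "of", "saw", "used", "often"}


def tokens(text):
    """Lowercase, strip surrounding punctuation, and drop stopwords."""
    words = (w.strip(".,!?\"'").lower() for w in text.split())
    return {w for w in words if w and w not in STOPWORDS}


def disambiguate(word, context):
    """Pick the sense whose gloss shares the most words with the context."""
    ctx = tokens(context)
    scores = {
        sense: len(ctx & tokens(gloss))
        for sense, gloss in SENSES[word].items()
    }
    # On a tie, max() returns the first sense listed in SENSES.
    return max(scores, key=scores.get), scores


sense, scores = disambiguate("bat", "I visited the zoo. I saw a bat.")
print(sense, scores)  # animal {'animal': 1, 'club': 0}
```

The sketch works only because the neighboring sentence happens to contain a word ("zoo") that also appears in a gloss. For specialized domains, where the relevant concepts rarely appear in the neighboring text, this kind of overlap is usually empty, which is the limitation the paragraph above describes.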

CycleGT: Unsupervised Graph-to-Text and Text-to-Graph Generation via Cycle Training Artificial Intelligence

Two important tasks at the intersection of knowledge graphs and natural language processing are graph-to-text (G2T) and text-to-graph (T2G) conversion. Due to the difficulty and high cost of data collection, the supervised data available in the two fields are usually on the order of tens of thousands of examples (for instance, 18K in the WebNLG dataset), far fewer than the millions of examples available for other tasks such as machine translation. Consequently, deep learning models in these two fields suffer largely from scarce training data. This work presents the first attempt at unsupervised learning of T2G and G2T via cycle training. We present CycleGT, an unsupervised training framework that can bootstrap from fully non-parallel graph and text datasets, iteratively back-translate between the two forms, and use a novel pretraining strategy. Experiments on the benchmark WebNLG dataset show that, impressively, our unsupervised model trained on the same amount of data can achieve performance on par with the supervised models. This validates our framework as an effective approach to overcome the data scarcity problem in the fields of G2T and T2G.

Experiments on Paraphrase Identification Using Quora Question Pairs Dataset Artificial Intelligence

We modeled the Quora question pairs dataset to identify similar questions. The dataset is provided by Quora. The task is binary classification. We tried several methods and algorithms, as well as approaches that differ from previous work. For feature extraction, we used Bag of Words, including Count Vectorizer and Term Frequency-Inverse Document Frequency with unigrams, for XGBoost and CatBoost. Furthermore, we also experimented with the WordPiece tokenizer, which improves model performance significantly. We achieved up to 97 percent accuracy.
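As a rough illustration of the unigram count and TF-IDF features mentioned in this abstract, the following standard-library sketch builds both for two made-up questions. It is not the authors' code. The function names and the example questions are hypothetical, and the smoothed weighting idf = ln((1 + n) / (1 + df)) + 1 is an assumed choice; the abstract does not say which variant was used.

```python
import math
from collections import Counter


def count_vectorize(docs):
    """Return (vocabulary, count matrix) for whitespace-tokenized unigrams."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    index = {t: i for i, t in enumerate(vocab)}
    rows = []
    for doc in tokenized:
        row = [0] * len(vocab)
        for term, count in Counter(doc).items():
            row[index[term]] = count
        rows.append(row)
    return vocab, rows


def tfidf(rows):
    """Weight raw counts by a smoothed idf: ln((1 + n) / (1 + df)) + 1."""
    n = len(rows)
    df = [sum(1 for r in rows if r[j] > 0) for j in range(len(rows[0]))]
    idf = [math.log((1 + n) / (1 + d)) + 1 for d in df]
    return [[c * w for c, w in zip(r, idf)] for r in rows]


# Two invented questions standing in for one question pair.
questions = [
    "how do i learn python",
    "how can i learn python quickly",
]
vocab, counts = count_vectorize(questions)
weights = tfidf(counts)
print(vocab)
print(counts)
```

Terms shared by both questions ("how", "i", "learn", "python") keep a weight of 1.0, while terms unique to one question ("do", "can", "quickly") are weighted higher. Vectors of this kind would then be passed to a classifier such as the gradient-boosted models named above; that step is not shown here.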

Noise-robust Named Entity Understanding for Virtual Assistants Artificial Intelligence

Named Entity Understanding (NEU) plays an essential role in interactions between users and voice assistants, since successfully identifying entities and correctly linking them to their standard forms is crucial to understanding the user's intent. NEU is a challenging task in voice assistants due to the ambiguous nature of natural language and because noise introduced by speech transcription and user errors occurs frequently in spoken natural language queries. In this paper, we propose an architecture with novel features that jointly solves the recognition of named entities (a.k.a. Named Entity Recognition, or NER) and the resolution to their canonical forms (a.k.a. Entity Linking, or EL). We show that by combining NER and EL information in a joint reranking module, our proposed framework improves accuracy in both tasks. This improved performance, and the features that enable it, also lead to better accuracy in downstream tasks, such as domain classification and semantic parsing.

Contrastive Self-Supervised Learning for Commonsense Reasoning Artificial Intelligence

We propose a self-supervised method to solve Pronoun Disambiguation and Winograd Schema Challenge problems. Our approach exploits the characteristic structure of training corpora related to so-called "trigger" words, which are responsible for flipping the answer in pronoun disambiguation. We achieve such commonsense reasoning by constructing pair-wise contrastive auxiliary predictions. To this end, we leverage a mutual exclusive loss regularized by a contrastive margin. Our architecture is based on the recently introduced transformer network BERT, which exhibits strong performance on many NLP benchmarks. Empirical results show that our method alleviates the limitations of current supervised approaches for commonsense reasoning. This study opens up avenues for exploiting inexpensive self-supervision to achieve performance gains in commonsense reasoning tasks.

Efficient long-distance relation extraction with DG-SpanBERT Artificial Intelligence

In natural language processing, relation extraction seeks to rationally understand unstructured text. Here, we propose a novel SpanBERT-based graph convolutional network (DG-SpanBERT) that extracts semantic features from a raw sentence using the pre-trained language model SpanBERT and a graph convolutional network (GCN) to pool latent features. Our DG-SpanBERT model inherits the advantage of SpanBERT in learning rich lexical features from a large-scale corpus. It can also capture long-range relations between entities because it applies a GCN to the dependency tree. The experimental results show that our model outperforms other existing dependency-based and sequence-based models and achieves state-of-the-art performance on the TACRED dataset.

Variational Question-Answer Pair Generation for Machine Reading Comprehension Artificial Intelligence

We present a deep generative model of question-answer (QA) pairs for machine reading comprehension. We introduce two independent latent random variables into our model in order to diversify answers and questions separately. We also study the effect of explicitly controlling the KL term in the variational lower bound in order to avoid the "posterior collapse" issue, where the model ignores latent variables and generates QA pairs that are almost the same. Our experiments on SQuAD v1.1 showed that variational methods can aid QA pair modeling capacity, and that the controlled KL term can significantly improve diversity while generating high-quality questions and answers comparable to those of the existing systems.
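To make the role of the KL term more tangible, the sketch below computes the closed-form KL divergence between a diagonal Gaussian posterior and a standard normal prior, and then applies one possible way of controlling it: penalizing the distance between the KL value and a fixed target. The abstract does not specify the control mechanism, so `controlled_loss`, the `target` parameter, and the Gaussian assumption are illustrative assumptions rather than the paper's formulation.

```python
import math


def gaussian_kl(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) in closed form."""
    return 0.5 * sum(
        m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var)
    )


def controlled_loss(recon_nll, kl, target=5.0):
    """Hypothetical objective: reconstruction loss plus |KL - target|.

    With target = 0 this reduces to the usual negative lower bound
    (recon_nll + kl), since the KL divergence is non-negative.
    """
    return recon_nll + abs(kl - target)


# A collapsed posterior equals the prior, so its KL is zero.
collapsed = gaussian_kl([0.0, 0.0], [0.0, 0.0])
# A posterior that actually encodes information has a positive KL.
informative = gaussian_kl([1.0, -2.0], [0.0, 0.0])

print(collapsed, informative)
print(controlled_loss(10.0, collapsed), controlled_loss(10.0, informative))
```

Under the plain lower bound, the collapsed posterior is the cheapest option for the KL term, which is what encourages the model to ignore its latent variables. With a positive target, the collapsed state is penalized, so the optimizer is pushed toward posteriors that carry some information. In a model with two latent variables, as described above, a term of this kind could be applied to each variable separately.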

R3: A Reading Comprehension Benchmark Requiring Reasoning Processes Artificial Intelligence

Existing question answering systems can only predict answers without explicit reasoning processes, which hinders their explainability and makes us overestimate their ability to understand and reason over natural language. In this work, we propose a novel reading comprehension task in which a model is required to provide both final answers and reasoning processes. To this end, we introduce a formalism for reasoning over unstructured text, namely Text Reasoning Meaning Representation (TRMR). TRMR consists of three phrases and is expressive enough to characterize the reasoning process needed to answer reading comprehension questions. We develop an annotation platform to facilitate TRMR annotation, and release the R3 dataset, a Reading comprehension benchmark Requiring Reasoning processes. R3 contains over 60K question-answer pairs and their TRMRs. Our dataset is available at: http://anonymous.

Dead Languages Come to Life

Communications of the ACM

Driven by advanced techniques in machine learning, commercial systems for automated language translation now nearly match the performance of human linguists, and far more efficiently. Google Translate supports 105 languages, from Afrikaans to Zulu, and in addition to printed text it can translate speech, handwriting, and the text found on websites and in images. The methods for doing those things are clever, but the key enabler lies in the huge annotated databases of writings in the various language pairs. A translation from French to English succeeds because the algorithms were trained on millions of actual translation examples. The expectation is that every word or phrase that comes into the system, with its associated rules and patterns of language structure, will have been seen and translated before.