Goto

Collaborating Authors

 Grammars & Parsing


Counterfactual Learning from Human Proofreading Feedback for Semantic Parsing

arXiv.org Machine Learning

In semantic parsing for question-answering, it is often too expensive to collect gold parses or even gold answers as supervision signals. We propose to convert model outputs into a set of human-understandable statements which allow non-expert users to act as proofreaders, providing error markings as learning signals to the parser. Because model outputs were suggested by a historic system, we operate in a counterfactual, or off-policy, learning setup. We introduce new estimators which can effectively leverage the given feedback and which avoid known degeneracies in counterfactual learning, while still being applicable to stochastic gradient optimization for neural semantic parsing. Furthermore, we discuss how our feedback collection method can be seamlessly integrated into deployed virtual personal assistants that embed a semantic parser. Our work is the first to show that semantic parsers can be improved significantly by counterfactual learning from logged human feedback data.


Sentence Encoding with Tree-constrained Relation Networks

arXiv.org Artificial Intelligence

The meaning of a sentence is a function of the relations that hold between its words. We instantiate this relational view of semantics in a series of neural models based on variants of relation networks (RNs) which represent a set of objects (for us, words forming a sentence) in terms of representations of pairs of objects. We propose two extensions to the basic RN model for natural language. First, building on the intuition that not all word pairs are equally informative about the meaning of a sentence, we use constraints based on both supervised and unsupervised dependency syntax to control which relations influence the representation. Second, since higher-order relations are poorly captured by a sum of pairwise relations, we use a recurrent extension of RNs to propagate information so as to form representations of higher order relations. Experiments on sentence classification, sentence pair classification, and machine translation reveal that, while basic RNs are only modestly effective for sentence representation, recurrent RNs with latent syntax are a reliably powerful representational device.


Implementing a Portable Clinical NLP System with a Common Data Model - a Lisp Perspective

arXiv.org Artificial Intelligence

This paper presents a Lisp architecture for a portable NLP system, termed LAPNLP, for processing clinical notes. LAPNLP integrates multiple standard, customized and in-house developed NLP tools. Our system facilitates portability across different institutions and data systems by incorporating an enriched Common Data Model (CDM) to standardize necessary data elements. It utilizes UMLS to perform domain adaptation when integrating generic domain NLP tools. It also features stand-off annotations that are specified by positional reference to the original document. We built an interval tree based search engine to efficiently query and retrieve the stand-off annotations by specifying positional requirements. We also developed a utility to convert an inline annotation format to stand-off annotations to enable the reuse of clinical text datasets with inline annotations. We experimented with our system on several NLP facilitated tasks including computational phenotyping for lymphoma patients and semantic relation extraction for clinical notes. These experiments showcased the broader applicability and utility of LAPNLP.


Translating Natural Language to SQL using Pointer-Generator Networks and How Decoding Order Matters

arXiv.org Artificial Intelligence

Translating natural language to SQL queries for table-based question answering is a challenging problem and has received significant attention from the research community. In this work, we extend a pointer-generator and investigate the order-matters problem in semantic parsing for SQL. Even though our model is a straightforward extension of a general-purpose pointer-generator, it outperforms early works for WikiSQL and remains competitive to concurrently introduced, more complex models. Moreover, we provide a deeper investigation of the potential order-matters problem that could arise due to having multiple correct decoding paths, and investigate the use of REINFORCE as well as a dynamic oracle in this context.


Classical Copying versus Quantum Entanglement in Natural Language: The Case of VP-ellipsis

arXiv.org Artificial Intelligence

This paper compares classical copying and quantum entanglement in natural language by considering the case of verb phrase (VP) ellipsis. VP ellipsis is a non-linear linguistic phenomenon that requires the reuse of resources, making it the ideal test case for a comparative study of different copying behaviours in compositional models of natural language. Following the line of research in compositional distributional semantics set out by (Coecke et al., 2010) we develop an extension of the Lambek calculus which admits a controlled form of contraction to deal with the copying of linguistic resources. We then develop two different compositional models of distributional meaning for this calculus. In the first model, we follow the categorical approach of (Coecke et al., 2013) in which a functorial passage sends the proofs of the grammar to linear maps on vector spaces and we use Frobenius algebras to allow for copying. In the second case, we follow the more traditional approach that one finds in categorial grammars, whereby an intermediate step interprets proofs as non-linear lambda terms, using multiple variable occurrences that model classical copying. As a case study, we apply the models to derive different readings of ambiguous elliptical phrases and compare the analyses that each model provides.


Parser Extraction of Triples in Unstructured Text

arXiv.org Artificial Intelligence

The web contains vast repositories of unstructured text. We investigate the opportunity for building a knowledge graph from these text sources. We generate a set of triples which can be used in knowledge gathering and integration. We define the architecture of a language compiler for processing subject-predicate-object triples using the OpenNLP parser. We implement a depth-first search traversal on the POS tagged syntactic tree appending predicate and object information. A parser enables higher precision and higher recall extractions of syntactic relationships across conjunction boundaries. We are able to extract 2-2.5 times the correct extractions of ReVerb. The extractions are used in a variety of semantic web applications and question answering. We verify extraction of 50,000 triples on the ClueWeb dataset.


Semantic Role Labeling for Knowledge Graph Extraction from Text

arXiv.org Artificial Intelligence

This paper introduces TakeFive, a new semantic role labeling method that transforms a text into a frame-oriented knowledge graph. It performs dependency parsing, identifies the words that evoke lexical frames, locates the roles and fillers for each frame, runs coercion techniques, and formalises the results as a knowledge graph. This formal representation complies with the frame semantics used in Framester, a factual-linguistic linked data resource. The obtained precision, recall and F1 values indicate that TakeFive is competitive with other existing methods such as SEMAFOR, Pikes, PathLSTM and FRED. We finally discuss how to combine TakeFive and FRED, obtaining higher values of precision, recall and F1. Keywords: Semantic Role Labeling, Frame Semantics, Framester, Dependency Parsing, Role Oriented Knowledge Graphs 1. Introduction Most knowledge in linked data and knowledge graphs is of a relational nature: people participating in events, products having prices, artifacts with parts, works of art produced by artists, beers sold at a bar, etc. For that reason, a good part of integration and interoperability ends up consisting in aligning relations among heterogeneous schemas and data. This limit makes interoperability difficult.


Exploring Semantic Incrementality with Dynamic Syntax and Vector Space Semantics

arXiv.org Artificial Intelligence

One of the fundamental requirements for models of semantic processing in dialogue is incrementality: a model must reflect how people interpret and generate language at least on a word-by-word basis, and handle phenomena such as fragments, incomplete and jointly-produced utterances. We show that the incremental word-by-word parsing process of Dynamic Syntax (DS) can be assigned a compositional distributional semantics, with the composition operator of DS corresponding to the general operation of tensor contraction from multilinear algebra. We provide abstract semantic decorations for the nodes of DS trees, in terms of vectors, tensors, and sums thereof; using the latter to model the underspecified elements crucial to assigning partial representations during incremental processing. As a working example, we give an instantiation of this theory using plausibility tensors of compositional distributional semantics, and show how our framework can incrementally assign a semantic plausibility measure as it parses phrases and sentences.


Learning to Represent Edits

arXiv.org Machine Learning

We introduce the problem of learning distributed representations of edits. By combining a "neural editor" with an "edit encoder", our models learn to represent the salient information of an edit and can be used to apply edits to new inputs. We experiment on natural language and source code edit data. Our evaluation yields promising results that suggest that our neural network models learn to capture the structure and semantics of edits. We hope that this interesting task and data source will inspire other researchers to work further on this problem.


Weakly Supervised Grammatical Error Correction using Iterative Decoding

arXiv.org Machine Learning

We describe an approach to Grammatical Error Correction (GEC) that is effective at making use of models trained on large amounts of weakly supervised bitext. We train the Transformer sequence-to-sequence model on 4B tokens of Wikipedia revisions and employ an iterative decoding strategy that is tailored to the loosely-supervised nature of the Wikipedia training corpus. Finetuning on the Lang-8 corpus and ensembling yields an F0.5 of 58.3 on the CoNLL'14 benchmark and a GLEU of 62.4 on JFLEG. The combination of weakly supervised training and iterative decoding obtains an F0.5 of 48.2 on CoNLL'14 even without using any labeled GEC data.