Grammars & Parsing
Byte-Pair Encoding for Text-to-SQL Generation
Müller, Samuel, Vlachos, Andreas
Neural sequence-to-sequence models provide a competitive approach to the task of mapping a question in natural language to an SQL query, also referred to as text-to-SQL generation. The Byte-Pair Encoding algorithm (BPE) has previously been used to improve machine translation (MT) between natural languages. In this work, we adapt BPE for text-to-SQL generation. As the datasets for this task are rather small compared to MT, we present a novel stopping criterion that prevents overfitting the BPE encoding to the training set. Additionally, we present AST BPE, which is a version of BPE that uses the Abstract Syntax Tree (AST) of the SQL statement to guide BPE merges and therefore produce BPE encodings that generalize better. W e improved the accuracy of a strong attentive seq2seq baseline on five out of six English text-to-SQL tasks while reducing training time by more than 50% on four of them due to the shortened targets. Finally, on two of these tasks we exceeded previously reported accuracies.
Towards Computing Inferences from English News Headlines
George, Elizabeth Jasmi, Mamidi, Radhika
Newspapers are a popular form of written discourse, read by many people, thanks to the novelty of the information provided by the news content in it. A headline is the most widely read part of any newspaper due to its ap - pearance in a bigger font and sometimes in colour print. In this paper, we sug - gest and implement a method for computing inferences from English news headlines, excluding the information from the context in which the headlines appear. This method attempts to generate the possible assumptions a reader formulates in mind upon reading a fresh headline. The generated inferences could be useful for assessing the impact of the news headline on readers includ - ing children. The understandability of the current state of social affairs depends greatly on the assimilation of the headlines. As the inferences that are indepen - dent of the context depend mainly on the syntax of the headline, dependency trees of headlines are used in this approach, to find the syntactical structure of the headlines and to compute inferences out of them.
Enhancing the Transformer with Explicit Relational Encoding for Math Problem Solving
Schlag, Imanol, Smolensky, Paul, Fernandez, Roland, Jojic, Nebojsa, Schmidhuber, Jürgen, Gao, Jianfeng
A BSTRACT We incorporate Tensor-Product Representations within the Transformer in order to better support the explicit representation of relation structure. Our Tensor-Product Transformer (TP-Transformer) sets a new state of the art on the recently-introduced Mathematics Dataset containing 56 categories of free-form math word-problems. The essential component of the model is a novel attention mechanism, called TP-Attention, which explicitly encodes the relations between each Transformer cell and the other cells from which values have been retrieved by attention. TP-Attention goes beyond linear combination of retrieved values, strengthening representation-building and resolving ambiguities introduced by multiple layers of standard attention. The TP-Transformer's attention maps give better insights into how it is capable of solving the Mathematics Dataset's challenging problems. Pretrained models and code will be made available after publication. 1 I NTRODUCTION In this paper we propose a variation of the Transformer (V aswani et al., 2017) that is designed to allow it to better incorporate structure into its representations. We test the proposal on a task where structured representations are expected to be particularly helpful: math word-problem solving, where, among other things, correctly parsing expressions and compositionally evaluating them is crucial.
How to build a Knowledge Graph from Text Using spaCy
Lionel Messi needs no introduction. Even folks who don't follow football have heard about the brilliance of one of the greatest players to have graced the sport. We have text, tons of hyperlinks, and even an audio clip. The possibilities of putting this into a use case are endless. However, there is a slight problem. This is not an ideal source of data to feed to our machines.
Knowledge-guided Unsupervised Rhetorical Parsing for Text Summarization
Automatic text summarization (ATS) has recently achieved impressive performance thanks to recent advances in deep learning and the availability of large-scale corpora. To make the summarization results more faithful, this paper presents an unsupervised approach that combines rhetorical structure theory, deep neural model and domain knowledge concern for ATS. This architecture mainly contains three components: domain knowledge base construction based on representation learning, attentional encoder-decoder model for rhetorical parsing and subroutine-based model for text summarization. Domain knowledge can be effectively used for unsupervised rhetorical parsing thus rhetorical structure trees for each document can be derived. In the unsupervised rhetorical parsing module, the idea of translation was adopted to alleviate the problem of data scarcity. The subroutine-based summarization model purely depends on the derived rhetorical structure trees and can generate content-balanced results. To evaluate the summary results without golden standard, we proposed an unsupervised evaluation metric, whose hyper-parameters were tuned by supervised learning. Experimental results show that, on a large-scale Chinese dataset, our proposed approach can obtain comparable performances compared with existing methods.
Model-based Interactive Semantic Parsing: A Unified Framework and A Text-to-SQL Case Study
Yao, Ziyu, Su, Yu, Sun, Huan, Yih, Wen-tau
As a promising paradigm, interactive semantic parsing has shown to improve both semantic parsing accuracy and user confidence in the results. In this paper, we propose a new, unified formulation of the interactive semantic parsing problem, where the goal is to design a model-based intelligent agent. The agent maintains its own state as the current predicted semantic parse, decides whether and where human intervention is needed, and generates a clarification question in natural language. A key part of the agent is a world model: it takes a percept (either an initial question or subsequent feedback from the user) and transitions to a new state. We then propose a simple yet remarkably effective instantiation of our framework, demonstrated on two text-to-SQL datasets (WikiSQL and Spider) with different state-of-the-art base semantic parsers. Compared to an existing interactive semantic parsing approach that treats the base parser as a black box, our approach solicits less user feedback but yields higher run-time accuracy.
Generate relevant leads, Get qualified prospects to make profit!
Have you ever thought of a parser which works without coding? Even a layman can use it with ease. Just keep on clicking and you won't even know when you have parsed your resumes. Data is extracted from resumes to streamline the lead generation process. For an effective lead, you need data about income, job profile, location etc.
On the Possibility of Rewarding Structure Learning Agents: Mutual Information on Linguistic Random Sets
Arroyo-Fernández, Ignacio, Carrasco-Ruíz, Mauricio, Arias-Aguilar, J. Anibal
We present a first attempt to elucidate an Information-Theoretic approach to design the reward provided by a natural language environment to some structure learning agent. To this end, we revisit the Information Theory of unsupervised induction of phrase-structure grammars to characterize the behavior of simulated agents whose actions are characterized in terms of random sets of linguistic samples. Our results showed empirical evidence of that semantic structures (built using Open Information Extraction methods) can be distinguished from randomly constructed structures by observing the Mutual Information among their constituent linguistic random sets. This suggests the possibility of rewarding structure learning agents without using pretrained structural analyzers (oracle actors or experts).
Structural Language Models for Any-Code Generation
Alon, Uri, Sadaka, Roy, Levy, Omer, Yahav, Eran
We address the problem of Any-Code Generation (AnyGen) - generating code without any restriction on the vocabulary or structure. The state-of-the-art in this problem is the sequence-to-sequence (seq2seq) approach, which treats code as a sequence and does not leverage any structural information. We introduce a new approach to AnyGen that leverages the strict syntax of programming languages to model a code snippet as a tree - structural language modeling (SLM). SLM estimates the probability of the program's abstract syntax tree (AST) by decomposing it into a product of conditional probabilities over its nodes. We present a neural model that computes these conditional probabilities by considering all AST paths leading to a target node. Unlike previous structural techniques that have severely restricted the kinds of expressions that can be generated, our approach can generate arbitrary expressions in any programming language. Our model significantly outperforms both seq2seq and a variety of existing structured approaches in generating Java and C# code. We make our code, datasets, and models available online.
Automatically Learning Data Augmentation Policies for Dialogue Tasks
Automatic data augmentation (AutoAugment) (Cubuk et al., 2019) searches for optimal perturbation policies via a controller trained using performance rewards of a sampled policy on the target task, hence reducing data-level model bias. While being a powerful algorithm, their work has focused on computer vision tasks, where it is comparatively easy to apply imperceptible perturbations without changing an image's semantic meaning. In our work, we adapt AutoAugment to automatically discover effective perturbation policies for natural language processing (NLP) tasks such as dialogue generation. We start with a pool of atomic operations that apply subtle semantic-preserving perturbations to the source inputs of a dialogue task (e.g., different POS-tag types of stopword dropout, grammatical errors, and paraphrasing). Next, we allow the controller to learn more complex augmentation policies by searching over the space of the various combinations of these atomic operations. Moreover, we also explore conditioning the controller on the source inputs of the target task, since certain strategies may not apply to inputs that do not contain that strategy's required linguistic features. Empirically, we demonstrate that both our input-agnostic and input-aware controllers discover useful data augmentation policies, and achieve significant improvements over the previous state-of-the-art, including trained on manually-designed policies.