Grammars & Parsing
Multi-Task Learning For Parsing The Alexa Meaning Representation Language
Perera, Vittorio (Carnegie Mellon University) | Chung, Tagyoung (Amazon Inc.) | Kollar, Thomas (Amazon Inc.) | Strubell, Emma (University of Massachusetts Amherst)
The Alexa Meaning Representation Language (AMRL) is a compositional graph-based semantic representation that includes fine-grained types, properties, actions, and roles and can represent a wide variety of spoken language. AMRL increases the ability of virtual assistants to represent more complex requests, including logical and conditional statements as well as ones with nested clauses. Due to this representational capacity, the acquisition of large scale data resources is challenging, which limits the accuracy of resulting models. This paper has two primary contributions. First, we develop a linearization of AMRL graphs along with a deep multi-task model that predicts fine-grained types, properties, and intents. Second, we show how to jointly train a model that predicts an existing representation for spoken language understanding (SLU) along with the linearized AMRL parse. The resulting model, which leverages learned embeddings from both tasks, is able to predict the AMRL representation more accurately than other approaches, decreasing the error rates in the full parse by 3.56% absolute and reducing the amount of natively annotated data needed to train accurate parsing models.
Jointly Parse and Fragment Ungrammatical Sentences
Hashemi, Homa B. (University of Pittsburgh) | Hwa, Rebecca (University of Pittsburgh)
However, the sentences under analysis may not always be grammatically correct. When a dependency parser nonetheless produces fully connected, syntactically well-formed trees for these sentences, the trees may be inappropriate and lead to errors. In fact, researchers have raised valid questions about the merit of annotating dependency trees for ungrammatical sentences (Ragheb and Dickinson 2012; Cahill 2015). On the other hand, previous work has ... experiments, we find that both joint methods produce tree fragment sets that are more similar to those produced by the oracle method than the previous pipeline method; moreover, the seq2seq method's pruning decision has a significantly higher accuracy. In terms of downstream applications, we show that dependency arc pruning is helpful for two applications: sentential grammaticality judgment and semantic role labeling.
Learning to Predict Readability Using Eye-Movement Data From Natives and Learners
González-Garduño, Ana V. (University of Copenhagen) | Søgaard, Anders (University of Copenhagen)
Readability assessment can improve the quality of assisting technologies aimed at language learners. Eye-tracking data has been used for both inducing and evaluating general-purpose NLP/AI models, and below we show that, unsurprisingly, gaze data from language learners can also improve multi-task readability assessment models. This is unsurprising, since the gaze data records the reading difficulties of the learners. Unfortunately, eye-tracking data from language learners is often much harder to obtain than eye-tracking data from native speakers. We therefore compare the performance of deep learning readability models that use native-speaker eye movement data to models using data from language learners. Somewhat surprisingly, we observe no significant drop in performance when replacing learners with natives, making approaches that rely on native-speaker gaze information more scalable. In other words, our finding is that language learner difficulties can be efficiently estimated from native speakers, which suggests that, more generally, readily available gaze data can be used to improve educational NLP/AI models targeted towards language learners.
Deep Semantic Role Labeling With Self-Attention
Tan, Zhixing (Xiamen University) | Wang, Mingxuan (Tencent Technology) | Xie, Jun (Tencent Technology) | Chen, Yidong (Xiamen University) | Shi, Xiaodong (Xiamen University)
Semantic Role Labeling (SRL) is believed to be a crucial step towards natural language understanding and has been widely studied. In recent years, end-to-end SRL with recurrent neural networks (RNNs) has gained increasing attention. However, it remains a major challenge for RNNs to handle structural information and long-range dependencies. In this paper, we present a simple and effective architecture for SRL which aims to address these problems. Our model is based on self-attention, which can directly capture the relationships between two tokens regardless of their distance. Our single model achieves F1=83.4 on the CoNLL-2005 shared task dataset and F1=82.7 on the CoNLL-2012 shared task dataset, outperforming the previous state-of-the-art results by 1.8 and 1.0 F1 points, respectively. In addition, our model is computationally efficient, with a parsing speed of 50K tokens per second on a single Titan X GPU.
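To illustrate why self-attention relates two tokens regardless of their distance, here is a minimal sketch of scaled dot-product self-attention in plain Python. It is not the authors' architecture: it assumes identity projections (queries, keys, and values are all the input vectors), whereas a trained model would typically use learned projections. The names `softmax` and `self_attention` are hypothetical helpers introduced for this example.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x: List[Vector]) -> List[Vector]:
    """Scaled dot-product self-attention.

    Assumption for illustration: queries, keys, and values are the raw
    inputs (no learned projection matrices).
    """
    d = len(x[0])
    out = []
    for q in x:
        # Every token is scored against every other token in one step,
        # so the distance between positions plays no role.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in x]
        weights = softmax(scores)
        # Each output is a weighted average of all value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, x)) for j in range(d)])
    return out


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = self_attention(tokens)
```

Each output vector is a convex combination of all input vectors, so the first token can draw on the last one as directly as on its neighbour.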
AMR Parsing With Cache Transition Systems
Peng, Xiaochang (University of Rochester) | Gildea, Daniel (University of Rochester) | Satta, Giorgio (University of Padua)
In this paper, we present a transition system that generalizes transition-based dependency parsing techniques to generate AMR graphs rather than tree structures. In addition to a buffer and a stack, we use a fixed-size cache, and allow the system to build arcs to any vertices present in the cache at the same time. The size of the cache provides a parameter that can trade off between the complexity of the graphs that can be built and the ease of predicting actions during parsing. Our results show that a cache transition system can cover almost all AMR graphs with a small cache size, and our end-to-end system achieves competitive results in comparison with other transition-based approaches for AMR parsing.
Improving Sequence-to-Sequence Constituency Parsing
Liu, Lemao (Tencent AI Lab) | Zhu, Muhua (Tencent AI Lab) | Shi, Shuming
Sequence-to-sequence constituency parsing casts the tree-structured prediction problem as a general sequential problem by top-down tree linearization, and thus it is very easy to train in parallel with distributed facilities. Despite its success, it relies on a general-purpose probabilistic attention mechanism, which cannot guarantee that the selected context is informative in the specific parsing scenario. Previous work introduced a deterministic attention to select the informative context for sequence-to-sequence parsing, but it is based on bottom-up linearization, even though top-down linearization has been observed to be better than bottom-up linearization for standard sequence-to-sequence constituency parsing. In this paper, we therefore extend the deterministic attention to operate directly on the top-down tree linearization. Extensive experiments show that our parser delivers substantial improvements in accuracy over the bottom-up linearization, and it achieves 92.3 F-score on the Penn English Treebank section 23 and 85.4 F-score on the Penn Chinese Treebank test dataset, without reranking or semi-supervised training.
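As a rough illustration of top-down tree linearization, the sketch below turns a constituency tree into a flat token sequence by a pre-order traversal. The bracket-token convention (`(S` ... `)S`) and the nested-tuple tree encoding are assumptions made for this example and may differ from the scheme used in the paper; `linearize_top_down` is a hypothetical helper name.

```python
from typing import List, Union

# Assumed encoding: a leaf is a word string; an internal node is a tuple
# (label, child, child, ...).
Tree = Union[str, tuple]


def linearize_top_down(tree: Tree) -> List[str]:
    """Pre-order (top-down) traversal that emits bracket tokens."""
    if isinstance(tree, str):
        return [tree]
    label, *children = tree
    tokens = ["(" + label]          # open the constituent before its children
    for child in children:
        tokens.extend(linearize_top_down(child))
    tokens.append(")" + label)      # close it after all children are emitted
    return tokens


example = ("S", ("NP", "John"), ("VP", ("V", "sleeps")))
tokens = linearize_top_down(example)
# tokens == ["(S", "(NP", "John", ")NP", "(VP", "(V", "sleeps", ")V", ")VP", ")S"]
```

Because the output is an ordinary token sequence, a standard sequence-to-sequence model can be trained to emit it from the input sentence.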
Faithful to the Original: Fact Aware Neural Abstractive Summarization
Cao, Ziqiang (The Hong Kong Polytechnic University) | Wei, Furu (Microsoft Research Asia) | Li, Wenjie (The Hong Kong Polytechnic University) | Li, Sujian (Peking University)
Unlike extractive summarization, abstractive summarization has to fuse different parts of the source text, which tends to create fake facts. Our preliminary study reveals that nearly 30% of the outputs from a state-of-the-art neural summarization system suffer from this problem. While previous abstractive summarization approaches usually focus on improving informativeness, we argue that faithfulness is also a vital prerequisite for a practical abstractive summarization system. To avoid generating fake facts in a summary, we leverage open information extraction and dependency parsing technologies to extract actual fact descriptions from the source text. A dual-attention sequence-to-sequence framework is then proposed to force the generation to be conditioned on both the source text and the extracted fact descriptions. Experiments on the Gigaword benchmark dataset demonstrate that our model can reduce fake summaries by 80%. Notably, the fact descriptions also bring significant improvement in informativeness, since they often condense the meaning of the source text.
Effective Broad-Coverage Deep Parsing
Allen, James F. (IHMC) | Bahkshandeh, Omid (IHMC) | Beaumont, William de (IHMC) | Galescu, Lucian (IHMC) | Teng, Choh Man (IHMC)
Current semantic parsers either compute shallow representations over a wide range of input, or deeper representations in very limited domains. We describe a system that provides broad-coverage, deep semantic parsing designed to work in any domain using a core domain-general lexicon, ontology and grammar. This paper discusses how this core system can be customized for a particularly challenging domain, namely reading research papers in biology. We evaluate these customizations with some ablation experiments.
Learning From Unannotated QA Pairs to Analogically Disambiguate and Answer Questions
Crouse, Maxwell (Northwestern University) | McFate, Clifton (Northwestern University) | Forbus, Kenneth (Northwestern University)
Creating systems that can learn to answer natural language questions has been a longstanding challenge for artificial intelligence. Most prior approaches focused on producing a specialized language system for a particular domain and dataset, and they required training on a large corpus manually annotated with logical forms. This paper introduces an analogy-based approach that instead adapts an existing general-purpose semantic parser to answer questions in a novel domain by jointly learning disambiguation heuristics and query construction templates from purely textual question-answer pairs. Our technique uses possible semantic interpretations of the natural language questions and answers to constrain a query-generation procedure, producing cases during training that are subsequently reused via analogical retrieval and composed to answer test questions. Bootstrapping an existing semantic parser in this way significantly reduces the number of training examples needed to accurately answer questions. We demonstrate the efficacy of our technique using the Geoquery corpus, on which it approaches state-of-the-art performance using 10-fold cross validation, shows little decrease in performance with 2 folds, and achieves above 50% accuracy with as few as 10 examples.
Explicit Reasoning over End-to-End Neural Architectures for Visual Question Answering
Aditya, Somak (Arizona State University) | Yang, Yezhou (Arizona State University) | Baral, Chitta (Arizona State University)
Many vision and language tasks require commonsense reasoning beyond data-driven image and natural language processing. Here we adopt Visual Question Answering (VQA) as an example task, where a system is expected to answer a question in natural language about an image. Current state-of-the-art systems have attempted to solve the task using deep neural architectures and have achieved promising performance. However, the resulting systems are generally opaque, and they struggle to understand questions for which extra knowledge is required. In this paper, we present an explicit reasoning layer on top of a set of penultimate neural network based systems. The reasoning layer enables reasoning and answering questions where additional knowledge is required, and at the same time provides an interpretable interface to the end users. Specifically, the reasoning layer adopts a Probabilistic Soft Logic (PSL) based engine to reason over a basket of inputs: visual relations, the semantic parse of the question, and background ontological knowledge from word2vec and ConceptNet. Experimental analysis of the answers and the key evidential predicates generated on the VQA dataset validates our approach.