Goto

Collaborating Authors

 Grammars & Parsing


AST-Probe: Recovering abstract syntax trees from hidden representations of pre-trained language models

arXiv.org Artificial Intelligence

The objective of pre-trained language models is to learn contextual representations of textual data. Pre-trained language models have become mainstream in natural language processing and code modeling. Using probes, a technique to study the linguistic properties of hidden vector spaces, previous works have shown that these pre-trained language models encode simple linguistic properties in their hidden representations. However, none of the previous work assessed whether these models encode the whole grammatical structure of a programming language. In this paper, we prove the existence of a syntactic subspace, lying in the hidden representations of pre-trained language models, which contain the syntactic information of the programming language. We show that this subspace can be extracted from the models' representations and define a novel probing method, the AST-Probe, that enables recovering the whole abstract syntax tree (AST) of an input code snippet. In our experimentations, we show that this syntactic subspace exists in five state-of-the-art pre-trained language models. In addition, we highlight that the middle layers of the models are the ones that encode most of the AST information. Finally, we estimate the optimal size of this syntactic subspace and show that its dimension is substantially lower than those of the models' representation spaces. This suggests that pre-trained language models use a small part of their representation spaces to encode syntactic information of the programming languages.


That Slepen Al the Nyght with Open Ye! Cross-era Sequence Segmentation with Switch-memory

arXiv.org Artificial Intelligence

The evolution of language follows the rule of gradual change. Grammar, vocabulary, and lexical semantic shifts take place over time, resulting in a diachronic linguistic gap. As such, a considerable amount of texts are written in languages of different eras, which creates obstacles for natural language processing tasks, such as word segmentation and machine translation. Although the Chinese language has a long history, previous Chinese natural language processing research has primarily focused on tasks within a specific era. Therefore, we propose a cross-era learning framework for Chinese word segmentation (CWS), CROSSWISE, which uses the Switch-memory (SM) module to incorporate era-specific linguistic knowledge. Experiments on four corpora from different eras show that the performance of each corpus significantly improves. Further analyses also demonstrate that the SM can effectively integrate the knowledge of the eras into the neural network.


Entity Aware Syntax Tree Based Data Augmentation for Natural Language Understanding

arXiv.org Artificial Intelligence

Understanding the intention of the users and recognizing the semantic entities from their sentences, aka natural language understanding (NLU), is the upstream task of many natural language processing tasks. One of the main challenges is to collect a sufficient amount of annotated data to train a model. Existing research about text augmentation does not abundantly consider entity and thus performs badly for NLU tasks. To solve this problem, we propose a novel NLP data augmentation technique, Entity Aware Data Augmentation (EADA), which applies a tree structure, Entity Aware Syntax Tree (EAST), to represent sentences combined with attention on the entity. Our EADA technique automatically constructs an EAST from a small amount of annotated data, and then generates a large number of training instances for intent detection and slot filling. Experimental results on four datasets showed that the proposed technique significantly outperforms the existing data augmentation methods in terms of both accuracy and generalization ability.


Part-level Action Parsing via a Pose-guided Coarse-to-Fine Framework

arXiv.org Artificial Intelligence

Action recognition from videos, i.e., classifying a video into one of the pre-defined action types, has been a popular topic in the communities of artificial intelligence, multimedia, and signal processing. However, existing methods usually consider an input video as a whole and learn models, e.g., Convolutional Neural Networks (CNNs), with coarse video-level class labels. These methods can only output an action class for the video, but cannot provide fine-grained and explainable cues to answer why the video shows a specific action. Therefore, researchers start to focus on a new task, Part-level Action Parsing (PAP), which aims to not only predict the video-level action but also recognize the frame-level fine-grained actions or interactions of body parts for each person in the video. To this end, we propose a coarse-to-fine framework for this challenging task. In particular, our framework first predicts the video-level class of the input video, then localizes the body parts and predicts the part-level action. Moreover, to balance the accuracy and computation in part-level action parsing, we propose to recognize the part-level actions by segment-level features. Furthermore, to overcome the ambiguity of body parts, we propose a pose-guided positional embedding method to accurately localize body parts. Through comprehensive experiments on a large-scale dataset, i.e., Kinetics-TPS, our framework achieves state-of-the-art performance and outperforms existing methods over a 31.10% ROC score.


Chinese grammatical error correction based on knowledge distillation

arXiv.org Artificial Intelligence

In view of the poor robustness of existing Chinese grammatical error correction models on attack test sets and large model parameters, this paper uses the method of knowledge distillation to compress model parameters and improve the anti-attack ability of the model. In terms of data, the attack test set is constructed by integrating the disturbance into the standard evaluation data set, and the model robustness is evaluated by the attack test set. The experimental results show that the distilled small model can ensure the performance and improve the training speed under the condition of reducing the number of model parameters, and achieve the optimal effect on the attack test set, and the robustness is significantly improved. Code is available at https://github.com/Richard88888/KD-CGEC.


GRILLBot: An Assistant for Real-World Tasks with Neural Semantic Parsing and Graph-Based Representations

arXiv.org Artificial Intelligence

Academic datasets such conversational assistant developed during as MultiWOZ (Budzianowski et al., 2018) capture the 2021/2022 Alexa Prize TaskBot Challenge slot-value pairs from the user utterances within a (Gemmell et al., 2022). GRILLBot aims to be constrained set of domains enabling data-driven an open research platform for complex tasks and neural models. Andreas et al. (2020) extend this supports flexible graph-based task representations, traditional representation towards semantic parsing contextual semantic parsing, and incorporates image with dataflow graphs while constrained to the domain and video content for clarity and instruction. of events booking in the SMCalFlow dataset.


Introduction to Neural Transfer Learning with Transformers for Social Science Text Analysis

arXiv.org Artificial Intelligence

Transformer-based models for transfer learning have the potential to achieve high prediction accuracies on text-based supervised learning tasks with relatively few training data instances. These models are thus likely to benefit social scientists that seek to have as accurate as possible text-based measures but only have limited resources for annotating training data. To enable social scientists to leverage these potential benefits for their research, this paper explains how these methods work, why they might be advantageous, and what their limitations are. Additionally, three Transformer-based models for transfer learning, BERT (Devlin et al. 2019), RoBERTa (Liu et al. 2019), and the Longformer (Beltagy et al. 2020), are compared to conventional machine learning algorithms on three applications. Across all evaluated tasks, textual styles, and training data set sizes, the conventional models are consistently outperformed by transfer learning with Transformers, thereby demonstrating the benefits these models can bring to text-based social science research.


Most Frequently Asked NLP Interview Questions - Analytics Vidhya

#artificialintelligence

This article was published as a part of the Data Science Blogathon. Natural language processing (NLP) is the branch of computer science and, more specifically, the domain of artificial intelligence (AI) that focuses on providing computers the ability to understand written and spoken language in a way similar to that of humans. Combining computational linguistics (rule-based modeling of human language) with statistical, machine learning, and deep learning models is natural language processing (NLP). Together, these technologies enable computers to'understand' the whole meaning of human language in the form of text or speech data, including the speaker's or writer's purpose and emotion. NLP is the driving force behind computer systems that translate text from one language to another, respond to spoken commands, and swiftly summarise massive amounts of information--even in real-time.


A Survey on Text-to-SQL Parsing: Concepts, Methods, and Future Directions

arXiv.org Artificial Intelligence

Text-to-SQL parsing is an essential and challenging task. The goal of text-to-SQL parsing is to convert a natural language (NL) question to its corresponding structured query language (SQL) based on the evidences provided by relational databases. Early text-to-SQL parsing systems from the database community achieved a noticeable progress with the cost of heavy human engineering and user interactions with the systems. In recent years, deep neural networks have significantly advanced this task by neural generation models, which automatically learn a mapping function from an input NL question to an output SQL query. Subsequently, the large pre-trained language models have taken the state-of-the-art of the text-to-SQL parsing task to a new level. In this survey, we present a comprehensive review on deep learning approaches for text-to-SQL parsing. First, we introduce the text-to-SQL parsing corpora which can be categorized as single-turn and multi-turn. Second, we provide a systematical overview of pre-trained language models and existing methods for text-to-SQL parsing. Third, we present readers with the challenges faced by text-to-SQL parsing and explore some potential future directions in this field.


On Unsupervised Training of Link Grammar Based Language Models

arXiv.org Artificial Intelligence

In this short note we explore what is needed for unsupervised training of graph language models based on link grammars. First, we introduce the termination tags formalism required to build a language model based on a link grammar formalism of Sleator and Temperley [21] and discuss the influence of context on the unsupervised learning of link grammars. Second, we propose a statistical link grammar formalism, allowing for statistical language generation. Third, based on the above formalism, we show that the classical dissertation of Yuret [25] on discovery of linguistic relations using lexical attraction ignores contextual properties of the language, and thus the approach to unsupervised language learning relying just on bigrams is flawed. This correlates well with the unimpressive results in unsupervised training of graph language models based on bigram approach of Yuret.