AITopics | Grammars & Parsing

Collaborating Authors

Grammars & Parsing

News Overviews Instructional Materials AI-Alerts Classics

QNLP in Practice: Running Compositional Models of Meaning on a Quantum Computer

Lorenz, Robin (a:1:{s:5:"en_US";s:17:"Cambridge Quantum";}) | Pearson, Anna (Quantinuum) | Meichanetzidis, Konstantinos (Quantinuum) | Kartsaklis, Dimitri (Quantinuum) | Coecke, Bob (Quantinuum)

Journal of Artificial Intelligence ResearchApr-30-2023

Quantum Natural Language Processing (QNLP) deals with the design and implementation of NLP models intended to be run on quantum hardware. In this paper, we present results on the first NLP experiments conducted on Noisy Intermediate-Scale Quantum (NISQ) computers for datasets of size greater than 100 sentences. Exploiting the formal similarity of the compositional model of meaning by Coecke, Sadrzadeh, and Clark (2010) with quantum theory, we create representations for sentences that have a natural mapping to quantum circuits. We use these representations to implement and successfully train NLP models that solve simple sentence classification tasks on quantum hardware. We conduct quantum simulations that compare the syntax-sensitive model of Coecke et al. with two baselines that use less or no syntax; specifically, we implement the quantum analogues of a "bag-of-words" model, where syntax is not taken into account at all, and of a word-sequence model, where only word order is respected. We demonstrate that all models converge smoothly both in simulations and when run on quantum hardware, and that the results are the expected ones based on the nature of the tasks and the datasets used. Another important goal of this paper is to describe in a way accessible to AI and NLP researchers the main principles, process and challenges of experiments on quantum hardware. Our aim in doing this is to take the first small steps in this unexplored research territory and pave the way for practical Quantum Natural Language Processing.

dataset, discocat model, qubit, (14 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1.14329

AI Access Foundation

14329

Journal of Artificial Intelligence Research

Country:

South America > Colombia > Bogotá D.C. > Bogotá (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(5 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.94)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Grammatical Error Correction: A Survey of the State of the Art

Bryant, Christopher, Yuan, Zheng, Qorib, Muhammad Reza, Cao, Hannan, Ng, Hwee Tou, Briscoe, Ted

arXiv.org Artificial IntelligenceApr-29-2023

Grammatical Error Correction (GEC) is the task of automatically detecting and correcting errors in text. The task not only includes the correction of grammatical errors, such as missing prepositions and mismatched subject-verb agreement, but also orthographic and semantic errors, such as misspellings and word choice errors respectively. The field has seen significant progress in the last decade, motivated in part by a series of five shared tasks, which drove the development of rule-based methods, statistical classifiers, statistical machine translation, and finally neural machine translation systems which represent the current dominant state of the art. In this survey paper, we condense the field into a single article and first outline some of the linguistic challenges of the task, introduce the most popular datasets that are available to researchers (for both English and other languages), and summarise the various methods and techniques that have been developed with a particular focus on artificial error generation. We next describe the many different approaches to evaluation as well as concerns surrounding metric reliability, especially in relation to subjective human judgements, before concluding with an overview of recent progress and suggestions for future work and remaining challenges. We hope that this survey will serve as comprehensive resource for researchers who are new to the field or who want to be kept apprised of recent developments.

correction, error correction, proceedings, (13 more...)

arXiv.org Artificial Intelligence

doi: 10.1162/coli_a_00478

2211.05166

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Asia > Singapore (0.04)
Asia > China > Shanghai > Shanghai (0.04)
(5 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.45)
Research Report > Experimental Study (0.45)

Industry:

Education > Educational Setting (0.92)
Education > Educational Technology > Educational Software (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)

Add feedback

R-U-SURE? Uncertainty-Aware Code Suggestions By Maximizing Utility Across Random User Intents

Johnson, Daniel D., Tarlow, Daniel, Walder, Christian

arXiv.org Artificial IntelligenceApr-28-2023

Large language models show impressive results at predicting structured text such as code, but also commonly introduce errors and hallucinations in their output. When used to assist software developers, these models may make mistakes that users must go back and fix, or worse, introduce subtle bugs that users may miss entirely. We propose Randomized Utility-driven Synthesis of Uncertain REgions (R-U-SURE), an approach for building uncertainty-aware suggestions based on a decision-theoretic model of goal-conditioned utility, using random samples from a generative model as a proxy for the unobserved possible intents of the end user. Our technique combines minimum-Bayes-risk decoding, dual decomposition, and decision diagrams in order to efficiently produce structured uncertainty summaries, given only sample access to an arbitrary generative model of code and an optional AST parser. We demonstrate R-U-SURE on three developer-assistance tasks, and show that it can be applied different user interaction patterns without retraining the model and leads to more accurate uncertainty estimates than token-probability baselines. We also release our implementation as an open-source library at https://github.com/google-research/r_u_sure.

artificial intelligence, large language model, natural language, (19 more...)

arXiv.org Artificial Intelligence

2303.00732

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)

Genre:

Overview (0.67)
Research Report (0.63)

Industry:

Information Technology (0.47)
Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.92)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.68)
(2 more...)

Add feedback

CED: Catalog Extraction from Documents

Zhu, Tong, Zhang, Guoliang, Li, Zechang, Yu, Zijian, Ren, Junfei, Wu, Mengsong, Wang, Zhefeng, Huai, Baoxing, Chao, Pingfu, Chen, Wenliang

arXiv.org Artificial IntelligenceApr-28-2023

Sentence-by-sentence information extraction from long documents is an exhausting and error-prone task. As the indicator of document skeleton, catalogs naturally chunk documents into segments and provide informative cascade semantics, which can help to reduce the search space. Despite their usefulness, catalogs are hard to be extracted without the assist from external knowledge. For documents that adhere to a specific template, regular expressions are practical to extract catalogs. However, handcrafted heuristics are not applicable when processing documents from different sources with diverse formats. To address this problem, we build a large manually annotated corpus, which is the first dataset for the Catalog Extraction from Documents (CED) task. Based on this corpus, we propose a transition-based framework for parsing documents into catalog trees. The experimental results demonstrate that our proposed method outperforms baseline systems and shows a good ability to transfer. We believe the CED task could fill the gap between raw text segments and information extraction tasks on extremely long documents. Data and code are available at \url{https://github.com/Spico197/CatalogExtraction}

computational linguistic, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2304.14662

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Maryland > Baltimore (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(6 more...)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.74)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.68)

Add feedback

Cross-Domain Evaluation of POS Taggers: From Wall Street Journal to Fandom Wiki

Hansen, Kia Kirstein, van der Goot, Rob

arXiv.org Artificial IntelligenceApr-27-2023

The Wall Street Journal section of the Penn Treebank has been the de-facto standard for evaluating POS taggers for a long time, and accuracies over 97\% have been reported. However, less is known about out-of-domain tagger performance, especially with fine-grained label sets. Using data from Elder Scrolls Fandom, a wiki about the \textit{Elder Scrolls} video game universe, we create a modest dataset for qualitatively evaluating the cross-domain performance of two POS taggers: the Stanford tagger (Toutanova et al. 2003) and Bilty (Plank et al. 2016), both trained on WSJ. Our analyses show that performance on tokens seen during training is almost as good as in-domain performance, but accuracy on unknown tokens decreases from 90.37% to 78.37% (Stanford) and 87.84\% to 80.41\% (Bilty) across domains. Both taggers struggle with proper nouns and inconsistent capitalization.

machine learning, natural language, tagger, (17 more...)

arXiv.org Artificial Intelligence

2304.13989

Country: Europe > Denmark > Capital Region > Copenhagen (0.05)

Genre: Research Report (0.50)

Industry: Media > News (0.61)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Programmatically Grounded, Compositionally Generalizable Robotic Manipulation

Wang, Renhao, Mao, Jiayuan, Hsu, Joy, Zhao, Hang, Wu, Jiajun, Gao, Yang

arXiv.org Artificial IntelligenceApr-26-2023

Robots operating in the real world require both rich manipulation skills as well as the ability to semantically reason about when to apply those skills. Towards this goal, recent works have integrated semantic representations from large-scale pretrained vision-language (VL) models into manipulation models, imparting them with more general reasoning capabilities. However, we show that the conventional pretraining-finetuning pipeline for integrating such representations entangles the learning of domain-specific action information and domain-general visual information, leading to less data-efficient training and poor generalization to unseen objects and tasks. To this end, we propose ProgramPort, a modular approach to better leverage pretrained VL models by exploiting the syntactic and semantic structures of language instructions. Our framework uses a semantic parser to recover an executable program, composed of functional modules grounded on vision and action across different modalities. Each functional module is realized as a combination of deterministic computation and learnable neural networks. Program execution produces parameters to general manipulation primitives for a robotic end-effector. The entire modular network can be trained with end-to-end imitation learning objectives. Experiments show that our model successfully disentangles action and perception, translating to improved zero-shot and compositional generalization in a variety of manipulation behaviors. Project webpage at: \url{https://progport.github.io}.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2304.13826

Country:

Asia > Vietnam > Hanoi > Hanoi (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.66)

Add feedback

Using Implicit Feedback to Improve Question Generation

Rodrigues, Hugo, Nyberg, Eric, Coheur, Luisa

arXiv.org Artificial IntelligenceApr-26-2023

Question Generation (QG) is a task of Natural Language Processing (NLP) that aims at automatically generating questions from text. Many applications can benefit from automatically generated questions, but often it is necessary to curate those questions, either by selecting or editing them. This task is informative on its own, but it is typically done post-generation, and, thus, the effort is wasted. In addition, most existing systems cannot incorporate this feedback back into them easily. In this work, we present a system, GEN, that learns from such (implicit) feedback. Following a pattern-based approach, it takes as input a small set of sentence/question pairs and creates patterns which are then applied to new unseen sentences. Each generated question, after being corrected by the user, is used as a new seed in the next iteration, so more patterns are created each time. We also take advantage of the corrections made by the user to score the patterns and therefore rank the generated questions. Results show that GEN is able to improve by learning from both levels of implicit feedback when compared to the version with no learning, considering the top 5, 10, and 20 questions. Improvements go up from 10%, depending on the metric and strategy used.

machine learning, natural language, question answering, (19 more...)

arXiv.org Artificial Intelligence

2304.13664

Country:

Europe > Portugal > Lisbon > Lisbon (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
(16 more...)

Genre: Research Report > New Finding (0.34)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)

Add feedback

AMR Parsing with Instruction Fine-tuned Pre-trained Language Models

Lee, Young-Suk, Astudillo, Ramón Fernandez, Florian, Radu, Naseem, Tahira, Roukos, Salim

arXiv.org Artificial IntelligenceApr-24-2023

Instruction fine-tuned language models on a collection of instruction annotated datasets (FLAN) have shown highly effective to improve model performance and generalization to unseen tasks. However, a majority of standard parsing tasks including abstract meaning representation (AMR), universal dependency (UD), semantic role labeling (SRL) has been excluded from the FLAN collections for both model training and evaluations. In this paper, we take one of such instruction fine-tuned pre-trained language models, i.e. FLAN-T5, and fine-tune them for AMR parsing. Our extensive experiments on various AMR parsing tasks including AMR2.0, AMR3.0 and BioAMR indicate that FLAN-T5 fine-tuned models out-perform previous state-of-the-art models across all tasks. In addition, full fine-tuning followed by the parameter efficient fine-tuning, LoRA, further improves the model performances, setting new state-of-the-arts in Smatch on AMR2.0 (86.4), AMR3.0 (84.9) and BioAMR (82.3).

artificial intelligence, fine-tuning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2304.12272

Country:

Asia > Taiwan (0.05)
Europe > Bulgaria > Sofia City Province > Sofia (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Add feedback

Does Manipulating Tokenization Aid Cross-Lingual Transfer? A Study on POS Tagging for Non-Standardized Languages

Blaschke, Verena, Schütze, Hinrich, Plank, Barbara

arXiv.org Artificial IntelligenceApr-20-2023

One of the challenges with finetuning pretrained language models (PLMs) is that their tokenizer is optimized for the language(s) it was pretrained on, but brittle when it comes to previously unseen variations in the data. This can for instance be observed when finetuning PLMs on one language and evaluating them on data in a closely related language variety with no standardized orthography. Despite the high linguistic similarity, tokenization no longer corresponds to meaningful representations of the target data, leading to low performance in, e.g., part-of-speech tagging. In this work, we finetune PLMs on seven languages from three different families and analyze their zero-shot performance on closely related, non-standardized varieties. We consider different measures for the divergence in the tokenization of the source and target data, and the way they can be adjusted by manipulating the tokenization during the finetuning step. Overall, we find that the similarity between the percentage of words that get split into subwords in the source and target data (the split word ratio difference) is the strongest predictor for model performance on target data.

computational linguistic, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2304.10158

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Iceland > Capital Region > Reykjavik (0.04)
(20 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.48)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.34)

Add feedback

On the Robustness of Aspect-based Sentiment Analysis: Rethinking Model, Data, and Training

Fei, Hao, Chua, Tat-Seng, Li, Chenliang, Ji, Donghong, Zhang, Meishan, Ren, Yafeng

arXiv.org Artificial IntelligenceApr-19-2023

Aspect-based sentiment analysis (ABSA) aims at automatically inferring the specific sentiment polarities toward certain aspects of products or services behind the social media texts or reviews, which has been a fundamental application to the real-world society. Since the early 2010s, ABSA has achieved extraordinarily high accuracy with various deep neural models. However, existing ABSA models with strong in-house performances may fail to generalize to some challenging cases where the contexts are variable, i.e., low robustness to real-world environments. In this study, we propose to enhance the ABSA robustness by systematically rethinking the bottlenecks from all possible angles, including model, data, and training. First, we strengthen the current best-robust syntax-aware models by further incorporating the rich external syntactic dependencies and the labels with aspect simultaneously with a universal-syntax graph convolutional network. In the corpus perspective, we propose to automatically induce high-quality synthetic training data with various types, allowing models to learn sufficient inductive bias for better robustness. Last, we based on the rich pseudo data perform adversarial training to enhance the resistance to the context perturbation and meanwhile employ contrastive learning to reinforce the representations of instances with contrastive sentiments. Extensive robustness evaluations are conducted. The results demonstrate that our enhanced syntax-aware model achieves better robustness performances than all the state-of-the-art baselines. By additionally incorporating our synthetic corpus, the robust testing results are pushed with around 10% accuracy, which are then further improved by installing the advanced training strategies. In-depth analyses are presented for revealing the factors influencing the ABSA robustness.

artificial intelligence, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3564281

2304.09563

Country:

Asia > China > Hubei Province > Wuhan (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)
Asia > China > Heilongjiang Province > Harbin (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Education (0.67)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
(3 more...)

Add feedback