AITopics | Grammars & Parsing

End-to-End Argument Mining over Varying Rhetorical Structures

arXiv.org Artificial Intelligence, Jan-20-2024

Rhetorical Structure Theory (RST) implies no single discourse interpretation of a text, and the limitations of RST parsers further exacerbate inconsistent parsing of similar structures. Therefore, it is important to take into account that the same argumentative structure can be found in semantically similar texts with varying rhetorical structures. In this work, the differences between paraphrases within the same argument scheme are evaluated from a rhetorical perspective. The study proposes a deep dependency parsing model to assess the connection between rhetorical and argument structures. The model utilizes rhetorical relations; RST structures of paraphrases serve as training data augmentations. The method allows for end-to-end argumentation analysis using a rhetorical tree instead of a word sequence. It is evaluated on the bilingual Microtexts corpus, and the first results on fully-fledged argument parsing for the Russian version of the corpus are reported. The results suggest that argument mining can benefit from multiple variants of discourse structure.

corpus, parser, relation, (14 more...)

doi: 10.18653/v1/2023.findings-acl.209

arXiv: 2401.11218

Country:

Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
Europe > Portugal > Lisbon > Lisbon (0.04)
Europe > Italy > Tuscany > Florence (0.04)
(12 more...)

Genre: Research Report > New Finding (0.88)

Industry: Education (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.77)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.69)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.48)

Structured Code Representations Enable Data-Efficient Adaptation of Code Language Models

Agarwal, Mayank, Shen, Yikang, Wang, Bailin, Kim, Yoon, Chen, Jie

arXiv.org Artificial Intelligence, Jan-19-2024

Current language models tailored for code tasks often adopt the pre-training-then-fine-tuning paradigm from natural language processing, modeling source code as plain text. This approach, however, overlooks the unambiguous structures inherent in programming languages. In this work, we explore data-efficient adaptation of pre-trained code models by further pre-training and fine-tuning them with program structures. Specifically, we represent programs as parse trees -- also known as concrete syntax trees (CSTs) -- and adapt pre-trained models on serialized CSTs. Although the models that we adapt have been pre-trained only on the surface form of programs, we find that a small amount of continual pre-training and fine-tuning on CSTs without changing the model architecture yields improvements over the baseline approach across various code tasks. The improvements are found to be particularly significant when there are limited training examples, demonstrating the effectiveness of integrating program structures with plain-text representation even when working with backbone models that have not been pre-trained with structures.
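
To make the idea of a serialized CST concrete, here is a minimal sketch in Python. The tree encoding (nested tuples), the node labels, and the tag-style output format are all assumptions made for illustration; the abstract does not specify the serialization the authors use.

```python
from typing import List, Tuple, Union

# A node is either a leaf token (str) or a (label, children) pair.
# This encoding is hypothetical, chosen only for the example.
Node = Union[str, Tuple[str, List["Node"]]]

def serialize(node: Node) -> str:
    """Flatten a tree into one string, wrapping each subtree in label tags."""
    if isinstance(node, str):
        return node  # leaf: the surface token itself
    label, children = node
    inner = " ".join(serialize(child) for child in children)
    return f"<{label}> {inner} </{label}>"

# Hand-built tree for the statement `x = 1` (labels are illustrative).
tree: Node = ("assignment", [("identifier", ["x"]), "=", ("integer", ["1"])])
print(serialize(tree))
# <assignment> <identifier> x </identifier> = <integer> 1 </integer> </assignment>
```

Because every surface token survives as a leaf, the original program text can still be read off the serialized form, which is one reason a text-only model might be adapted to it without architectural changes.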

dataset, objective, translation, (15 more...)

arXiv: 2401.10716

Country: Europe > France (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Simple and effective data augmentation for compositional generalization

Yao, Yuekun, Koller, Alexander

arXiv.org Artificial Intelligence, Jan-18-2024

Compositional generalization, the ability to predict complex meanings from training on simpler sentences, poses challenges for powerful pretrained seq2seq models. In this paper, we show that data augmentation methods that sample meaning representations (MRs) and backtranslate them can be effective for compositional generalization, but only if we sample from the right distribution. Remarkably, sampling from a uniform distribution performs almost as well as sampling from the test distribution, and greatly outperforms earlier methods that sampled from the training distribution. We further conduct experiments to investigate why this happens and where the benefit of such data augmentation methods comes from.
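
As a rough illustration of the sampling half of this pipeline, the sketch below draws meaning representations from a toy grammar by choosing uniformly among the productions of each nonterminal. The grammar, the symbol names, and the reading of "uniform" as uniform over productions are assumptions; the abstract does not define them, and the backtranslation step (which needs a trained model) is omitted.

```python
import random
from typing import List, Optional

# Hypothetical toy grammar for meaning representations.
MR_GRAMMAR = {
    "MR": [["PRED", "(", "ARG", ")"], ["PRED", "(", "ARG", ",", "ARG", ")"]],
    "PRED": [["see"], ["like"]],
    "ARG": [["cat"], ["dog"], ["MR"]],
}

def sample_mr(symbol: str = "MR", rng: Optional[random.Random] = None,
              depth: int = 0, max_depth: int = 3) -> List[str]:
    """Expand `symbol`, picking uniformly among its productions."""
    rng = rng or random.Random()
    if symbol not in MR_GRAMMAR:
        return [symbol]  # terminal token
    options = MR_GRAMMAR[symbol]
    if depth >= max_depth:
        # Bound the recursion: prefer productions that do not nest another MR.
        options = [o for o in options if "MR" not in o] or options
    production = rng.choice(options)
    tokens: List[str] = []
    for child in production:
        tokens.extend(sample_mr(child, rng, depth + 1, max_depth))
    return tokens

rng = random.Random(0)  # seeded only to make the example repeatable
for _ in range(3):
    print(" ".join(sample_mr(rng=rng)))
```

Each sampled MR would then be paired with a sentence produced by a backtranslation model to form an augmented training example.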

generalization, grammar, representation, (16 more...)

arXiv: 2401.09815

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Washington > King County > Seattle (0.04)
North America > Dominican Republic (0.04)
(11 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Compositional Program Generation for Few-Shot Systematic Generalization

Klinger, Tim, Liu, Luke, Dan, Soham, Crouse, Maxwell, Ram, Parikshit, Gray, Alexander

arXiv.org Artificial Intelligence, Jan-18-2024

Compositional generalization is a key ability of humans that enables us to learn new concepts from only a handful of examples. Neural machine learning models, including the now ubiquitous Transformers, struggle to generalize in this way, and typically require thousands of examples of a concept during training in order to generalize meaningfully. This difference in ability between humans and artificial neural architectures motivates this study of a neuro-symbolic architecture called the Compositional Program Generator (CPG). CPG has three key features: modularity, composition, and abstraction, in the form of grammar rules, that enable it to generalize both systematically to new concepts in a few-shot manner and productively by length on various sequence-to-sequence language tasks. For each input, CPG uses a grammar of the input language and a parser to generate a parse in which each grammar rule is assigned its own unique semantic module, a probabilistic copy or substitution program. Instances with the same parse are always processed with the same composed modules, while those with different parses may be processed with different modules. CPG learns parameters for the modules and is able to learn the semantics for new rules and types incrementally, without forgetting or retraining on rules it has already seen. It achieves perfect generalization on both the SCAN and COGS benchmarks using just 14 examples for SCAN and 22 examples for COGS -- state-of-the-art accuracy with a 1000x improvement in sample efficiency.
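
The "one module per grammar rule, composed along the parse" idea can be sketched as follows. This is not the CPG implementation: the modules here are hand-written deterministic functions rather than learned probabilistic copy or substitution programs, and the rule names, the parse encoding, and the SCAN-like command are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# A parse is (rule name, list of sub-parses); an assumed encoding.
Parse = Tuple[str, list]

# One semantic module per grammar rule. Each module maps the outputs of
# its children to an output token sequence.
MODULES: Dict[str, Callable[[List[List[str]]], List[str]]] = {
    "V->jump": lambda children: ["JUMP"],
    "V->walk": lambda children: ["WALK"],
    "S->V": lambda children: children[0],                      # copy
    "S->V twice": lambda children: children[0] * 2,            # repeat
    "S->S and S": lambda children: children[0] + children[1],  # concatenate
}

def interpret(parse: Parse) -> List[str]:
    """Apply each rule's module bottom-up along the parse tree."""
    rule, subparses = parse
    return MODULES[rule]([interpret(p) for p in subparses])

# Hand-written parse of the command "jump twice and walk".
parse: Parse = (
    "S->S and S",
    [("S->V twice", [("V->jump", [])]), ("S->V", [("V->walk", [])])],
)
print(" ".join(interpret(parse)))  # JUMP JUMP WALK
```

Adding a new verb only requires a new entry for its rule; the existing modules are reused unchanged, which mirrors the incremental learning described in the abstract.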

generalization, grammar, module, (16 more...)

arXiv: 2309.16467

Country:

Asia > China > Guangxi Province > Nanning (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > New York (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Grammar-Constrained Decoding for Structured NLP Tasks without Finetuning

Geng, Saibo, Josifoski, Martin, Peyrard, Maxime, West, Robert

arXiv.org Artificial Intelligence, Jan-18-2024

Despite their impressive performance, large language models (LMs) still struggle with reliably generating complex output structures when not finetuned to follow the required output format exactly. To address this issue, grammar-constrained decoding (GCD) can be used to control the generation of LMs, guaranteeing that the output follows a given structure. Most existing GCD methods are, however, limited to specific tasks, such as parsing or code generation. In this work, we demonstrate that formal grammars can describe the output space for a much wider range of tasks and argue that GCD can serve as a unified framework for structured NLP tasks in general. For increased flexibility, we introduce input-dependent grammars, which allow the grammar to depend on the input and thus enable the generation of different output structures for different inputs. We then empirically demonstrate the power and flexibility of GCD-enhanced LMs on (1) information extraction, (2) entity disambiguation, and (3) constituency parsing. Our results indicate that grammar-constrained LMs substantially outperform unconstrained LMs or even beat task-specific finetuned models. Grammar constraints thus hold great promise for harnessing off-the-shelf LMs for a wide range of structured NLP tasks, especially where training data is scarce or finetuning is expensive. Code and data: https://github.com/epfl-dlab/GCD.
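
A minimal sketch of the decoding idea, under strong simplifying assumptions: the grammar is a small finite-state machine rather than a general formal grammar, decoding is greedy, and `toy_scores` is a hypothetical stand-in for a language model's next-token scores. None of the names below come from the authors' repository.

```python
from typing import Callable, Dict, List

Grammar = Dict[str, Dict[str, str]]  # state -> {allowed token: next state}

def build_grammar(candidates: List[str]) -> Grammar:
    """Input-dependent grammar: the output must be '[' + one candidate + ']'."""
    return {
        "start": {"[": "open"},
        "open": {name: "entity" for name in candidates},
        "entity": {"]": "done"},
    }

def toy_scores(prefix: List[str]) -> Dict[str, float]:
    """Hypothetical LM stand-in; it ignores the prefix and favours 'hello'."""
    return {"hello": 3.0, "Bob": 2.0, "Alice": 1.0, "[": 0.5, "]": 0.1}

def constrained_greedy_decode(
    scores_fn: Callable[[List[str]], Dict[str, float]],
    grammar: Grammar,
    max_steps: int = 10,
) -> List[str]:
    """At each step, pick the best-scoring token among those the grammar allows."""
    state, output = "start", []
    for _ in range(max_steps):
        if state == "done":
            break
        allowed = grammar[state]
        scores = scores_fn(output)
        token = max(allowed, key=lambda t: scores.get(t, float("-inf")))
        output.append(token)
        state = allowed[token]
    return output

print(constrained_greedy_decode(toy_scores, build_grammar(["Alice", "Bob"])))
# ['[', 'Bob', ']']
print(constrained_greedy_decode(toy_scores, build_grammar(["Alice"])))
# ['[', 'Alice', ']']
```

Unconstrained greedy decoding with the same scores would emit "hello" first; masking to grammar-allowed tokens is what forces a well-formed output, and rebuilding the grammar per input changes which entities are reachable.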

computational linguistic, grammar, parse tree, (15 more...)

arXiv: 2305.13971

Country:

North America > United States > Washington > King County > Seattle (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Africa > Burundi > Gitega > Gitega (0.04)
(15 more...)

Genre: Research Report (0.84)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Morphology and Syntax of the Tamil Language

Sarveswaran, Kengatharaiyer

arXiv.org Artificial Intelligence, Jan-16-2024

This paper provides an overview of the morphology and syntax of the Tamil language, focusing on its contemporary usage. The paper also highlights the complexity and richness of Tamil in terms of its morphological and syntactic features, which will be useful for linguists analysing the language and conducting comparative studies. In addition, the paper will be useful for those developing computational resources for the Tamil language. Its usefulness is demonstrated by a rule-based morphological analyser-cum-generator and a computational grammar for Tamil that have already been developed on the basis of this paper. To enhance accessibility for a broader audience, the analysis is conducted without relying on any specific grammatical formalism.

construction, tamil, verb, (17 more...)

arXiv: 2401.08367

Country:

Europe > Austria > Vienna (0.14)
Asia > Sri Lanka > Northern Province > Jaffna District > Jaffna (0.04)
Europe > Sweden > Uppsala County > Uppsala (0.04)
(20 more...)

Genre: Overview (0.68)

Industry: Education (0.67)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Quantum Transfer Learning for Acceptability Judgements

Buonaiuto, Giuseppe, Guarasci, Raffaele, Minutolo, Aniello, De Pietro, Giuseppe, Esposito, Massimo

arXiv.org Artificial Intelligence, Jan-15-2024

Hybrid quantum-classical classifiers promise to positively impact critical aspects of natural language processing tasks, particularly classification-related ones. Among the possibilities currently investigated, quantum transfer learning, i.e., using a quantum circuit for fine-tuning pre-trained classical models for a specific task, is attracting significant attention as a potential platform for proving quantum advantage. This work shows potential advantages, both in terms of performance and expressiveness, of quantum transfer learning algorithms trained on embedding vectors extracted from a large language model to perform classification on a classical linguistics task: acceptability judgments. Acceptability judgment is the ability to determine whether a sentence is considered natural and well-formed by a native speaker. The approach has been tested on sentences extracted from ItaCoLa, a corpus that collects Italian sentences labeled with their acceptability judgment. The evaluation phase shows results for the quantum transfer learning pipeline comparable to state-of-the-art classical transfer learning algorithms, proving current quantum computers' capabilities to tackle NLP tasks for ready-to-use applications. Furthermore, a qualitative linguistic analysis, aided by explainable AI methods, reveals the capabilities of quantum transfer learning algorithms to correctly classify complex and more structured sentences, compared to their classical counterparts. This finding sets the ground for a quantifiable quantum advantage in NLP in the near future.

acceptability judgment, classification, quantum transfer learning, (13 more...)

arXiv: 2401.07777

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Italy (0.04)
North America > Dominican Republic (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

GWPT: A Green Word-Embedding-based POS Tagger

Wei, Chengwei, Pang, Runqi, Kuo, C. -C. Jay

arXiv.org Artificial Intelligence, Jan-15-2024

As a fundamental tool for natural language processing (NLP), the part-of-speech (POS) tagger assigns a POS label to each word in a sentence. A novel lightweight POS tagger based on word embeddings is proposed and named GWPT (green word-embedding-based POS tagger) in this work. Following the green learning (GL) methodology, GWPT contains three modules in cascade: 1) representation learning, 2) feature learning, and 3) decision learning. The main novelty of GWPT lies in representation learning. It uses non-contextual or contextual word embeddings, partitions embedding dimension indices into low-, medium-, and high-frequency sets, and represents them with different N-grams. Experimental results show that GWPT offers state-of-the-art accuracies with fewer model parameters and significantly lower computational complexity in both training and inference compared with deep-learning-based methods.

complexity, dimension, proceedings, (16 more...)

arXiv: 2401.07475

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Pennsylvania (0.04)
North America > United States > New York (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Discovering Salient Neurons in Deep NLP Models

Durrani, Nadir, Dalvi, Fahim, Sajjad, Hassan

arXiv.org Artificial Intelligence, Jan-14-2024

While much work has been done on understanding the representations learned within deep NLP models and the knowledge they capture, little attention has been paid to individual neurons. We present a technique called Linguistic Correlation Analysis to extract salient neurons in the model with respect to any extrinsic property, with the goal of understanding how such knowledge is preserved within neurons. We carry out a fine-grained analysis to answer the following questions: (i) can we identify subsets of neurons in the network that capture specific linguistic properties? (ii) how localized or distributed are neurons across the network? (iii) how redundantly is the information preserved? (iv) how does fine-tuning pre-trained models towards downstream NLP tasks impact the learned linguistic knowledge? (v) how do architectures vary in learning different linguistic properties? Our data-driven, quantitative analysis illuminates interesting findings: (i) we found small subsets of neurons that can predict different linguistic tasks; (ii) neurons capturing basic lexical information (such as suffixation) are localized in the lowermost layers; (iii) those learning complex concepts (such as syntactic role) lie predominantly in the middle and higher layers; (iv) salient linguistic neurons are relocated from higher to lower layers during transfer learning, as the network preserves the higher layers for task-specific information; (v) we found interesting differences across pre-trained models with respect to how linguistic information is preserved within them; and (vi) we found that concepts exhibit similar neuron distributions across different languages in the multilingual transformer models. Our code is publicly available as part of the NeuroX toolkit.

computational linguistic, neuron, proceedings, (12 more...)

arXiv: 2206.13288

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Texas > Travis County > Austin (0.14)
North America > United States > Washington > King County > Seattle (0.14)
(30 more...)

Genre: Research Report > New Finding (1.00)

Industry: Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Multi-Task Learning for Front-End Text Processing in TTS

Kang, Wonjune, Wang, Yun, Zhang, Shun, Hinsvark, Arthur, He, Qing

arXiv.org Artificial Intelligence, Jan-11-2024

We propose a multi-task learning (MTL) model for jointly performing three tasks that are commonly solved in a text-to-speech (TTS) front-end: text normalization (TN), part-of-speech (POS) tagging, and homograph disambiguation (HD). Our framework utilizes a tree-like structure with a trunk that learns shared representations, followed by separate task-specific heads. We further incorporate a pre-trained language model to utilize its built-in lexical and contextual knowledge, and study how to best use its embeddings so as to most effectively benefit our multi-task model. Through task-wise ablations, we show that our full model trained on all three tasks achieves the strongest overall performance compared to models trained on individual or sub-combinations of tasks, confirming the advantages of our MTL framework. Finally, we introduce a new HD dataset containing a balanced number of sentences in diverse contexts for a variety of homographs and their pronunciations. We demonstrate that incorporating this dataset into training significantly improves HD performance over only using a commonly used, but imbalanced, pre-existing dataset.

dataset, proceedings, pronunciation, (14 more...)

arXiv: 2401.06321

Country:

North America > United States > Massachusetts (0.04)
North America > United States > Colorado (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.98)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.67)
