AITopics | Grammars & Parsing

Collaborating Authors

Grammars & Parsing

News Overviews Instructional Materials AI-Alerts Classics

ViANLI: Adversarial Natural Language Inference for Vietnamese

Van Huynh, Tin, Van Nguyen, Kiet, Nguyen, Ngan Luu-Thuy

arXiv.org Artificial IntelligenceJul-1-2024

The development of Natural Language Processing (NLI) datasets and models has been inspired by innovations in annotation design. With the rapid development of machine learning models today, the performance of existing machine learning models has quickly reached state-of-the-art results on a variety of tasks related to natural language processing, including natural language inference tasks. By using a pre-trained model during the annotation process, it is possible to challenge current NLI models by having humans produce premise-hypothesis combinations that the machine model cannot correctly predict. To remain attractive and challenging in the research of natural language inference for Vietnamese, in this paper, we introduce the adversarial NLI dataset to the NLP research community with the name ViANLI. This data set contains more than 10K premise-hypothesis pairs and is built by a continuously adjusting process to obtain the most out of the patterns generated by the annotators. ViANLI dataset has brought many difficulties to many current SOTA models when the accuracy of the most powerful model on the test set only reached 48.4%. Additionally, the experimental results show that the models trained on our dataset have significantly improved the results on other Vietnamese NLI datasets.

dataset, hypothesis sentence, springer nature 2021, (12 more...)

arXiv.org Artificial Intelligence

2406.17716

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Vietnam > Bắc Giang Province > Bắc Giang (0.04)
Asia > Vietnam > Hồ Chí Minh City > Hồ Chí Minh City (0.04)
(7 more...)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.68)

Add feedback

CLEME2.0: Towards More Interpretable Evaluation by Disentangling Edits for Grammatical Error Correction

Ye, Jingheng, Xu, Zishan, Li, Yinghui, Cheng, Xuxin, Song, Linlin, Zhou, Qingyu, Zheng, Hai-Tao, Shen, Ying, Su, Xin

arXiv.org Artificial IntelligenceJun-30-2024

The paper focuses on improving the interpretability of Grammatical Error Correction (GEC) metrics, which receives little attention in previous studies. To bridge the gap, we propose CLEME2.0, a reference-based evaluation strategy that can describe four elementary dimensions of GEC systems, namely hit-correction, error-correction, under-correction, and over-correction. They collectively contribute to revealing the critical characteristics and locating drawbacks of GEC systems. Evaluating systems by Combining these dimensions leads to high human consistency over other reference-based and reference-less metrics. Extensive experiments on 2 human judgement datasets and 6 reference datasets demonstrate the effectiveness and robustness of our method. All the codes will be released after the peer review.

cleme2, correction, error correction, (15 more...)

arXiv.org Artificial Intelligence

2407.00934

Country:

Europe > Austria > Vienna (0.14)
Asia > Singapore (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(10 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Data Science > Data Quality > Data Cleaning (0.84)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.72)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.67)

Add feedback

EXCGEC: A Benchmark of Edit-wise Explainable Chinese Grammatical Error Correction

Ye, Jingheng, Qin, Shang, Li, Yinghui, Cheng, Xuxin, Qin, Libo, Zheng, Hai-Tao, Xing, Peng, Xu, Zishan, Cheng, Guo, Wei, Zhao

arXiv.org Artificial IntelligenceJun-30-2024

Existing studies explore the explainability of Grammatical Error Correction (GEC) in a limited scenario, where they ignore the interaction between corrections and explanations. To bridge the gap, this paper introduces the task of EXplainable GEC (EXGEC), which focuses on the integral role of both correction and explanation tasks. To facilitate the task, we propose EXCGEC, a tailored benchmark for Chinese EXGEC consisting of 8,216 explanation-augmented samples featuring the design of hybrid edit-wise explanations. We benchmark several series of LLMs in multiple settings, covering post-explaining and pre-explaining. To promote the development of the task, we introduce a comprehensive suite of automatic metrics and conduct human evaluation experiments to demonstrate the human consistency of the automatic metrics for free-text explanations. All the codes and data will be released after the review.

computational linguistic, explanation, grammatical error, (13 more...)

arXiv.org Artificial Intelligence

2407.00924

Country:

Europe > Austria > Vienna (0.14)
Asia > Singapore (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(10 more...)

Genre: Research Report (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.88)
(2 more...)

Add feedback

eFontes. Part of Speech Tagging and Lemmatization of Medieval Latin Texts.A Cross-Genre Survey

Nowak, Krzysztof, Ziębura, Jędrzej, Wróbel, Krzysztof, Smywiński-Pohl, Aleksander

arXiv.org Artificial IntelligenceJun-29-2024

This study introduces the eFontes models for automatic linguistic annotation of Medieval Latin texts, focusing on lemmatization, part-of-speech tagging, and morphological feature determination. Using the Transformers library, these models were trained on Universal Dependencies (UD) corpora and the newly developed eFontes corpus of Polish Medieval Latin. The research evaluates the models' performance, addressing challenges such as orthographic variations and the integration of Latinized vernacular terms. The models achieved high accuracy rates: lemmatization at 92.60%, part-of-speech tagging at 83.29%, and morphological feature determination at 88.57%. The findings underscore the importance of high-quality annotated corpora and propose future enhancements, including extending the models to Named Entity Recognition.

corpus, proceedings, scenario, (14 more...)

arXiv.org Artificial Intelligence

2407.00418

Country:

Europe > Poland > Lesser Poland Province > Kraków (0.04)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
Europe > Western Europe (0.04)
(7 more...)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.81)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.54)

Add feedback

RuBLiMP: Russian Benchmark of Linguistic Minimal Pairs

Taktasheva, Ekaterina, Bazhukov, Maxim, Koncha, Kirill, Fenogenova, Alena, Artemova, Ekaterina, Mikhailov, Vladislav

arXiv.org Artificial IntelligenceJun-28-2024

Minimal pairs are a well-established approach to evaluating the grammatical knowledge of language models. However, existing resources for minimal pairs address a limited number of languages and lack diversity of language-specific grammatical phenomena. This paper introduces the Russian Benchmark of Linguistic Minimal Pairs (RuBLiMP), which includes 45k pairs of sentences that differ in grammaticality and isolate a morphological, syntactic, or semantic phenomenon. In contrast to existing benchmarks of linguistic minimal pairs, RuBLiMP is created by applying linguistic perturbations to automatically annotated sentences from open text corpora and carefully curating test data. We describe the data collection protocol and present the results of evaluating 25 language models in various scenarios. We find that the widely used language models for Russian are sensitive to morphological and agreement-oriented contrasts but fall behind humans on phenomena requiring understanding of structural relations, negation, transitivity, and tense. RuBLiMP, the codebase, and other materials are publicly available.

computational linguistic, minimal pair, rublimp, (13 more...)

arXiv.org Artificial Intelligence

2406.19232

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
North America > Dominican Republic (0.04)
(18 more...)

Genre: Research Report > New Finding (0.45)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Detection and Measurement of Syntactic Templates in Generated Text

Shaib, Chantal, Elazar, Yanai, Li, Junyi Jessy, Wallace, Byron C.

arXiv.org Artificial IntelligenceJun-28-2024

Recent work on evaluating the diversity of text generated by LLMs has focused on word-level features. Here we offer an analysis of syntactic features to characterize general repetition in models, beyond frequent n-grams. Specifically, we define syntactic templates and show that models tend to produce templated text in downstream tasks at a higher rate than what is found in human-reference texts. We find that most (76%) templates in model-generated text can be found in pre-training data (compared to only 35% of human-authored text), and are not overwritten during fine-tuning processes such as RLHF. This connection to the pre-training data allows us to analyze syntactic templates in models where we do not have the pre-training data. We also find that templates as features are able to differentiate between models, tasks, and domains, and are useful for qualitatively evaluating common model constructions. Finally, we demonstrate the use of templates as a useful tool for analyzing style memorization of training data in LLMs.

dataset, diversity, template, (12 more...)

arXiv.org Artificial Intelligence

2407.00211

Country:

North America > United States > California > San Francisco County > San Francisco (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
Europe > Spain > Valencian Community > Valencia Province > Valencia (0.04)
Asia (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > Strength High (0.67)

Industry:

Leisure & Entertainment (0.93)
Media > Film (0.68)
Health & Medicine > Therapeutic Area (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.68)

Add feedback

AMBROSIA: A Benchmark for Parsing Ambiguous Questions into Database Queries

Saparina, Irina, Lapata, Mirella

arXiv.org Artificial IntelligenceJun-27-2024

Practical semantic parsers are expected to understand user utterances and map them to executable programs, even when these are ambiguous. We introduce a new benchmark, AMBROSIA, which we hope will inform and inspire the development of text-to-SQL parsers capable of recognizing and interpreting ambiguous requests. Our dataset contains questions showcasing three different types of ambiguity (scope ambiguity, attachment ambiguity, and vagueness), their interpretations, and corresponding SQL queries. In each case, the ambiguity persists even when the database context is provided. This is achieved through a novel approach that involves controlled generation of databases from scratch. We benchmark various LLMs on AMBROSIA, revealing that even the most advanced models struggle to identify and interpret ambiguity in questions.

ambiguity, interpretation, sql query, (15 more...)

arXiv.org Artificial Intelligence

2406.19073

Country:

Europe > United Kingdom (0.28)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
North America > Canada > Ontario > Toronto (0.04)
(11 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine (0.93)
Law (0.92)
Government (0.67)
(3 more...)

Technology:

Information Technology > Databases (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
(2 more...)

Add feedback

Handling Ontology Gaps in Semantic Parsing

Bacciu, Andrea, Damonte, Marco, Basaldella, Marco, Monti, Emilio

arXiv.org Artificial IntelligenceJun-27-2024

The majority of Neural Semantic Parsing (NSP) models are developed with the assumption that there are no concepts outside the ones such models can represent with their target symbols (closed-world assumption). This assumption leads to generate hallucinated outputs rather than admitting their lack of knowledge. Hallucinations can lead to wrong or potentially offensive responses to users. Hence, a mechanism to prevent this behavior is crucial to build trusted NSP-based Question Answering agents. To that end, we propose the Hallucination Simulation Framework (HSF), a general setting for stimulating and analyzing NSP model hallucinations. The framework can be applied to any NSP task with a closed-ontology. Using the proposed framework and KQA Pro as the benchmark dataset, we assess state-of-the-art techniques for hallucination detection. We then present a novel hallucination detection strategy that exploits the computational graph of the NSP model to detect the NSP hallucinations in the presence of ontology gaps, out-of-domain utterances, and to recognize NSP errors, improving the F1-Score respectively by ~21, ~24% and ~1%. This is the first work in closed-ontology NSP that addresses the problem of recognizing ontology gaps. We release our code and checkpoints at https://github.com/amazon-science/handling-ontology-gaps-in-semantic-parsing.

hallucination, nsp model, utterance, (15 more...)

arXiv.org Artificial Intelligence

2406.19537

Country:

North America > Canada (0.14)
Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Add feedback

Symbolic Prompt Program Search: A Structure-Aware Approach to Efficient Compile-Time Prompt Optimization

Schnabel, Tobias, Neville, Jennifer

arXiv.org Artificial IntelligenceJun-27-2024

In many modern LLM applications, such as retrieval augmented generation, prompts have become programs themselves. In these settings, prompt programs are repeatedly called with different user queries or data instances. A big practical challenge is optimizing such prompt programs. Recent work has mostly focused on either simple prompt programs or assumed that the general structure of a prompt program is fixed. We introduce SAMMO, a framework to perform symbolic prompt program search for compile-time optimizations of prompt programs. SAMMO represents prompt programs on a symbolic level which allows for a rich set of transformations that can be searched over during optimization. We show that SAMMO generalizes previous methods and improves the performance of complex prompts on (1) instruction tuning, (2) RAG pipeline tuning, and (3) prompt compression, across several different LLMs. We make all code available open-source at https://github.com/microsoft/sammo .

instruction, pos tag, prompt program, (16 more...)

arXiv.org Artificial Intelligence

2404.02319

Country:

North America > United States > New York (0.05)
South America > Brazil (0.05)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.40)

Industry: Leisure & Entertainment (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Grammar Assistance Using Syntactic Structures (GAUSS)

Zamaraeva, Olga, Allegue, Lorena S., Gómez-Rodríguez, Carlos, Ogneva, Anastasiia, Alonso-Ramos, Margarita

arXiv.org Artificial IntelligenceJun-26-2024

Automatic grammar coaching serves an important purpose of advising on standard grammar varieties while not imposing social pressures or reinforcing established social roles. Such systems already exist but most of them are for English and few of them offer meaningful feedback. Furthermore, they typically rely completely on neural methods and require huge computational resources which most of the world cannot afford. We propose a grammar coaching system for Spanish that relies on (i) a rich linguistic formalism capable of giving informative feedback; and (ii) a faster parsing algorithm which makes using this formalism practical in a real-world application. The approach is feasible for any language for which there is a computerized grammar and is less reliant on expensive and environmentally costly neural methods. We seek to contribute to Greener AI and to address global education challenges by raising the standards of inclusivity and engagement in grammar coaching.

gauss project, grammar, zamaraeva, (13 more...)

arXiv.org Artificial Intelligence

2406.1834

Country:

Europe > Spain > Galicia > A Coruña Province > A Coruña (0.05)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.05)
North America > United States > California > Santa Clara County > Stanford (0.05)
Europe > Spain > Galicia > A Coruña Province > Santiago de Compostela (0.05)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Add feedback