AITopics

2211.06366

Country:

Europe > Netherlands (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.31)

Locatelli, Davide, Quattoni, Ariadna

Measuring Alignment Bias in Neural Seq2Seq Semantic Parsers

arXiv.org Artificial Intelligence, Nov-11-2022

Prior to deep learning, the semantic parsing community was interested in understanding and modeling the range of possible word alignments between natural language sentences and their corresponding meaning representations. Sequence-to-sequence models changed the research landscape, suggesting that we no longer need to worry about alignments since they can be learned automatically by means of an attention mechanism. More recently, researchers have started to question this premise. In this work, we investigate whether seq2seq models can handle both simple and complex alignments. To answer this question, we augment the popular Geo semantic parsing dataset with alignment annotations and create Geo-Aligned. We then study the performance of standard seq2seq models on the examples that can be aligned monotonically versus examples that require more complex alignments. Our empirical study shows that performance is significantly better over monotonic alignments.

artificial intelligence, machine learning, natural language, (20 more...)

doi: 10.18653/v1/2022.starsem-1.17

2205.08288

Country:

Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Europe > Germany > Berlin (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
(13 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

arXiv.org Artificial Intelligence, Nov-10-2022

LERT: A Linguistically-motivated Pre-trained Language Model

Cui, Yiming, Che, Wanxiang, Wang, Shijin, Liu, Ting

Pre-trained language models (PLMs) have become representative foundation models in the natural language processing field. Most PLMs are trained with linguistic-agnostic pre-training tasks on the surface form of the text, such as the masked language model (MLM). To further empower PLMs with richer linguistic features, in this paper we propose a simple but effective way to learn linguistic features for pre-trained language models. We propose LERT, a pre-trained language model that is trained on three types of linguistic features along with the original MLM pre-training task, using a linguistically-informed pre-training (LIP) strategy. We carried out extensive experiments on ten Chinese NLU tasks, and the experimental results show that LERT brings significant improvements over various comparable baselines. Furthermore, we also conduct analytical experiments on various linguistic aspects, and the results show that the design of LERT is valid and effective. Resources are available at https://github.com/ymcui/LERT

computational linguistic, machine learning, natural language, (20 more...)

2211.05344

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Heilongjiang Province > Harbin (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(9 more...)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.68)

arXiv.org Artificial Intelligence, Nov-9-2022

Uni-Parser: Unified Semantic Parser for Question Answering on Knowledge Base and Database

Liu, Ye, Yavuz, Semih, Meng, Rui, Radev, Dragomir, Xiong, Caiming, Zhou, Yingbo

Parsing natural language questions into executable logical forms is a useful and interpretable way to perform question answering on structured data such as knowledge bases (KB) or databases (DB). However, existing approaches to semantic parsing cannot adapt to both modalities, as they suffer from the exponential growth of the logical form candidates and can hardly generalize to unseen data. In this work, we propose Uni-Parser, a unified semantic parser for question answering (QA) on both KB and DB. We introduce the primitive (relation and entity in KB, and table name, column name and cell value in DB) as an essential element in our framework. The number of primitives grows linearly with the number of retrieved relations in KB and DB, which spares us from dealing with an exponential number of logical form candidates. We leverage the generator to predict final logical forms by altering and composing top-ranked primitives with different operations (e.g., select, where, count). With the search space sufficiently pruned by a contrastive primitive ranker, the generator is empowered to capture the composition of primitives, enhancing its generalization ability. We achieve competitive results on multiple KB and DB QA benchmarks more efficiently, especially in the compositional and zero-shot settings.

artificial intelligence, logical form, natural language, (16 more...)

2211.05165

Country:

Europe > Slovakia (0.14)
North America > United States > New York (0.04)
Europe > Czechia (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Syvokon, Oleksiy, Nahorna, Olena

UA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language

We present a corpus professionally annotated for grammatical error correction (GEC) and fluency edits in the Ukrainian language. To the best of our knowledge, this is the first GEC corpus for the Ukrainian language. We collected texts with errors (20,715 sentences) from a diverse pool of contributors, including both native and non-native speakers. The data cover a wide variety of writing domains, from text chats and essays to formal writing. Professional proofreaders corrected and annotated the corpus for errors relating to fluency, grammar, punctuation, and spelling. This corpus can be used for developing and evaluating GEC systems in Ukrainian. More generally, it can be used for researching multilingual and low-resource NLP, morphologically rich languages, document-level GEC, and fluency correction. The corpus is publicly available at https://github.com/grammarly/ua-gec

artificial intelligence, data quality, natural language, (16 more...)

2103.16997

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > Belgium > Brussels-Capital Region > Brussels (0.05)
(11 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.88)
Information Technology > Data Science > Data Quality > Data Cleaning (0.65)

Mohammadi, Hassan Haji, Talebpour, Alireza, Aznaveh, Ahmad Mahmoudi, Yazdani, Samaneh

Review of coreference resolution in English and Persian

Coreference resolution (CR) is one of the most challenging areas of natural language processing. This task seeks to identify all textual references to the same real-world entity. Research in this field is divided into coreference resolution and anaphora resolution. Due to its application in textual comprehension and its utility in other tasks such as information extraction systems, document summarization, and machine translation, this field has attracted considerable interest. Consequently, it has a significant effect on the quality of these systems. This article reviews the existing corpora and evaluation metrics in this field. Then, an overview of the coreference algorithms, from rule-based methods to the latest deep learning techniques, is provided. Finally, coreference resolution and pronoun resolution systems in Persian are investigated.

artificial intelligence, machine learning, natural language, (18 more...)

2211.04428

Country:

Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
North America > United States > Massachusetts (0.04)
Indian Ocean > Arabian Gulf (0.04)
(10 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
(3 more...)

Perspectives on neural proof nets

Moot, Richard

Proof nets are a way of representing proofs as a type of (hyper)graph. Originally introduced for linear logic (Girard 1987), proof nets can be seen as a parallelised sequent calculus which removes inessential rule permutations, but also as a multi-conclusion natural deduction which simplifies many of the logical rules (notably the E, E, E rules). This makes proof nets a good choice for automated theorem proving: avoiding needless rule permutations entails an important reduction of the search space for proofs (compared to sequent calculus, and to a somewhat lesser extent when compared to natural deduction) but still allows us to compute the lambda terms corresponding to our proofs: enumerating all different proof nets for a sequent is equivalent to enumerating all its different lambda terms. Proof nets can be adapted to different types of type-logical grammars while preserving their good logical properties (Moot 2021). This makes them an important tool for testing the predictions of different grammars written in type-logical formalisms.

logic & formal reasoning, machine learning, vertex, (20 more...)

2211.04141

Country:

Europe > France > Occitanie > Hérault > Montpellier (0.04)
Asia > India > Karnataka > Bengaluru (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)

Genre:

Research Report (0.50)
Workflow (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.88)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.68)

Complex Knowledge Base Question Answering: A Survey

Lan, Yunshi, He, Gaole, Jiang, Jinhao, Jiang, Jing, Zhao, Wayne Xin, Wen, Ji-Rong

Knowledge base question answering (KBQA) aims to answer a question over a knowledge base (KB). Early studies mainly focused on answering simple questions over KBs and achieved great success. However, their performance on complex questions is still far from satisfactory. Therefore, in recent years, researchers have proposed a large number of novel methods that look into the challenges of answering complex questions. In this survey, we review recent advances in KBQA with a focus on solving complex questions, which usually contain multiple subjects, express compound relations, or involve numerical operations. In detail, we begin by introducing the complex KBQA task and relevant background. Then, we describe benchmark datasets for the complex KBQA task and introduce the construction process of these datasets. Next, we present two mainstream categories of methods for complex KBQA, namely semantic parsing-based (SP-based) methods and information retrieval-based (IR-based) methods. Specifically, we illustrate their procedures with flow designs and discuss their major differences and similarities. After that, we summarize the challenges that these two categories of methods encounter when answering complex questions, and explicate advanced solutions and techniques used in existing work. Finally, we conclude and discuss several promising directions related to complex KBQA for future research.

knowledge management, machine learning, question answering, (22 more...)

2108.06688

Country:

Asia > China > Beijing > Beijing (0.04)
Asia > Singapore (0.04)

Genre:

Overview (1.00)
Workflow (0.92)
Research Report > Promising Solution (0.34)

Technology:

Information Technology > Knowledge Management > Knowledge Engineering (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
(3 more...)

Soni, Sarvesh, Roberts, Kirk

Toward a Neural Semantic Parsing System for EHR Question Answering

Clinical semantic parsing (SP) is an important step toward identifying the exact information need (as a machine-understandable logical form) from a natural language query aimed at retrieving information from electronic health records (EHRs). Current approaches to clinical SP are largely based on traditional machine learning and require hand-building a lexicon. Recent advancements in neural SP show promise for building a robust and flexible semantic parser without much human effort. Thus, in this paper, we aim to systematically assess the performance of two such neural SP models for EHR question answering (QA). We found that the performance of these advanced neural models on two clinical SP datasets is promising given their ease of application and generalizability. Our error analysis surfaces the common types of errors made by these models and has the potential to inform future research into improving the performance of neural SP models for EHR QA.

artificial intelligence, machine learning, natural language, (20 more...)