AITopics

2212.068

Country:

North America > United States > Mississippi (0.05)
North America > United States > Texas (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(17 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Vasylenko, Pavlo, Cabot, Pere-Lluís Huguet, Lorenzo, Abelardo Carlos Martínez, Navigli, Roberto

Incorporating Graph Information in Transformer-based AMR Parsing

arXiv.org Artificial IntelligenceJun-23-2023

Abstract Meaning Representation (AMR) is a Semantic Parsing formalism that aims at providing a semantic graph abstraction representing a given text. Current approaches are based on autoregressive language models such as BART or T5, fine-tuned through Teacher Forcing to obtain a linearized version of the AMR graph from a sentence. In this paper, we present LeakDistill, a model and method that explores a modification to the Transformer architecture, using structural adapters to explicitly incorporate graph information into the learned representations and improve AMR parsing performance. Our experiments show how, by employing word-to-node alignment to embed graph structural information into the encoder at training time, we can obtain state-of-the-art AMR parsing through self-knowledge distillation, even without the use of additional data. We release the code at \url{http://www.github.com/sapienzanlp/LeakDistill}.

computational linguistic, machine learning, natural language, (18 more...)

2306.13467

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Europe > Bulgaria > Sofia City Province > Sofia (0.04)
(19 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.84)

Yueh, Chieling, Kanoulas, Evangelos, Martins, Bruno, Thorne, Camilo, Akhondi, Saber

Stress Testing BERT Anaphora Resolution Models for Reaction Extraction in Chemical Patents

arXiv.org Artificial IntelligenceJun-23-2023

The high volume of published chemical patents and the importance of a timely acquisition of their information gives rise to automating information extraction from chemical patents. Anaphora resolution is an important component of comprehensive information extraction, and is critical for extracting reactions. In chemical patents, there are five anaphoric relations of interest: co-reference, transformed, reaction associated, work up, and contained. Our goal is to investigate how the performance of anaphora resolution models for reaction texts in chemical patents differs in a noise-free and noisy environment and to what extent we can improve the robustness against noise of the model.

data mining, machine learning, natural language, (16 more...)

2306.13379

Country:

Europe > Netherlands > North Holland > Amsterdam (0.05)
North America > United States > New York > New York County > New York City (0.04)
Europe > Germany > Hesse > Darmstadt Region > Frankfurt (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.83)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.69)
Information Technology > Data Science > Data Mining > Text Mining (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)

AlShuhayeb, Huda, Minaei-Bidgoli, Behrouz, Shenassa, Mohammad E., Hossayni, Sayyed-Ali

Noor-Ghateh: A Benchmark Dataset for Evaluating Arabic Word Segmenters in Hadith Domain

arXiv.org Artificial IntelligenceJun-22-2023

There are many complex and rich morphological subtleties in the Arabic language, which are very useful when analyzing traditional Arabic texts, especially in the historical and religious contexts, and help in understanding the meaning of the texts. Vocabulary separation means separating the word into different parts such as root and affix. In the morphological datasets, the variety of labels and the number of data samples helps to evaluate the morphological methods. In this paper, we present a benchmark data set for evaluating the methods of separating Arabic words which include about 223,690 words from the book of Sharia alIslam, which have been labeled by experts. In terms of the volume and variety of words, this dataset is superior to other existing data sets, and as far as we know, there are no Arabic Hadith Domain texts. To evaluate the dataset, we applied different methods such as Farasa, Camel, Madamira, and ALP to the dataset and we reported the annotation quality through four evaluation methods.

artificial intelligence, machine learning, natural language, (19 more...)

2307.0963

Country:

Europe > Czechia > Prague (0.05)
Asia > Middle East > Iran > Tehran Province > Tehran (0.05)
North America > United States > Pennsylvania (0.04)
(6 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.69)

Jumelet, Jaap, Zuidema, Willem

Feature Interactions Reveal Linguistic Structure in Language Models

arXiv.org Artificial IntelligenceJun-21-2023

We study feature interactions in the context of feature attribution methods for post-hoc interpretability. In interpretability research, getting to grips with feature interactions is increasingly recognised as an important challenge, because interacting features are key to the success of neural networks. Feature interactions allow a model to build up hierarchical representations for its input, and might provide an ideal starting point for the investigation into linguistic structure in language models. However, uncovering the exact role that these interactions play is also difficult, and a diverse range of interaction attribution methods has been proposed. In this paper, we focus on the question which of these methods most faithfully reflects the inner workings of the target models. We work out a grey box methodology, in which we train models to perfection on a formal language classification task, using PCFGs. We show that under specific configurations, some methods are indeed able to uncover the grammatical rules acquired by a model. Based on these findings we extend our evaluation to a case study on language models, providing novel insights into the linguistic structure that these models have acquired.

artificial intelligence, machine learning, natural language, (20 more...)

2306.12181

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
(15 more...)

Genre: Research Report > New Finding (0.47)

Industry: Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.88)

arXiv.org Artificial IntelligenceJun-21-2023

A Semi-Autoregressive Graph Generative Model for Dependency Graph Parsing

Ma, Ye, Sun, Mingming, Li, Ping

Recent years have witnessed the impressive progress in Neural Dependency Parsing. According to the different factorization approaches to the graph joint probabilities, existing parsers can be roughly divided into autoregressive and non-autoregressive patterns. The former means that the graph should be factorized into multiple sequentially dependent components, then it can be built up component by component. And the latter assumes these components to be independent so that they can be outputted in a one-shot manner. However, when treating the directed edge as an explicit dependency relationship, we discover that there is a mixture of independent and interdependent components in the dependency graph, signifying that both aforementioned models fail to precisely capture the explicit dependencies among nodes and edges. Based on this property, we design a Semi-Autoregressive Dependency Parser to generate dependency graphs via adding node groups and edge groups autoregressively while pouring out all group elements in parallel. The model gains a trade-off between non-autoregression and autoregression, which respectively suffer from the lack of target inter-dependencies and the uncertainty of graph generation orders. The experiments show the proposed parser outperforms strong baselines on Enhanced Universal Dependencies of multiple languages, especially achieving $4\%$ average promotion at graph-level accuracy. Also, the performances of model variations show the importance of specific parts.

artificial intelligence, natural language, node, (19 more...)

2306.12018

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Italy > Tuscany > Florence (0.04)
Asia > China > Hong Kong (0.04)
(14 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Belief Revision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Model-Based Reasoning (0.82)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.76)

Tasnim, Mashrura, Ehghaghi, Malikeh, Diep, Brian, Novikova, Jekaterina

DEPAC: a Corpus for Depression and Anxiety Detection from Speech

Mental distress like depression and anxiety contribute to the largest proportion of the global burden of diseases. Automated diagnosis systems of such disorders, empowered by recent innovations in Artificial Intelligence, can pave the way to reduce the sufferings of the affected individuals. Development of such systems requires information-rich and balanced corpora. In this work, we introduce a novel mental distress analysis audio dataset DEPAC, labeled based on established thresholds on depression and anxiety standard screening tools. This large dataset comprises multiple speech tasks per individual, as well as relevant demographic information. Alongside, we present a feature set consisting of hand-curated acoustic and linguistic features, which were found effective in identifying signs of mental illnesses in human speech. Finally, we justify the quality and effectiveness of our proposed audio corpus and feature set in predicting depression severity by comparing the performance of baseline machine learning models built on this dataset with baseline models trained on other well-known depression corpora.

artificial intelligence, machine learning, natural language, (18 more...)

2306.12443

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada > Ontario > Toronto (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Consumer Health (0.93)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.46)

Khakhar, Adam, Mell, Stephen, Bastani, Osbert

PAC Prediction Sets for Large Language Models of Code

Prediction sets have recently been shown to be a promising strategy for quantifying the uncertainty of deep neural networks in a way that provides theoretical guarantees. However, existing techniques have largely targeted settings where the space of labels is simple, so prediction sets can be arbitrary subsets of labels. For structured prediction problems where the space of labels is exponential in size, even prediction sets containing a small fraction of all labels can be exponentially large. In the context of code generation, we propose a solution that considers a restricted set of prediction sets that can compactly be represented as partial programs, which are programs with portions replaced with holes. Given a trained code generation model, our algorithm leverages a programming language's abstract syntax tree to generate a set of programs such that the correct program is in the set with high-confidence. Valuable applications of our algorithm include a Codex-style code generator with holes in uncertain parts of the generated code, which provides a partial program with theoretical guarantees. We evaluate our approach on PICARD (a T5 model for SQL semantic parsing) and Codex (a GPT model for over a dozen programming languages, including Python), demonstrating that our approach generates compact PAC prediction sets. This is the first research contribution that generates PAC prediction sets for generative code models.

large language model, machine learning, natural language, (17 more...)

2302.08703

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Pennsylvania (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Wang, Zi, Hershcovich, Daniel

On Evaluating Multilingual Compositional Generalization with Translated Datasets

Compositional generalization allows efficient learning and human-like inductive biases. Since most research investigating compositional generalization in NLP is done on English, important questions remain underexplored. Do the necessary compositional generalization abilities differ across languages? Can models compositionally generalize cross-lingually? As a first step to answering these questions, recent work used neural machine translation to translate datasets for evaluating compositional generalization in semantic parsing. However, we show that this entails critical semantic distortion. To address this limitation, we craft a faithful rule-based translation of the MCWQ dataset from English to Chinese and Japanese. Even with the resulting robust benchmark, which we call MCWQ-R, we show that the distribution of compositions still suffers due to linguistic divergences, and that multilingual models still struggle with cross-lingual compositional generalization. Our dataset and methodology will be useful resources for the study of cross-lingual compositional generalization in other tasks.

artificial intelligence, computational linguistic, natural language, (16 more...)

2306.1142

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Asia > Middle East > Republic of Türkiye > Batman Province > Batman (0.04)
(15 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Compositional Exemplars for In-context Learning

Ye, Jiacheng, Wu, Zhiyong, Feng, Jiangtao, Yu, Tao, Kong, Lingpeng

Large pretrained language models (LMs) have shown impressive In-Context Learning (ICL) ability, where the model learns to do an unseen task via a prompt consisting of input-output examples as the demonstration, without any parameter updates. The performance of ICL is highly dominated by the quality of the selected in-context examples. However, previous selection methods are mostly based on simple heuristics, leading to sub-optimal performance. In this work, we formulate in-context example selection as a subset selection problem. We propose CEIL (Compositional Exemplars for In-context Learning), which is instantiated by Determinantal Point Processes (DPPs) to model the interaction between the given input and in-context examples, and optimized through a carefully-designed contrastive learning objective to obtain preference from LMs. We validate CEIL on 12 classification and generation datasets from 7 distinct NLP tasks, including sentiment analysis, paraphrase detection, natural language inference, commonsense reasoning, open-domain question answering, code generation, and semantic parsing. Extensive experiments demonstrate not only the state-of-the-art performance but also the transferability and compositionality of CEIL, shedding new light on effective and efficient in-context learning. Our code is released at https://github.com/HKUNLP/icl-ceil.

large language model, machine learning, natural language, (17 more...)

2302.05698

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Washington > King County > Seattle (0.14)
North America > United States > Montana (0.04)
(7 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.70)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.69)