AITopics | Sarti, Gabriele

Model Internals-based Answer Attribution for Trustworthy Retrieval-Augmented Generation

Qi, Jirui, Sarti, Gabriele, Fernández, Raquel, Bisazza, Arianna

arXiv.org Artificial Intelligence, Jul-1-2024

Ensuring the verifiability of model answers is a fundamental challenge for retrieval-augmented generation (RAG) in the question answering (QA) domain. Recently, self-citation prompting was proposed to make large language models (LLMs) generate citations to supporting documents along with their answers. However, self-citing LLMs often struggle to match the required format, refer to non-existent sources, and fail to faithfully reflect LLMs' context usage throughout the generation. In this work, we present MIRAGE (Model Internals-based RAG Explanations), a plug-and-play approach using model internals for faithful answer attribution in RAG applications. MIRAGE detects context-sensitive answer tokens and pairs them with retrieved documents contributing to their prediction via saliency methods. We evaluate our proposed approach on a multilingual extractive QA dataset, finding high agreement with human answer attribution. On open-ended QA, MIRAGE achieves citation quality and efficiency comparable to self-citation while also allowing for a finer-grained control of attribution parameters. Our qualitative evaluation highlights the faithfulness of MIRAGE's attributions and underscores the promising application of model internals for RAG answer attribution.
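
The pairing step described in this abstract (linking context-sensitive answer tokens to the retrieved documents that contributed to them) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the per-document summation of saliency scores, the threshold, and the toy numbers are all assumptions made for illustration.

```python
from collections import defaultdict

def attribute_answer_to_docs(saliency, doc_ids, sensitive_tokens, threshold=0.0):
    """Pair each flagged answer token with the document holding the most saliency.

    saliency: dict mapping answer-token index -> list of scores, one per context token
    doc_ids: document id of each context token (same length as each score list)
    sensitive_tokens: answer-token indices already flagged as context-sensitive
    threshold: hypothetical cutoff; tokens whose best document scores lower get no citation
    """
    citations = {}
    for tok in sensitive_tokens:
        per_doc = defaultdict(float)
        # Aggregation by summation is an assumption of this sketch.
        for score, doc in zip(saliency[tok], doc_ids):
            per_doc[doc] += score
        best_doc, best_score = max(per_doc.items(), key=lambda kv: kv[1])
        if best_score > threshold:
            citations[tok] = best_doc
    return citations

# Toy context of six tokens spread over three retrieved documents.
doc_ids = [1, 1, 1, 2, 2, 3]
# Made-up saliency scores for two answer tokens (indices 0 and 2).
saliency = {
    0: [0.1, 0.0, 0.1, 0.7, 0.6, 0.0],
    2: [0.5, 0.4, 0.2, 0.1, 0.0, 0.1],
}
citations = attribute_answer_to_docs(saliency, doc_ids, sensitive_tokens=[0, 2], threshold=0.5)
print(citations)
```

In this toy setup, answer token 0 is paired with document 2 and answer token 2 with document 1. Raising the threshold yields fewer citations, which is one way to picture the "finer-grained control of attribution parameters" mentioned above.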

computational linguistic, large language model, machine learning, (18 more...)

2406.13663

Country:

North America > United States (0.46)
Asia > Middle East > UAE (0.14)
Europe > Austria > Vienna (0.14)

Genre: Research Report > New Finding (0.93)

Industry: Transportation > Air (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

IT5: Text-to-text Pretraining for Italian Language Understanding and Generation

Sarti, Gabriele, Nissim, Malvina

arXiv.org Artificial Intelligence, May-20-2024

We introduce IT5, the first family of encoder-decoder transformer models pretrained specifically on Italian. We document and perform a thorough cleaning procedure for a large Italian corpus and use it to pretrain four IT5 model sizes. We then introduce the ItaGen benchmark, which includes a broad range of natural language understanding and generation tasks for Italian, and use it to evaluate the performance of IT5 models and multilingual baselines. We find monolingual IT5 models to provide the best scale-to-performance ratio across tested models, consistently outperforming their multilingual counterparts and setting a new state of the art for Italian language generation.

computational linguistic, large language model, natural language, (18 more...)

2203.03759

Country:

Asia > Middle East > Israel (0.14)
North America > United States > Louisiana (0.14)
Europe > Italy > Molise (0.14)
(2 more...)

Genre:

Research Report (0.64)
Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.46)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.46)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.35)

A Primer on the Inner Workings of Transformer-based Language Models

Ferrando, Javier, Sarti, Gabriele, Bisazza, Arianna, Costa-jussà, Marta R.

arXiv.org Artificial Intelligence, May-1-2024

The rapid progress of research aimed at interpreting the inner workings of advanced language models has highlighted a need for contextualizing the insights gained from years of work in this area. This primer provides a concise technical introduction to the current techniques used to interpret the inner workings of Transformer-based language models, focusing on the generative decoder-only architecture. We conclude by presenting a comprehensive overview of the known internal mechanisms implemented by these models, uncovering connections across popular approaches and active research directions in this area.

computational linguistic, large language model, machine learning, (17 more...)

2405.00208

Country:

Europe (1.00)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre:

Overview (1.00)
Research Report > New Finding (0.45)

Industry:

Health & Medicine (0.67)
Government > Regional Government > North America Government > United States Government (0.45)
Leisure & Entertainment (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

DecoderLens: Layerwise Interpretation of Encoder-Decoder Transformers

Langedijk, Anna, Mohebbi, Hosein, Sarti, Gabriele, Zuidema, Willem, Jumelet, Jaap

arXiv.org Artificial Intelligence, Oct-5-2023

In recent years, many interpretability methods have been proposed to help interpret the internal states of Transformer models at different levels of precision and complexity. Here, to analyze encoder-decoder Transformers, we propose a simple new method: DecoderLens. Inspired by the LogitLens (for decoder-only Transformers), this method involves allowing the decoder to cross-attend to representations of intermediate encoder layers instead of using the final encoder output, as is normally done in encoder-decoder models. The method thus maps previously uninterpretable vector representations to human-interpretable sequences of words or symbols. We report results from the DecoderLens applied to models trained on question answering, logical reasoning, speech recognition and machine translation. The DecoderLens reveals several specific subtasks that are solved at low or intermediate layers, shedding new light on the information flow inside the encoder component of this important class of models.
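
The core idea in this abstract, decoding from intermediate encoder layers instead of only the final one, can be sketched in a few lines. The encoder layers and the decoder below are hypothetical toy functions standing in for real Transformer components; they only show the control flow, not how any actual model behaves.

```python
def decoder_lens(encoder_layers, decode, inputs):
    """Run the decoder on the output of every encoder layer, not only the last."""
    hidden = inputs
    per_layer_outputs = []
    for layer in encoder_layers:
        hidden = layer(hidden)
        # A standard encoder-decoder model would call decode() only once,
        # after the final layer; here it is called at every depth.
        per_layer_outputs.append(decode(hidden))
    return per_layer_outputs

# Toy stand-ins (assumptions): "hidden states" are lists of integers.
VOCAB = ["a", "b", "c", "d", "e", "f", "g"]

def double(h):
    return [x * 2 for x in h]

def add_one(h):
    return [x + 1 for x in h]

def mod_five(h):
    return [x % 5 for x in h]

def toy_decode(h):
    # Maps each integer to a vocabulary item, producing a readable sequence.
    return " ".join(VOCAB[x % len(VOCAB)] for x in h)

outputs = decoder_lens([double, add_one, mod_five], toy_decode, [1, 2, 3])
for depth, text in enumerate(outputs, start=1):
    print(f"after encoder layer {depth}: {text}")
```

The last entry of `outputs` corresponds to ordinary decoding from the final encoder output. The earlier entries are the human-readable views of intermediate layers that the method adds.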

computational linguistic, large language model, machine learning, (20 more...)

2310.03686

Country:

Europe (1.00)
Asia > Middle East > Republic of Türkiye (1.00)
North America > Canada (0.68)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.82)

Industry: Government > Regional Government > Asia Government > Middle East Government > Republic of Türkiye Government (0.95)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.67)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.66)

Quantifying the Plausibility of Context Reliance in Neural Machine Translation

Sarti, Gabriele, Chrupała, Grzegorz, Nissim, Malvina, Bisazza, Arianna

arXiv.org Artificial Intelligence, Oct-2-2023

Establishing whether language models can use contextual information in a human-plausible way is important to ensure their safe adoption in real-world settings. However, the questions of when and which parts of the context affect model generations are typically tackled separately, and current plausibility evaluations are practically limited to a handful of artificial benchmarks. To address this, we introduce Plausibility Evaluation of Context Reliance (PECoRe), an end-to-end interpretability framework designed to quantify context usage in language models' generations. Our approach leverages model internals to (i) contrastively identify context-sensitive target tokens in generated texts and (ii) link them to contextual cues justifying their prediction. We use PECoRe to quantify the plausibility of context-aware machine translation models, comparing model rationales with human annotations across several discourse-level phenomena. Finally, we apply our method to unannotated generations to identify context-mediated predictions and highlight instances of (im)plausible context usage in model translations.
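
Step (i) above, contrastively identifying context-sensitive target tokens, can be pictured as comparing the model's next-token distribution with and without context at each generated position. The sketch below assumes KL divergence as the contrastive metric and a fixed threshold; the abstract does not specify either, and the probability values are invented for illustration.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists of probabilities."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)

def context_sensitive_tokens(with_ctx, without_ctx, threshold):
    """Return positions whose distribution shifts strongly when context is added.

    with_ctx / without_ctx: one probability distribution per generated position,
    obtained with and without the context in the model input.
    threshold: hypothetical cutoff on the divergence.
    """
    return [
        i
        for i, (p, q) in enumerate(zip(with_ctx, without_ctx))
        if kl_divergence(p, q) > threshold
    ]

# Toy distributions over a three-word vocabulary for three generated positions.
with_ctx = [[0.90, 0.05, 0.05], [0.10, 0.80, 0.10], [0.34, 0.33, 0.33]]
without_ctx = [[0.88, 0.06, 0.06], [0.70, 0.20, 0.10], [0.33, 0.34, 0.33]]

flagged = context_sensitive_tokens(with_ctx, without_ctx, threshold=0.1)
print(flagged)
```

Only the middle position is flagged here, since it is the only one where adding context changes the prediction substantially. Step (ii) would then attribute that token back to specific context tokens.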

artificial intelligence, computational linguistic, natural language, (13 more...)

2310.01188

Country:

Europe (1.00)
Asia (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Let the Models Respond: Interpreting Language Model Detoxification Through the Lens of Prompt Dependence

Scalena, Daniel, Sarti, Gabriele, Nissim, Malvina, Fersini, Elisabetta

arXiv.org Artificial Intelligence, Sep-1-2023

Due to language models' propensity to generate toxic or hateful responses, several techniques were developed to align model generations with users' preferences. Despite the effectiveness of such methods in improving the safety of model interactions, their impact on models' internal processes is still poorly understood. In this work, we apply popular detoxification approaches to several language models and quantify their impact on the resulting models' prompt dependence using feature attribution methods. We evaluate the effectiveness of counter-narrative fine-tuning and compare it with reinforcement learning-driven detoxification, observing differences in prompt reliance between the two methods despite their similar detoxification performances.

artificial intelligence, interpreting language model detoxification, prompt dependence, (1 more...)

2309.00751

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.80)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.53)

Inseq: An Interpretability Toolkit for Sequence Generation Models

Sarti, Gabriele, Feldhus, Nils, Sickert, Ludwig, van der Wal, Oskar, Nissim, Malvina, Bisazza, Arianna

arXiv.org Artificial Intelligence, May-27-2023

Past work in natural language processing interpretability focused mainly on popular classification tasks while largely overlooking generation settings, partly due to a lack of dedicated tools. In this work, we introduce Inseq, a Python library to democratize access to interpretability analyses of sequence generation models. Inseq enables intuitive and optimized extraction of models' internal information and feature importance scores for popular decoder-only and encoder-decoder Transformer architectures. We showcase its potential by adopting it to highlight gender biases in machine translation models and locate factual knowledge inside GPT-2. Thanks to its extensible interface supporting cutting-edge techniques such as contrastive feature attribution, Inseq can drive future advances in explainable natural language generation, centralizing good practices and enabling fair and reproducible model evaluations.

large language model, machine learning, natural language, (6 more...)

doi: 10.18653/v1/2023.acl-demo.40

2302.13942

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.53)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)

RAMP: Retrieval and Attribute-Marking Enhanced Prompting for Attribute-Controlled Translation

Sarti, Gabriele, Htut, Phu Mon, Niu, Xing, Hsu, Benjamin, Currey, Anna, Dinu, Georgiana, Nadejde, Maria

arXiv.org Artificial Intelligence, May-26-2023

Attribute-controlled translation (ACT) is a subtask of machine translation that involves controlling stylistic or linguistic attributes (like formality and gender) of translation outputs. While ACT has garnered attention in recent years due to its usefulness in real-world applications, progress in the task is currently limited by dataset availability, since most prior approaches rely on supervised methods. To address this limitation, we propose Retrieval and Attribute-Marking enhanced Prompting (RAMP), which leverages large multilingual language models to perform ACT in few-shot and zero-shot settings. RAMP improves generation accuracy over the standard prompting approach by (1) incorporating a semantic similarity retrieval component for selecting similar in-context examples, and (2) marking in-context examples with attribute annotations. Our comprehensive experiments show that RAMP is a viable approach in both zero-shot and few-shot settings.
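
The two components listed above, retrieving similar in-context examples and marking them with attribute annotations, can be sketched as a small prompt builder. Everything here is an illustrative assumption: word-overlap (Jaccard) similarity is swapped in for the semantic similarity retrieval the abstract mentions, and the `<a>...</a>` markers, prompt layout, and example sentences are hypothetical.

```python
import re

def jaccard(a, b):
    """Word-overlap similarity; a simple stand-in for semantic similarity retrieval."""
    sa = set(re.findall(r"\w+", a.lower()))
    sb = set(re.findall(r"\w+", b.lower()))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0

def build_prompt(source, pool, attribute, k=2):
    """Select the k most similar examples and format them with attribute marks."""
    ranked = sorted(pool, key=lambda ex: jaccard(source, ex["src"]), reverse=True)[:k]
    lines = []
    for ex in ranked:
        lines.append(f"Source: {ex['src']}")
        # tgt_marked wraps the attribute-bearing words in <a>...</a> (hypothetical format).
        lines.append(f"{attribute} translation: {ex['tgt_marked']}")
    lines.append(f"Source: {source}")
    lines.append(f"{attribute} translation:")
    return "\n".join(lines)

# Toy English-to-German pool with formality-bearing words marked.
pool = [
    {"src": "The weather is nice today.", "tgt_marked": "Das Wetter ist heute schön."},
    {"src": "Where do you live?", "tgt_marked": "Wo wohnen <a>Sie</a>?"},
    {"src": "How are you?", "tgt_marked": "Wie geht es <a>Ihnen</a>?"},
]

prompt = build_prompt("How old are you?", pool, attribute="Formal", k=2)
print(prompt)
```

The resulting prompt lists the two closest examples first, most similar on top, and ends with the new source sentence and an empty translation slot for the language model to complete.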

artificial intelligence, large language model, natural language, (4 more...)

doi: 10.18653/v1/2023.acl-short.126

2305.17131

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.53)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.44)

Are Character-level Translations Worth the Wait? Comparing Character- and Subword-level Models for Machine Translation

Edman, Lukas, Sarti, Gabriele, Toral, Antonio, van Noord, Gertjan, Bisazza, Arianna

arXiv.org Artificial Intelligence, May-11-2023

Pretrained character-level language models were recently shown to be competitive with popular subword models across a range of NLP tasks. However, there has been little research on their effectiveness for neural machine translation (NMT). This work performs an extensive comparison across multiple languages and experimental conditions of state-of-the-art character- and subword-level pretrained models (ByT5 and mT5, respectively) on NMT, showing the effectiveness of character-level modeling in translation, particularly in cases where training data is limited. In our analysis, we show how character models' performance gains are reflected in better translations of orthographically similar words and rare words. While evaluating the importance of source texts in driving model predictions, we highlight ByT5 word-level patterns suggesting an ability to modulate word- and character-level information during translation, providing insights into a potential weakness of character-level modeling. We conclude by assessing the efficiency tradeoff of character models, suggesting their usage in non-time-critical scenarios to boost translation quality.

artificial intelligence, machine translation, natural language, (16 more...)

2302.14220

Country:

Europe (1.00)
Asia (0.68)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

DivEMT: Neural Machine Translation Post-Editing Effort Across Typologically Diverse Languages

Sarti, Gabriele, Bisazza, Arianna, Arenas, Ana Guerberof, Toral, Antonio

arXiv.org Artificial Intelligence, Oct-18-2022

We introduce DivEMT, the first publicly available post-editing study of Neural Machine Translation (NMT) over a typologically diverse set of target languages. Using a strictly controlled setup, 18 professional translators were instructed to translate or post-edit the same set of English documents into Arabic, Dutch, Italian, Turkish, Ukrainian, and Vietnamese. During the process, their edits, keystrokes, editing times and pauses were recorded, enabling an in-depth, cross-lingual evaluation of NMT quality and post-editing effectiveness. Using this new dataset, we assess the impact of two state-of-the-art NMT systems, Google Translate and the multilingual mBART-50 model, on translation productivity. We find that post-editing is consistently faster than translation from scratch. However, the magnitude of productivity gains varies widely across systems and languages, highlighting major disparities in post-editing effectiveness for languages at different degrees of typological relatedness to English, even when controlling for system architecture and training data size. We publicly release the complete dataset, including all collected behavioral data, to foster new research on the translation capabilities of NMT systems for typologically diverse languages.

artificial intelligence, machine translation, natural language, (14 more...)

doi: 10.18653/v1/2022.emnlp-main.532

2205.12215

Country:

North America > United States (1.00)
Europe (1.00)
Asia (1.00)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
