AITopics | Fierro, Constanza

Collaborating Authors

Fierro, Constanza

How Do Multilingual Models Remember? Investigating Multilingual Factual Recall Mechanisms

Fierro, Constanza, Foroutan, Negar, Elliott, Desmond, Søgaard, Anders

arXiv.org Artificial Intelligence, Oct-18-2024

Large Language Models (LLMs) store and retrieve vast amounts of factual knowledge acquired during pre-training. Prior research has localized and identified mechanisms behind knowledge recall; however, it has primarily focused on English monolingual models. The question of how these processes generalize to other languages and multilingual LLMs remains unexplored. In this paper, we address this gap by conducting a comprehensive analysis of two highly multilingual LLMs. We assess the extent to which previously identified components and mechanisms of factual recall in English apply to a multilingual context. Then, we examine when language plays a role in the recall process, uncovering evidence of language-independent and language-dependent mechanisms.

large language model, machine learning, subj, (18 more...)

2410.14387

Country:

Europe (1.00)
Asia (1.00)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Defining Knowledge: Bridging Epistemology and Large Language Models

Fierro, Constanza, Dhar, Ruchira, Stamatiou, Filippos, Garneau, Nicolas, Søgaard, Anders

arXiv.org Artificial Intelligence, Oct-3-2024

Knowledge claims are abundant in the literature on large language models (LLMs); but can we say that GPT-4 truly "knows" the Earth is round? To address this question, we review standard definitions of knowledge in epistemology and we formalize interpretations applicable to LLMs. In doing so, we identify inconsistencies and gaps in how current NLP research conceptualizes knowledge with respect to epistemological frameworks. Additionally, we conduct a survey of 100 professional philosophers and computer scientists to compare their preferences in knowledge definitions and their views on whether LLMs can really be said to know. Finally, we suggest evaluation protocols for testing knowledge in accordance with the most relevant definitions.

large language model, machine learning, natural language, (20 more...)

2410.02499

Country:

Europe > Germany (0.15)
North America > Canada (0.14)
Asia > China (0.14)
(4 more...)

Genre:

Overview (0.69)
Questionnaire & Opinion Survey (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Does Instruction Tuning Make LLMs More Consistent?

Fierro, Constanza, Li, Jiaang, Søgaard, Anders

arXiv.org Artificial Intelligence, Apr-30-2024

The purpose of instruction tuning is to enable zero-shot performance, but instruction tuning has also been shown to improve chain-of-thought reasoning and value alignment (Si et al., 2023). Here we consider the impact on consistency, i.e., the sensitivity of language models to small perturbations in the input. We compare 10 instruction-tuned LLaMA models to the original LLaMA-7b model and show that they become more consistent almost across the board, both in terms of their representations and their predictions in zero-shot and downstream tasks. We explain these improvements through mechanistic analyses of factual recall.

consistency, large language model, natural language, (18 more...)

2404.15206

Country:

Europe (0.93)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

MuLan: A Study of Fact Mutability in Language Models

Fierro, Constanza, Garneau, Nicolas, Bugliarello, Emanuele, Kementchedjhieva, Yova, Søgaard, Anders

arXiv.org Artificial Intelligence, Apr-3-2024

Facts are subject to contingencies and can be true or false in different circumstances. One such contingency is time, wherein some facts mutate over a given period, e.g., the president of a country or the winner of a championship. Trustworthy language models ideally identify mutable facts as such and process them accordingly. We create MuLan, a benchmark for evaluating the ability of English language models to anticipate time-contingency, covering both 1:1 and 1:N relations. We hypothesize that mutable facts are encoded differently than immutable ones, hence being easier to update. In a detailed evaluation of six popular large language models, we consistently find differences in the LLMs' confidence, representations, and update behavior, depending on the mutability of a fact. Our findings should inform future work on the injection of and induction of time-contingent knowledge to/from LLMs.

large language model, machine learning, relation, (19 more...)

2404.03036

Country:

Europe > Germany (0.14)
Asia > China (0.14)
North America > United States > Texas (0.14)
North America > Canada (0.14)

Genre: Research Report > New Finding (0.48)

Industry: Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

μPLAN: Summarizing using a Content Plan as Cross-Lingual Bridge

Huot, Fantine, Maynez, Joshua, Alberti, Chris, Amplayo, Reinald Kim, Agrawal, Priyanka, Fierro, Constanza, Narayan, Shashi, Lapata, Mirella

arXiv.org Artificial Intelligence, May-23-2023

Cross-lingual summarization consists of generating a summary in one language given an input document in a different language, allowing for the dissemination of relevant content across speakers of other languages. However, this task remains challenging, mainly because of the need for cross-lingual datasets and the compounded difficulty of summarizing and translating. This work presents μPLAN, an approach to cross-lingual summarization that uses an intermediate planning step as a cross-lingual bridge. We formulate the plan as a sequence of entities that captures the conceptualization of the summary, i.e. identifying the salient content and expressing in which order to present the information, separate from the surface form. Using a multilingual knowledge base, we align the entities to their canonical designation across languages. μPLAN models first learn to generate the plan and then continue generating the summary conditioned on the plan and the input. We evaluate our methodology on the XWikis dataset on cross-lingual pairs across four languages and demonstrate that this planning objective achieves state-of-the-art performance in terms of ROUGE and faithfulness scores. Moreover, this planning approach improves the zero-shot transfer to new cross-lingual language pairs compared to non-planning baselines.

computational linguistic, machine learning, natural language, (19 more...)

2305.14205

Country:

Europe (1.00)
Asia (1.00)
North America > United States > Minnesota (0.29)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)

Factual Consistency of Multilingual Pretrained Language Models

Fierro, Constanza, Søgaard, Anders

arXiv.org Artificial Intelligence, Mar-22-2022

Pretrained language models can be queried for factual knowledge, with potential applications in knowledge base acquisition and tasks that require inference. However, for that, we need to know how reliable this knowledge is, and recent work has shown that monolingual English language models lack consistency when predicting factual knowledge, that is, they fill in the blank differently for paraphrases describing the same fact. In this paper, we extend the analysis of consistency to a multilingual setting. We introduce a resource, mParaRel, and investigate (i) whether multilingual language models such as mBERT and XLM-R are more consistent than their monolingual counterparts; and (ii) whether such models are equally consistent across languages. We find that mBERT is as inconsistent as English BERT in English paraphrases, but that both mBERT and XLM-R exhibit a high degree of inconsistency in English and even more so for all the other 45 languages.
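As an illustration only, the notion of consistency described in this abstract (giving the same answer to different paraphrases of one fact) can be scored as the fraction of paraphrase pairs whose predictions agree. This is a minimal sketch under that assumption, not necessarily the metric the paper uses; the function name `pairwise_consistency` and the example predictions are hypothetical.

```python
from itertools import combinations


def pairwise_consistency(predictions):
    """Return the fraction of unordered prediction pairs that are identical.

    `predictions` holds one model answer per paraphrase of the same fact.
    With fewer than two paraphrases there are no pairs to disagree, so the
    score is defined here as 1.0 (an assumption of this sketch).
    """
    pairs = list(combinations(predictions, 2))
    if not pairs:
        return 1.0
    agreeing = sum(1 for first, second in pairs if first == second)
    return agreeing / len(pairs)


# Hypothetical answers to three paraphrases of one fill-in-the-blank query.
example_predictions = ["Paris", "Paris", "Lyon"]
example_score = pairwise_consistency(example_predictions)  # 1 of 3 pairs agree
```

Averaging this score over many facts, separately for each language, would give one way to compare languages under the same assumption.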

artificial intelligence, computational linguistic, natural language, (15 more...)

doi: 10.18653/v1/2022.findings-acl.240

2203.11552

Country:

Europe (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.89)