
Collaborating Authors

 Marconato, Emanuele


All or None: Identifiable Linear Properties of Next-token Predictors in Language Modeling

arXiv.org Machine Learning

In natural language processing, it is well-established that linear relationships between high-dimensional, real-valued vector representations of textual inputs reflect semantic and syntactic patterns. This was motivated in seminal works [4, 5, 6, 7, 8] and extensively validated in word embedding models [9, 10, 11] as well as modern large language models trained for next-token prediction [2, 12, 13, 14, 15, 16, 17, 18, 19]. This ubiquity is puzzling, as different internal representations can produce identical next-token distributions, resulting in distribution-equivalent but internally distinct models. This raises a key question: Are the observed linear properties shared across all models with the same next-token distribution? Our main result is a mathematical proof that, under suitable conditions, certain linear properties hold for either all or none of the equivalent models generating a given next-token distribution. We demonstrate this through three main contributions. The first main contribution (Section 3) is an identifiability result characterizing distribution-equivalent next-token predictors. Our result is a generalization of the main theorems by Roeder et al. [3] and Khemakhem et al. [20], relaxing the assumptions of diversity and equal representation dimensionality. This result is of independent interest for research on identifiable representation learning since our analysis is applicable to several discriminative models beyond next-token prediction [3].
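To make the notion of distribution equivalence concrete, here is a minimal numerical sketch (not taken from the paper; the dimensions, variable names, and the map A are all illustrative): reparameterizing a next-token predictor with an invertible linear map applied to the context representation and its inverse applied to the unembedding matrix leaves the softmax over next tokens unchanged, so internally distinct models can induce the same next-token distribution.

    import numpy as np

    rng = np.random.default_rng(0)
    d, vocab = 8, 50                      # representation dimension, vocabulary size

    h = rng.normal(size=d)                # context representation f(x)
    U = rng.normal(size=(vocab, d))       # unembedding matrix

    # Reparameterize the internals with an (assumed) invertible linear map A.
    A = rng.normal(size=(d, d)) + 3.0 * np.eye(d)
    h_alt = A @ h                         # internally different representation
    U_alt = U @ np.linalg.inv(A)          # compensating unembedding

    def next_token_dist(unembed, rep):
        logits = unembed @ rep
        p = np.exp(logits - logits.max())
        return p / p.sum()

    print(np.allclose(next_token_dist(U, h), next_token_dist(U_alt, h_alt)))  # True

The two parameterizations assign identical probabilities to every next token, yet their representations differ by an arbitrary invertible linear map, which is precisely the kind of ambiguity the identifiability question above is about.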


Neuro-Symbolic Continual Learning: Knowledge, Reasoning Shortcuts and Concept Rehearsal

arXiv.org Artificial Intelligence

We introduce Neuro-Symbolic Continual Learning, where a model has to solve a sequence of neuro-symbolic tasks, that is, it has to map sub-symbolic inputs to high-level concepts and compute predictions by reasoning consistently with prior knowledge. Our key observation is that neuro-symbolic tasks, although different, often share concepts whose semantics remains stable over time. Traditional approaches fall short: existing continual strategies ignore knowledge altogether, while stock neuro-symbolic architectures suffer from catastrophic forgetting. We show that leveraging prior knowledge by combining neuro-symbolic architectures with continual strategies does help avoid catastrophic forgetting, but also that doing so can yield models affected by reasoning shortcuts.

We initiate the study of Neuro-Symbolic Continual Learning (NeSy-CL), in which the goal is to solve a sequence of neuro-symbolic tasks. As is common in neuro-symbolic (NeSy) prediction (Manhaeve et al., 2018; Xu et al., 2018; Giunchiglia & Lukasiewicz, 2020; Hoernle et al., 2022; Ahmed et al., 2022a), the machine is provided prior knowledge relating one or more target labels to symbolic, high-level concepts extracted from sub-symbolic data, and has to compute a prediction by reasoning over said concepts. The central challenge of NeSy-CL is that the data distribution and the knowledge may vary across tasks. E.g., in medical diagnosis, knowledge may encode known relationships between possible symptoms and conditions, while different tasks are characterized by different distributions of X-ray scans, symptoms and conditions. The goal, as in continual learning (CL) (Parisi et al., 2019), is to obtain a model that attains high accuracy on new tasks without forgetting what it has already learned under a limited storage budget.
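For readers unfamiliar with the NeSy prediction setup described above, the following toy sketch (all names and the two-concept rule are hypothetical choices for illustration) shows how a label probability can be obtained by reasoning over concepts extracted by a neural network, here via exact enumeration of the concept configurations that satisfy the knowledge.

    import itertools
    import torch

    # Toy prior knowledge: the label holds iff at least one of two binary concepts holds.
    def knowledge(c1, c2):
        return c1 or c2

    # Hypothetical neural concept extractor: sub-symbolic input -> concept probabilities.
    concept_net = torch.nn.Sequential(
        torch.nn.Linear(16, 32), torch.nn.ReLU(),
        torch.nn.Linear(32, 2), torch.nn.Sigmoid(),
    )

    def predict_label(x):
        """P(y=1 | x): sum the probability of all concept worlds satisfying the knowledge."""
        p = concept_net(x)                                # P(c1=1), P(c2=1)
        prob_y1 = 0.0
        for c1, c2 in itertools.product([0, 1], repeat=2):
            w = (p[0] if c1 else 1 - p[0]) * (p[1] if c2 else 1 - p[1])
            if knowledge(c1, c2):
                prob_y1 = prob_y1 + w
        return prob_y1

    print(float(predict_label(torch.randn(16))))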


Not All Neuro-Symbolic Concepts Are Created Equal: Analysis and Mitigation of Reasoning Shortcuts

arXiv.org Machine Learning

Neuro-Symbolic (NeSy) predictive models hold the promise of improved compliance with given constraints, systematic generalization, and interpretability, as they allow inferring labels that are consistent with some prior knowledge by reasoning over high-level concepts extracted from sub-symbolic inputs. It was recently shown that NeSy predictors are affected by reasoning shortcuts: they can attain high accuracy, but do so by leveraging concepts with unintended semantics, thus falling short of their promised advantages. Yet, a systematic characterization of reasoning shortcuts and of potential mitigation strategies is missing. This work fills this gap by characterizing them as unintended optima of the learning objective and identifying four key conditions behind their occurrence. Based on this, we derive several natural mitigation strategies, and analyze their efficacy both theoretically and empirically. Our analysis shows that reasoning shortcuts are difficult to deal with, casting doubt on the trustworthiness and interpretability of existing NeSy solutions.
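A minimal toy example of a reasoning shortcut (illustrative, not taken from the paper): if the prior knowledge states that the label is the XOR of two binary concepts, an extractor that systematically flips both concepts still satisfies the knowledge and matches every label, so a label-level objective cannot distinguish it from the intended model.

    import itertools

    # Prior knowledge: label = XOR of two binary concepts.
    def label(c1, c2):
        return c1 ^ c2

    intended = lambda c1, c2: (c1, c2)           # concepts with the intended semantics
    shortcut = lambda c1, c2: (1 - c1, 1 - c2)   # both concepts systematically flipped

    for c1, c2 in itertools.product([0, 1], repeat=2):
        assert label(*intended(c1, c2)) == label(*shortcut(c1, c2))  # labels always agree
        assert intended(c1, c2) != shortcut(c1, c2)                  # concepts never do

    print("the shortcut attains perfect label accuracy with wrong concepts on every input")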


Interpretability is in the Mind of the Beholder: A Causal Framework for Human-interpretable Representation Learning

arXiv.org Artificial Intelligence

Focus in Explainable AI is shifting from explanations defined in terms of low-level elements, such as input features, to explanations encoded in terms of interpretable concepts learned from data. How to reliably acquire such concepts is, however, still fundamentally unclear. An agreed-upon notion of concept interpretability is missing, with the result that concepts used by both post-hoc explainers and concept-based neural networks are acquired through a variety of mutually incompatible strategies. Critically, most of these neglect the human side of the problem: a representation is understandable only insofar as it can be understood by the human at the receiving end. The key challenge in Human-interpretable Representation Learning (HRL) is how to model and operationalize this human element. In this work, we propose a mathematical framework for acquiring interpretable representations suitable for both post-hoc explainers and concept-based neural networks. Our formalization of HRL builds on recent advances in causal representation learning and explicitly models a human stakeholder as an external observer. This allows us to derive a principled notion of alignment between the machine representation and the vocabulary of concepts understood by the human. In doing so, we link alignment and interpretability through a simple and intuitive name transfer game, and clarify the relationship between alignment and a well-known property of representations, namely disentanglement. We also show that alignment is linked to the issue of undesirable correlations among concepts, also known as concept leakage, and to content-style separation, all through a general information-theoretic reformulation of these properties. Our conceptualization aims to bridge the gap between the human and algorithmic sides of interpretability and establish a stepping stone for new research on human-interpretable representations.
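As a rough illustration of the concept leakage mentioned above (a sketch under simplifying assumptions, not the paper's information-theoretic reformulation): if the learned representation of one concept lets a simple probe predict a different, independent ground-truth concept well above chance, information about the latter has leaked into the former. The synthetic data and probe below are purely illustrative.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n = 2000

    # Two independent binary ground-truth concepts (say, "shape" and "color").
    g_a = rng.integers(0, 2, n)
    g_b = rng.integers(0, 2, n)

    # Learned representation for concept A that also encodes concept B (leakage).
    z_a = np.stack([g_a + 0.1 * rng.normal(size=n),
                    0.8 * g_b + 0.1 * rng.normal(size=n)], axis=1)

    # Leakage probe: predict concept B from concept A's representation.
    probe = LogisticRegression().fit(z_a[:1000], g_b[:1000])
    acc = probe.score(z_a[1000:], g_b[1000:])
    print(f"probe accuracy for concept B from z_A: {acc:.2f} (about 0.5 would mean no leakage)")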


Neuro-Symbolic Reasoning Shortcuts: Mitigation Strategies and their Limitations

arXiv.org Artificial Intelligence

Neuro-symbolic predictors learn a mapping from sub-symbolic inputs to higher-level concepts and then carry out (probabilistic) logical inference on this intermediate representation. This setup offers clear advantages in terms of consistency with symbolic prior knowledge, and is often believed to provide interpretability benefits in that, by virtue of complying with the knowledge, the learned concepts can be better understood by human stakeholders. However, it was recently shown that this setup is affected by reasoning shortcuts, whereby predictions attain high accuracy by leveraging concepts with unintended semantics, yielding poor out-of-distribution performance and compromising interpretability. In this short paper, we establish a formal link between reasoning shortcuts and the optima of the loss function, and identify situations in which reasoning shortcuts can arise. Based on this, we discuss limitations of natural mitigation strategies such as reconstruction and concept supervision.
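One of the mitigation strategies discussed here, concept supervision, amounts to adding a concept-level loss on (possibly few) annotated examples so that concept semantics is constrained directly rather than only through the label-level objective. The sketch below is a hedged rendering of that idea, not the paper's implementation; the function name and the weighting parameter alpha are illustrative.

    import torch
    import torch.nn.functional as F

    def nesy_loss(label_logits, y, concept_logits=None, c=None, alpha=0.5):
        """Label-level loss plus an optional concept-supervision term weighted by alpha."""
        loss = F.binary_cross_entropy_with_logits(label_logits, y)
        if concept_logits is not None and c is not None:
            loss = loss + alpha * F.binary_cross_entropy_with_logits(concept_logits, c)
        return loss

    # Dummy batch: 4 examples, 1 binary label, 2 binary concepts.
    label_logits = torch.randn(4)
    y = torch.randint(0, 2, (4,)).float()
    concept_logits = torch.randn(4, 2)
    c = torch.randint(0, 2, (4, 2)).float()
    print(nesy_loss(label_logits, y, concept_logits, c).item())

As the abstract notes, such strategies come with limitations of their own, so a term like this constrains concept semantics without guaranteeing that reasoning shortcuts are ruled out.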