Teso, Stefano
Beyond Topological Self-Explainable GNNs: A Formal Explainability Perspective
Azzolin, Steve, Malhotra, Sagar, Passerini, Andrea, Teso, Stefano
Self-Explainable Graph Neural Networks (SE-GNNs) are popular explainable-by-design GNNs, but the properties and the limitations of their explanations are not well understood. Our first contribution fills this gap by formalizing the explanations extracted by SE-GNNs, referred to as Trivial Explanations (TEs), and comparing them to established notions of explanations, namely Prime Implicant (PI) and faithful explanations. Our analysis reveals that TEs match PI explanations for a restricted but significant family of tasks. In general, however, they can be less informative than PI explanations and are surprisingly misaligned with widely accepted notions of faithfulness. Although faithful and PI explanations are informative, they are intractable to find, and we show that they can be prohibitively large. Motivated by this, we propose Dual-Channel GNNs that integrate a white-box rule extractor and a standard SE-GNN, adaptively combining both channels when the task benefits. Our experiments show that even a simple instantiation of Dual-Channel GNNs can recover succinct rules and perform on par with or better than widely used SE-GNNs. Our code can be found in the supplementary material.
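The code itself is deferred to the supplementary material; purely as an illustrative sketch (the module names and the gating mechanism below are hypothetical, not the authors' architecture), a dual-channel readout could mix a white-box rule score with an SE-GNN score through a learned gate:

```python
import torch
import torch.nn as nn

class DualChannelReadout(nn.Module):
    """Hypothetical sketch of a dual-channel combination, not the paper's model."""

    def __init__(self, embed_dim: int):
        super().__init__()
        # Stand-in for a white-box rule extractor scoring interpretable features.
        self.rule_channel = nn.Linear(embed_dim, 1)
        # Stand-in for a standard SE-GNN prediction head.
        self.gnn_channel = nn.Linear(embed_dim, 1)
        # Per-input gate deciding how much each channel contributes.
        self.gate = nn.Sequential(nn.Linear(embed_dim, 1), nn.Sigmoid())

    def forward(self, graph_repr: torch.Tensor) -> torch.Tensor:
        alpha = self.gate(graph_repr)  # in (0, 1)
        return alpha * self.rule_channel(graph_repr) + \
            (1 - alpha) * self.gnn_channel(graph_repr)

# Usage: scores = DualChannelReadout(64)(torch.randn(8, 64))
```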
Time Can Invalidate Algorithmic Recourse
De Toni, Giovanni, Teso, Stefano, Lepri, Bruno, Passerini, Andrea
Algorithmic Recourse (AR) aims to provide users with actionable steps to overturn unfavourable decisions made by machine learning predictors. However, these actions often take time to implement (e.g., getting a degree can take years), and their effects may vary as the world evolves. Thus, it is natural to ask for recourse that remains valid in a dynamic environment. In this paper, we study the robustness of algorithmic recourse over time by casting the problem through the lens of causality. We demonstrate theoretically and empirically that (even robust) causal AR methods can fail over time except in the - unlikely - case that the world is stationary. Even more critically, unless the world is fully deterministic, counterfactual AR cannot be solved optimally. To account for this, we propose a simple yet effective algorithm for temporal AR that explicitly accounts for time. Our simulations on synthetic and realistic datasets show how considering time produces solutions that are more resilient to potential trends in the data distribution.
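As a toy illustration of this failure mode (ours, not the paper's experimental setup), a recourse action that flips the decision today can become invalid once the decision boundary drifts:

```python
def approved(income: float, threshold: float) -> bool:
    """Toy predictor: the loan is approved iff income clears a threshold."""
    return income >= threshold

# Recourse suggested at t=0: raise income by 10 to clear today's threshold.
income, threshold_t0 = 45.0, 50.0
income_after_action = income + 10.0
print("valid at t=0:", approved(income_after_action, threshold_t0))   # True

# Completing the action takes 5 time steps; meanwhile the environment is
# non-stationary and the threshold drifts upward by 1.5 per step.
threshold_t5 = threshold_t0 + 5 * 1.5
print("valid at t=5:", approved(income_after_action, threshold_t5))   # False
```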
Logically Consistent Language Models via Neuro-Symbolic Integration
Calanzone, Diego, Teso, Stefano, Vergari, Antonio
Large language models (LLMs) are a promising avenue for natural language understanding and generation. However, current LLMs are far from reliable: they are prone to generating non-factual information and, more crucially, to contradicting themselves when prompted to reason about relations between entities of the world. These problems are currently addressed with large-scale fine-tuning or by delegating reasoning to external tools. In this work, we strive for a middle ground and introduce a loss based on neuro-symbolic reasoning that teaches an LLM to be logically consistent with an external set of facts and rules and improves self-consistency even when the LLM is fine-tuned on a limited set of facts. Our approach also makes it easy to combine multiple logical constraints at once in a principled way, delivering LLMs that are more consistent w.r.t. all constraints and improve over several baselines w.r.t. a given constraint. Moreover, our method allows LLMs to extrapolate to unseen but semantically similar factual knowledge, represented in unseen datasets, more systematically.
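As a minimal sketch of what such a consistency loss can look like (assuming independent truth probabilities; an illustration, not the paper's exact objective), one can directly penalize the probability that a logical rule is violated:

```python
import torch

def implication_consistency_loss(p_premise: torch.Tensor,
                                 p_conclusion: torch.Tensor) -> torch.Tensor:
    """Negative log-probability that 'premise -> conclusion' holds, treating
    the model's truth probabilities as independent Bernoullis:
    P(A -> B) = 1 - P(A) * (1 - P(B))."""
    p_rule_holds = 1.0 - p_premise * (1.0 - p_conclusion)
    return -torch.log(p_rule_holds + 1e-12).mean()

# Example: the model believes "eagles are birds" with probability 0.9 but
# "eagles can fly" only with probability 0.2; the loss penalizes this.
loss = implication_consistency_loss(torch.tensor([0.9]), torch.tensor([0.2]))
```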
Semantic Loss Functions for Neuro-Symbolic Structured Prediction
Ahmed, Kareem, Teso, Stefano, Morettin, Paolo, Di Liello, Luca, Ardino, Pierfrancesco, Gobbi, Jacopo, Liang, Yitao, Wang, Eric, Chang, Kai-Wei, Passerini, Andrea, Van den Broeck, Guy
Structured output prediction problems are ubiquitous in machine learning. The prominent approach leverages neural networks as powerful feature extractors, otherwise assuming the independence of the outputs. These outputs, however, jointly encode an object, e.g. a path in a graph, and are therefore related through the structure underlying the output space. We discuss the semantic loss, which injects knowledge about such structure, defined symbolically, into training by minimizing the network's violation of such dependencies, steering the network towards predicting distributions satisfying the underlying structure. At the same time, it is agnostic to the arrangement of the symbols, and depends only on the semantics expressed thereby, while also enabling efficient end-to-end training and inference. We also discuss key improvements and applications of the semantic loss. One limitation of the semantic loss is that it does not exploit the association of every data point with certain features certifying its membership in a target class. We should therefore prefer minimum-entropy distributions over valid structures, which we obtain by additionally minimizing the neuro-symbolic entropy. We empirically demonstrate the benefits of this more refined formulation. Moreover, the semantic loss is designed to be modular and can be combined with both discriminative and generative neural models. This is illustrated by integrating it into generative adversarial networks, yielding constrained adversarial networks, a novel class of deep generative models able to efficiently synthesize complex objects obeying the structure of the underlying domain.
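For reference, in its standard form the semantic loss for a propositional constraint α over Boolean variables X_1, ..., X_n with predicted marginals p_1, ..., p_n is proportional to the negative log-probability that independently sampling each variable yields an assignment satisfying α:

```latex
L^{s}(\alpha, \mathbf{p}) \;\propto\;
  -\log \sum_{\mathbf{x} \models \alpha}
    \prod_{i \,:\, \mathbf{x} \models X_i} p_i
    \prod_{i \,:\, \mathbf{x} \models \lnot X_i} (1 - p_i)
```

Minimizing this term pushes the predicted distribution to place its mass on assignments that satisfy the constraint.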
Towards Logically Consistent Language Models via Probabilistic Reasoning
Calanzone, Diego, Teso, Stefano, Vergari, Antonio
Large language models (LLMs) are a promising avenue for natural language understanding and generation tasks. However, current LLMs are far from reliable: they are prone to generating non-factual information and, more crucially, to contradicting themselves when prompted to reason about beliefs of the world. These problems are currently addressed with large-scale fine-tuning or by delegating consistent reasoning to external tools. In this work, we strive for a middle ground and introduce a training objective based on principled probabilistic reasoning that teaches an LLM to be consistent with external knowledge in the form of a set of facts and rules. Fine-tuning with our loss on a limited set of facts enables our LLMs to be more logically consistent than previous baselines and allows them to extrapolate to unseen but semantically similar factual knowledge more systematically.
Learning To Guide Human Decision Makers With Vision-Language Models
Banerjee, Debodeep, Teso, Stefano, Sayin, Burcu, Passerini, Andrea
There is increasing interest in developing AIs for assisting human decision-making in high-stakes tasks, such as medical diagnosis, for the purpose of improving decision quality and reducing cognitive strain. Mainstream approaches team up an expert with a machine learning model to which safer decisions are offloaded, thus letting the former focus on cases that demand their attention. This separation-of-responsibilities setup, however, is inadequate for high-stakes scenarios. On the one hand, the expert may end up over-relying on the machine's decisions due to anchoring bias, thus losing the human oversight that is increasingly being required by regulatory agencies to ensure trustworthy AI. On the other hand, the expert is left entirely unassisted on the (typically hardest) decisions on which the model abstained. As a remedy, we introduce learning to guide (LTG), an alternative framework in which - rather than taking control from the human expert - the machine provides guidance useful for decision making, and the human is entirely responsible for coming up with a decision. In order to ensure guidance is interpretable and task-specific, we develop SLOG, an approach for turning any vision-language model into a capable generator of textual guidance by leveraging a modicum of human feedback. Our empirical evaluation highlights the promise of SLOG on a challenging, real-world medical diagnosis task.
Neuro-Symbolic Continual Learning: Knowledge, Reasoning Shortcuts and Concept Rehearsal
Marconato, Emanuele, Bontempo, Gianpaolo, Ficarra, Elisa, Calderara, Simone, Passerini, Andrea, Teso, Stefano
We introduce Neuro-Symbolic Continual Learning (NeSy-CL), where a model has to solve a sequence of neuro-symbolic tasks, that is, it has to map sub-symbolic inputs to high-level concepts and compute predictions by reasoning consistently with prior knowledge. Our key observation is that neuro-symbolic tasks, although different, often share concepts whose semantics remains stable over time. Traditional approaches fall short: existing continual strategies ignore knowledge altogether, while stock neuro-symbolic architectures suffer from catastrophic forgetting. We show that leveraging prior knowledge by combining neuro-symbolic architectures with continual strategies does help avoid catastrophic forgetting, but also that doing so can yield models affected by reasoning shortcuts.
In NeSy-CL, as is common in neuro-symbolic (NeSy) prediction (Manhaeve et al., 2018; Xu et al., 2018; Giunchiglia & Lukasiewicz, 2020; Hoernle et al., 2022; Ahmed et al., 2022a), the machine is provided prior knowledge relating one or more target labels to symbolic, high-level concepts extracted from sub-symbolic data, and has to compute a prediction by reasoning over said concepts. The central challenge of NeSy-CL is that the data distribution and the knowledge may vary across tasks. For example, in medical diagnosis the knowledge may encode known relationships between possible symptoms and conditions, while different tasks are characterized by different distributions of X-ray scans, symptoms and conditions. The goal, as in continual learning (CL) (Parisi et al., 2019), is to obtain a model that attains high accuracy on new tasks without forgetting what it has already learned, under a limited storage budget.
Not All Neuro-Symbolic Concepts Are Created Equal: Analysis and Mitigation of Reasoning Shortcuts
Marconato, Emanuele, Teso, Stefano, Vergari, Antonio, Passerini, Andrea
Neuro-Symbolic (NeSy) predictive models hold the promise of improved compliance with given constraints, systematic generalization, and interpretability, as they allow inferring labels that are consistent with some prior knowledge by reasoning over high-level concepts extracted from sub-symbolic inputs. It was recently shown that NeSy predictors are affected by reasoning shortcuts: they can attain high accuracy but by leveraging concepts with unintended semantics, thus coming short of their promised advantages. Yet, a systematic characterization of reasoning shortcuts and of potential mitigation strategies is missing. This work fills this gap by characterizing them as unintended optima of the learning objective and identifying four key conditions behind their occurrence. Based on this, we derive several natural mitigation strategies, and analyze their efficacy both theoretically and empirically. Our analysis shows reasoning shortcuts are difficult to deal with, casting doubt on the trustworthiness and interpretability of existing NeSy solutions.
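As a concrete toy example of such an unintended optimum (ours, for illustration only): if the knowledge states that the label is the XOR of two Boolean concepts, an extractor that negates both concepts attains perfect label accuracy while getting the concept semantics exactly wrong:

```python
from itertools import product

# Prior knowledge: the label is the XOR of two Boolean concepts.
def label_from_concepts(c1: int, c2: int) -> int:
    return c1 ^ c2

# Intended extractor reads the true concepts; the "shortcut" extractor
# negates both of them, i.e., it learns unintended concept semantics.
intended = lambda g1, g2: (g1, g2)
shortcut = lambda g1, g2: (1 - g1, 1 - g2)

# Both extractors yield identical (perfect) labels on every input, so label
# supervision plus the knowledge alone cannot rule the shortcut out.
for g1, g2 in product([0, 1], repeat=2):
    assert label_from_concepts(*intended(g1, g2)) == \
           label_from_concepts(*shortcut(g1, g2))
```

Telling the two apart requires additional signal, such as concept supervision, which is among the mitigation strategies the paper analyzes.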
Interpretability is in the Mind of the Beholder: A Causal Framework for Human-interpretable Representation Learning
Marconato, Emanuele, Passerini, Andrea, Teso, Stefano
Focus in Explainable AI is shifting from explanations defined in terms of low-level elements, such as input features, to explanations encoded in terms of interpretable concepts learned from data. How to reliably acquire such concepts is, however, still fundamentally unclear. An agreed-upon notion of concept interpretability is missing, with the result that concepts used by both post-hoc explainers and concept-based neural networks are acquired through a variety of mutually incompatible strategies. Critically, most of these neglect the human side of the problem: a representation is understandable only insofar as it can be understood by the human at the receiving end. The key challenge in Human-interpretable Representation Learning (HRL) is how to model and operationalize this human element. In this work, we propose a mathematical framework for acquiring interpretable representations suitable for both post-hoc explainers and concept-based neural networks. Our formalization of HRL builds on recent advances in causal representation learning and explicitly models a human stakeholder as an external observer. This allows us to derive a principled notion of alignment between the machine representation and the vocabulary of concepts understood by the human. In doing so, we link alignment and interpretability through a simple and intuitive name transfer game, and clarify the relationship between alignment and a well-known property of representations, namely disentanglement. We also show that alignment is linked to the issue of undesirable correlations among concepts, also known as concept leakage, and to content-style separation, all through a general information-theoretic reformulation of these properties. Our conceptualization aims to bridge the gap between the human and algorithmic sides of interpretability and establish a stepping stone for new research on human-interpretable representations.
Learning to Guide Human Experts via Personalized Large Language Models
Banerjee, Debodeep, Teso, Stefano, Passerini, Andrea
Consider the problem of diagnosing lung pathologies based on the outcome of an X-ray scan. This task cannot be fully automated, for safety reasons, necessitating human supervision at some step of the process. At the same time, it is difficult for human experts to tackle it alone due to how sensitive the decision is, especially under time pressure. High-stakes tasks like this are natural candidates for hybrid decision making (HDM) approaches that support human decision makers by leveraging AI technology for the purpose of improving decision quality and lowering cognitive effort, without compromising control. Most current approaches to HDM rely on a learning to defer (LTD) setup, in which a machine learning model first assesses whether a decision can be taken in autonomy - i.e., it is either safe or can be answered with confidence - and defers it to a human partner whenever this is not the case [Madras et al., 2018, Mozannar and Sontag, 2020, Keswani et al., 2022, Verma and Nalisnick, 2022, Liu et al., 2022]. Other forms of HDM, like learning to complement [Wilder et al., 2021], prediction under human assistance [De et al., 2020], and algorithmic triage [Raghu et al., 2019, Okati et al., 2021] follow a similar pattern.
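As a generic sketch of the LTD pattern just described (not the method of any specific cited work), the machine answers only when its confidence clears a threshold and defers to the human otherwise:

```python
import numpy as np

def ltd_route(probs: np.ndarray, threshold: float = 0.9):
    """Generic learning-to-defer style routing: answer automatically when the
    top-class confidence exceeds the threshold, otherwise defer to the human."""
    confidence = float(probs.max())
    if confidence >= threshold:
        return "machine", int(probs.argmax())
    return "human", None

# A confident case is decided automatically; an ambiguous one is deferred.
print(ltd_route(np.array([0.97, 0.03])))  # ('machine', 0)
print(ltd_route(np.array([0.55, 0.45])))  # ('human', None)
```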