AITopics | Scardapane, Simone

Collaborating Authors

Scardapane, Simone


Mixture-of-Experts Graph Transformers for Interpretable Particle Collision Detection

Genovese, Donatella, Sgroi, Alessandro, Devoto, Alessio, Valentine, Samuel, Wood, Lennox, Sebastiani, Cristiano, Giagu, Stefano, D'Onofrio, Monica, Scardapane, Simone

arXiv.org Artificial Intelligence, Jan-8-2025

The Large Hadron Collider at CERN produces immense volumes of complex data from high-energy particle collisions, demanding sophisticated analytical techniques for effective interpretation. Neural Networks, including Graph Neural Networks, have shown promise in tasks such as event classification and object identification by representing collisions as graphs. However, while Graph Neural Networks excel in predictive accuracy, their "black box" nature often limits their interpretability, making it difficult to trust their decision-making processes. In this paper, we propose a novel approach that combines a Graph Transformer model with Mixture-of-Expert layers to achieve high predictive performance while embedding interpretability into the architecture. By leveraging attention maps and expert specialization, the model offers insights into its internal decision-making, linking predictions to physics-informed features. We evaluate the model on simulated events from the ATLAS experiment, focusing on distinguishing rare Supersymmetric signal events from Standard Model background. Our results highlight that the model achieves competitive classification accuracy while providing interpretable outputs that align with known physics, demonstrating its potential as a robust and transparent tool for high-energy physics data analysis. This approach underscores the importance of explainability in machine learning methods applied to high energy physics, offering a path toward greater trust in AI-driven discoveries.

artificial intelligence, machine learning, node, (18 more...)


2501.03432

Genre:

Research Report > New Finding (0.48)
Research Report > Promising Solution (0.34)

Industry: Transportation > Air (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


Task Singular Vectors: Reducing Task Interference in Model Merging

Gargiulo, Antonio Andrea, Crisostomi, Donato, Bucarelli, Maria Sofia, Scardapane, Simone, Silvestri, Fabrizio, Rodolà, Emanuele

arXiv.org Machine Learning, Jan-2-2025

Task Arithmetic has emerged as a simple yet effective method to merge models without additional training. However, by treating entire networks as flat parameter vectors, it overlooks key structural information and is susceptible to task interference. In this paper, we study task vectors at the layer level, focusing on task layer matrices and their singular value decomposition. In particular, we concentrate on the resulting singular vectors, which we refer to as Task Singular Vectors (TSV). Recognizing that layer task matrices are often low-rank, we propose TSV-Compress (TSV-C), a simple procedure that compresses them to 10% of their original size while retaining 99% of accuracy. We further leverage this low-rank space to define a new measure of task interference based on the interaction of singular vectors from different tasks. Building on these findings, we introduce TSV-Merge (TSV-M), a novel model merging approach that combines compression with interference reduction, significantly outperforming existing methods.
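The low-rank compression idea described in this abstract can be illustrated with a truncated singular value decomposition of a single layer's task matrix. The following is a minimal sketch using NumPy, not the authors' TSV-C implementation: the function names, the `keep_fraction` parameter, and the synthetic weights are all hypothetical, and interpreting "10% of their original size" as keeping 10% of the singular values is an assumption made only for this example.

```python
import numpy as np


def compress_task_matrix(w_finetuned, w_pretrained, keep_fraction=0.1):
    """Return the leading singular triplets of one layer's task matrix."""
    # Task matrix: per-layer difference between fine-tuned and pretrained weights.
    delta = w_finetuned - w_pretrained
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    # Keep only the top fraction of singular values (at least one).
    k = max(1, int(round(keep_fraction * len(s))))
    return u[:, :k], s[:k], vt[:k, :]


def reconstruct(u, s, vt):
    """Rebuild the low-rank approximation from the stored factors."""
    return (u * s) @ vt


# Synthetic example: a pretrained layer plus a rank-3 update (hypothetical data).
rng = np.random.default_rng(0)
w_pre = rng.normal(size=(64, 32))
low_rank_update = rng.normal(size=(64, 3)) @ rng.normal(size=(3, 32))
w_ft = w_pre + low_rank_update

u, s, vt = compress_task_matrix(w_ft, w_pre, keep_fraction=0.1)
rel_err = np.linalg.norm(low_rank_update - reconstruct(u, s, vt)) / np.linalg.norm(low_rank_update)
print(u.shape, s.shape, vt.shape, rel_err)
```

Because the synthetic update has rank 3 and three singular triplets are kept, the reconstruction error here is at the level of floating-point noise; real task matrices are only approximately low-rank, so some error would remain.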

artificial intelligence, machine learning, natural language, (18 more...)


2412.00081

Country:

North America > United States (0.28)
North America > Canada > Ontario (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


Goal-oriented Communications based on Recursive Early Exit Neural Networks

Pomponi, Jary, Merluzzi, Mattia, Devoto, Alessio, Mota, Mateus Pontes, Di Lorenzo, Paolo, Scardapane, Simone

arXiv.org Artificial Intelligence, Dec-27-2024

This paper presents a novel framework for goal-oriented semantic communications leveraging recursive early exit models. The proposed approach is built on two key components. First, we introduce an innovative early exit strategy that dynamically partitions computations, enabling samples to be offloaded to a server based on layer-wise recursive prediction dynamics that detect samples for which the confidence is not increasing fast enough over layers. Second, we develop a Reinforcement Learning-based online optimization framework that jointly determines early exit points, computation splitting, and offloading strategies, while accounting for wireless conditions, inference accuracy, and resource costs. Numerical evaluations in an edge inference scenario demonstrate the method's adaptability and effectiveness in striking an excellent trade-off between performance, latency, and resource efficiency.
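The early exit rule described in this abstract, offloading a sample when its confidence is not increasing fast enough over layers, can be sketched as a simple decision function. This is only an illustration under stated assumptions: the function name `decide`, the fixed thresholds `exit_threshold` and `min_gain`, and the fallback of offloading when no local exit is confident are all hypothetical. The paper determines these decisions with a Reinforcement Learning-based optimizer, which this sketch does not attempt to reproduce.

```python
def decide(confidences, exit_threshold=0.9, min_gain=0.05):
    """Walk through per-exit confidences and return (action, layer_index).

    action is "exit" when a local exit is confident enough, and "offload"
    when confidence grows too slowly or no local exit is confident.
    """
    previous = None
    for layer, conf in enumerate(confidences):
        # Confident enough: answer locally at this exit.
        if conf >= exit_threshold:
            return ("exit", layer)
        # Confidence is not increasing fast enough: send the sample to the server.
        if previous is not None and conf - previous < min_gain:
            return ("offload", layer)
        previous = conf
    # Assumed fallback: all local exits were used without reaching the threshold.
    return ("offload", len(confidences) - 1)


print(decide([0.3, 0.6, 0.95]))   # confidence rises quickly, local exit
print(decide([0.3, 0.32, 0.33]))  # confidence stalls, offload early
print(decide([0.2, 0.4, 0.6]))    # steady growth but never confident
```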

artificial intelligence, computation, machine learning, (16 more...)


2412.19587

Country: Europe (0.69)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)


How to Connect Speech Foundation Models and Large Language Models? What Matters and What Does Not

Verdini, Francesco, Melucci, Pierfrancesco, Perna, Stefano, Cariaggi, Francesco, Gaido, Marco, Papi, Sara, Mazurek, Szymon, Kasztelnik, Marek, Bentivogli, Luisa, Bratières, Sébastien, Merialdo, Paolo, Scardapane, Simone

arXiv.org Artificial Intelligence, Nov-8-2024

The remarkable performance achieved by Large Language Models (LLMs) has driven research efforts to leverage them for a wide range of tasks and input modalities. In speech-to-text (S2T) tasks, the emerging solution consists of projecting the output of the encoder of a Speech Foundational Model (SFM) into the LLM embedding space through an adapter module. However, no work has yet investigated how much the downstream-task performance depends on each component (SFM, adapter, LLM), nor whether the best design of the adapter depends on the chosen SFM and LLM. To fill this gap, we evaluate the combination of 5 adapter modules, 2 LLMs (Mistral and Llama), and 2 SFMs (Whisper and SeamlessM4T) on two widespread S2T tasks, namely Automatic Speech Recognition and Speech Translation. Our results demonstrate that the SFM plays a pivotal role in downstream performance, while the adapter choice has a moderate impact and depends on the SFM and LLM.

large language model, machine learning, natural language, (18 more...)


2409.17044

Country:

Europe (1.00)
North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)


Interpreting Temporal Graph Neural Networks with Koopman Theory

Guerra, Michele, Scardapane, Simone, Bianchi, Filippo Maria

arXiv.org Artificial Intelligence, Oct-17-2024

Spatiotemporal graph neural networks (STGNNs) have shown promising results in many domains, from forecasting to epidemiology. However, understanding the dynamics learned by these models and explaining their behaviour is significantly more complex than for models dealing with static data. Inspired by Koopman theory, which allows a simpler description of intricate, nonlinear dynamical systems, we introduce an explainability approach for temporal graphs. We present two methods to interpret the STGNN's decision process and identify the most relevant spatial and temporal patterns in the input for the task at hand. The first relies on dynamic mode decomposition (DMD), a Koopman-inspired dimensionality reduction method. The second relies on sparse identification of nonlinear dynamics (SINDy), a popular method for discovering governing equations, which we use for the first time as a general tool for explainability. We show how our methods can correctly identify interpretable features such as infection times and infected nodes in the context of dissemination processes.
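Dynamic mode decomposition, the first method named in this abstract, fits a linear operator to a sequence of snapshots and reads the dynamics off its eigenvalues and modes. The sketch below is the standard exact-DMD recipe in NumPy applied to a toy linear system, not the authors' pipeline for STGNN hidden states: the function name `dmd`, the `rank` parameter, and the 2x2 system are assumptions chosen for illustration.

```python
import numpy as np


def dmd(snapshots, rank):
    """Exact DMD of a (features x time) snapshot matrix, truncated to `rank`."""
    x, y = snapshots[:, :-1], snapshots[:, 1:]  # states and their successors
    u, s, vh = np.linalg.svd(x, full_matrices=False)
    u, s, vh = u[:, :rank], s[:rank], vh[:rank, :]
    v = vh.conj().T
    # Reduced operator: U^H Y V S^{-1} (dividing columns by s applies S^{-1}).
    a_tilde = (u.conj().T @ y @ v) / s
    eigvals, w = np.linalg.eig(a_tilde)
    # DMD modes lifted back to the original feature space.
    modes = ((y @ v) / s) @ w
    return eigvals, modes


# Toy linear system x_{t+1} = A x_t with eigenvalues 0.9 and 0.5 (hypothetical data).
a = np.array([[0.9, 0.2],
              [0.0, 0.5]])
states = [np.array([1.0, 1.0])]
for _ in range(9):
    states.append(a @ states[-1])
snapshots = np.stack(states, axis=1)  # shape (2, 10)

eigvals, modes = dmd(snapshots, rank=2)
print(np.sort(eigvals.real))
```

On this noise-free linear system the recovered eigenvalues match those of `a` up to floating-point error; eigenvalues with magnitude below one correspond to decaying patterns.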

data mining, machine learning, node, (17 more...)


2410.13469

Country: Europe (0.68)

Genre: Research Report (0.64)

Industry: Health & Medicine > Epidemiology (0.34)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)


Alice's Adventures in a Differentiable Wonderland -- Volume I, A Tour of the Land

Scardapane, Simone

arXiv.org Artificial Intelligence, Jul-4-2024

Neural networks surround us, in the form of large language models, speech transcription systems, molecular discovery algorithms, robotics, and much more. Stripped of anything else, neural networks are compositions of differentiable primitives, and studying them means learning how to program and how to interact with these models, a particular example of what is called differentiable programming. This primer is an introduction to this fascinating field imagined for someone, like Alice, who has just ventured into this strange differentiable wonderland. I overview the basics of optimizing a function via automatic differentiation, and a selection of the most common designs for handling sequences, graphs, texts, and audio. The focus is on an intuitive, self-contained introduction to the most important design techniques, including convolutional, attentional, and recurrent blocks, hoping to bridge the gap between theory and code (PyTorch and JAX) and leaving the reader capable of understanding some of the most advanced models out there, such as large language models (LLMs) and multimodal architectures.
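Optimizing a function via automatic differentiation, as mentioned in this abstract, can be illustrated without any framework. The sketch below uses forward-mode differentiation with dual numbers in plain Python and runs gradient descent on a one-dimensional quadratic. It is an assumption-laden toy: the `Dual` class, `grad`, `loss`, the learning rate, and the step count are all hypothetical, and the primer itself works with PyTorch and JAX, which are not used here.

```python
class Dual:
    """Number carrying a value and its derivative with respect to one input."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def lift(x):
        # Treat plain numbers as constants (derivative zero).
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __sub__(self, other):
        other = Dual.lift(other)
        return Dual(self.value - other.value, self.deriv - other.deriv)

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule.
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def grad(f, x):
    """Derivative of f at x, obtained by seeding the input derivative with 1."""
    return f(Dual(x, 1.0)).deriv


def loss(x):
    # Quadratic with its minimum at x = 3.
    return (x - 3) * (x - 3) + 1


x = 0.0
for _ in range(100):
    x -= 0.1 * grad(loss, x)  # plain gradient descent step
print(x)
```

Deep learning frameworks typically rely on reverse-mode differentiation, which scales better to functions with many inputs; forward mode is used here only because it fits in a few lines.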

large language model, machine learning, natural language, (23 more...)


2404.17625

Country:

Europe (0.45)
North America > United States (0.45)

Genre:

Overview (0.92)
Summary/Review (0.92)
Research Report > New Finding (0.67)
(2 more...)

Industry: Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(3 more...)


TopoBenchmarkX: A Framework for Benchmarking Topological Deep Learning

Telyatnikov, Lev, Bernardez, Guillermo, Montagna, Marco, Vasylenko, Pavlo, Zamzmi, Ghada, Hajij, Mustafa, Schaub, Michael T, Miolane, Nina, Scardapane, Simone, Papamarkou, Theodore

arXiv.org Artificial Intelligence, Jun-9-2024

This work introduces TopoBenchmarkX, a modular open-source library designed to standardize benchmarking and accelerate research in Topological Deep Learning (TDL). TopoBenchmarkX maps the TDL pipeline into a sequence of independent and modular components for data loading and processing, as well as model training, optimization, and evaluation. This modular organization provides flexibility for modifications and facilitates the adaptation and optimization of various TDL pipelines. A key feature of TopoBenchmarkX is that it allows for the transformation and lifting between topological domains. This makes it possible, for example, to obtain richer data representations and more fine-grained analyses by mapping the topology and features of a graph to higher-order topological domains such as simplicial and cell complexes. The range of applicability of TopoBenchmarkX is demonstrated by benchmarking several TDL architectures for various tasks and datasets.

artificial intelligence, machine learning, topological domain, (18 more...)


2406.06642

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Data Science (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Adaptive Semantic Token Selection for AI-native Goal-oriented Communications

Devoto, Alessio, Petruzzi, Simone, Pomponi, Jary, Di Lorenzo, Paolo, Scardapane, Simone

arXiv.org Artificial Intelligence, Apr-25-2024

In this paper, we propose a novel design for AI-native goal-oriented communications, exploiting transformer neural networks under dynamic inference constraints on bandwidth and computation. Transformers have become the standard architecture for pretraining large-scale vision and text models, and preliminary results have shown promising performance also in deep joint source-channel coding (JSCC). Here, we consider a dynamic model where communication happens over a channel with variable latency and bandwidth constraints. Leveraging recent works on conditional computation, we exploit the structure of the transformer blocks and the multihead attention operator to design a trainable semantic token selection mechanism that learns to select relevant tokens (e.g., image patches) from the input signal. This is done dynamically, on a per-input basis, with a rate that can be chosen as an additional input by the user. We show that our model improves over state-of-the-art token selection mechanisms, exhibiting high accuracy for a wide range of latency and bandwidth constraints, without the need for deploying multiple architectures tailored to each constraint. Last, but not least, the proposed token selection mechanism helps extract powerful semantics that are easy to understand and explain, paving the way for interpretable-by-design models for the next generation of AI-native communication systems.
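The budgeted selection step described in this abstract, keeping only the most relevant tokens at a user-chosen rate, can be sketched as a top-k filter over per-token scores. This is a simplified illustration, not the paper's trainable mechanism: the function name `select_tokens`, the rounding rule, and the example scores are hypothetical, and the scores are assumed to come from some learned relevance estimator that is not modeled here.

```python
import math


def select_tokens(scores, rate):
    """Return indices of the tokens to transmit, in their original order.

    scores: one relevance score per token (assumed to be produced upstream).
    rate: fraction of tokens to keep, in (0, 1].
    """
    if not 0.0 < rate <= 1.0:
        raise ValueError("rate must be in (0, 1]")
    # Token budget implied by the requested rate (at least one token).
    budget = max(1, math.ceil(rate * len(scores)))
    # Rank token indices from most to least relevant.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Keep the top tokens and restore positional order.
    return sorted(ranked[:budget])


patch_scores = [0.1, 0.7, 0.05, 0.9, 0.3, 0.6]  # hypothetical image-patch scores
print(select_tokens(patch_scores, rate=0.5))
print(select_tokens(patch_scores, rate=1.0))
```

Passing the rate as an argument mirrors the abstract's point that a single model can serve different bandwidth constraints without deploying one architecture per constraint.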

artificial intelligence, budget, machine learning, (18 more...)


2405.0233

Country: Europe > Italy (0.15)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)


Influence based explainability of brain tumors segmentation in multimodal Magnetic Resonance Imaging

Torda, Tommaso, Ciardiello, Andrea, Gargiulo, Simona, Grillo, Greta, Scardapane, Simone, Voena, Cecilia, Giagu, Stefano

arXiv.org Artificial Intelligence, Apr-5-2024

In recent years, Artificial Intelligence has emerged as a fundamental tool in medical applications. Despite this rapid development, deep neural networks remain black boxes that are difficult to explain, and this represents a major limitation for their use in clinical practice. We focus on the task of medical image segmentation, where most explainability methods proposed so far provide a visual explanation in terms of an input saliency map. The aim of this work is instead to extend, implement and test an influence-based explainability algorithm, TracIn, proposed originally for classification tasks, in a challenging clinical problem, i.e., multiclass segmentation of brain tumors in multimodal Magnetic Resonance Imaging. We verify the faithfulness of the proposed algorithm by linking the similarities of the latent representation of the network to the TracIn output. We further test the capacity of the algorithm to provide local and global explanations, and we suggest that it can be adopted as a tool to select the most relevant features used in the decision process. The method is generalizable to all semantic segmentation tasks where classes are mutually exclusive, which is the standard framework in these cases.

artificial intelligence, machine learning, segmentation, (17 more...)


2405.12222

Country: Europe (0.14)

Genre: Research Report (0.50)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)


Position Paper: Challenges and Opportunities in Topological Deep Learning

Papamarkou, Theodore, Birdal, Tolga, Bronstein, Michael, Carlsson, Gunnar, Curry, Justin, Gao, Yue, Hajij, Mustafa, Kwitt, Roland, Liò, Pietro, Di Lorenzo, Paolo, Maroulas, Vasileios, Miolane, Nina, Nasrin, Farzana, Ramamurthy, Karthikeyan Natesan, Rieck, Bastian, Scardapane, Simone, Schaub, Michael T., Veličković, Petar, Wang, Bei, Wang, Yusu, Wei, Guo-Wei, Zamzmi, Ghada

arXiv.org Machine Learning, Feb-13-2024

Topological deep learning (TDL) is a rapidly evolving field that uses topological features to understand and design deep learning models. This paper posits that TDL may complement graph representation learning and geometric deep learning by incorporating topological concepts, and can thus provide a natural choice for various machine learning settings. To this end, this paper discusses open problems in TDL, ranging from practical benefits to theoretical foundations. For each problem, it outlines potential solutions and future research opportunities.

Traditional machine learning often assumes that the observed data of interest are supported on a linear vector space and can be described by a set of feature vectors. However, there is growing awareness that, in many cases, this viewpoint is insufficient to describe several data within the real world. For example, molecules may be described more appropriately by graphs than feature vectors. Other examples include three-dimensional objects represented by meshes, as encountered in computer graphics and geometry processing, or data supported on top of a complex social network of interrelated actors. Hence, there has been an increased interest in importing concepts from geometry and topology into the usual machine learning pipelines to gain further ...

artificial intelligence, machine learning, survey article, (15 more...)


2402.08871

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Michigan > Ingham County (0.14)

Genre: Research Report > Promising Solution (0.87)

Industry:

Information Technology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Education (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
