AITopics | Silvestri, Fabrizio

Collaborating Authors

Silvestri, Fabrizio


The Majority Vote Paradigm Shift: When Popular Meets Optimal

Purificato, Antonio, Bucarelli, Maria Sofia, Nelakanti, Anil Kumar, Bacciu, Andrea, Silvestri, Fabrizio, Mantrach, Amin

arXiv.org Machine Learning, Feb-18-2025

Reliably labelling data typically requires annotations from multiple human workers. However, humans are far from being perfect. Hence, it is a common practice to aggregate labels gathered from multiple annotators to make a more confident estimate of the true label. Among many aggregation methods, the simple and well-known Majority Vote (MV) selects the class label polling the highest number of votes. However, despite its importance, the optimality of MV's label aggregation has not been extensively studied. We address this gap in our work by characterising the conditions under which MV achieves the theoretically optimal lower bound on label estimation error. Our results capture the tolerable limits on annotation noise under which MV can optimally recover labels for a given class distribution. This certificate of optimality provides a more principled approach to model selection for label aggregation as an alternative to otherwise inefficient practices that sometimes include higher experts, gold labels, etc., that are all marred by the same human uncertainty despite huge time and monetary costs. Experiments on both synthetic and real-world data corroborate our theoretical findings.
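The aggregation rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `majority_vote`, the toy labels, and the tie-breaking rule (first label seen wins) are assumptions made for illustration only.

```python
from collections import Counter


def majority_vote(annotations):
    """Return the label that received the most votes for a single item.

    Ties are broken by first occurrence, an arbitrary choice for this sketch.
    """
    if not annotations:
        raise ValueError("need at least one annotation")
    counts = Counter(annotations)
    best = max(counts.values())
    # Walk the votes in order so that a tie resolves to the earliest label.
    for label in annotations:
        if counts[label] == best:
            return label


# Hypothetical annotations: one inner list per item, one entry per annotator.
votes_per_item = [
    ["cat", "dog", "cat"],
    ["dog", "dog", "cat"],
    ["cat", "dog"],  # tie, resolved to the first label seen
]
aggregated = [majority_vote(votes) for votes in votes_per_item]
```

The paper's contribution concerns when this simple rule is optimal; the sketch only shows the rule itself.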

data mining, machine learning, natural language, (17 more...)

arXiv.org Machine Learning

2502.12581

Country:

Europe (0.45)
Asia (0.45)
North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Communications (0.93)
(4 more...)


COMBINEX: A Unified Counterfactual Explainer for Graph Neural Networks via Node Feature and Structural Perturbations

Giorgi, Flavio, Silvestri, Fabrizio, Tolomei, Gabriele

arXiv.org Artificial Intelligence, Feb-14-2025

Counterfactual explanations have emerged as a powerful tool to unveil the opaque decision-making processes of graph neural networks (GNNs). However, existing techniques primarily focus on edge modifications, often overlooking the crucial role of node feature perturbations in shaping model predictions. To address this limitation, we propose COMBINEX, a novel GNN explainer that generates counterfactual explanations for both node and graph classification tasks. Unlike prior methods, which treat structural and feature-based changes independently, COMBINEX optimally balances modifications to edges and node features by jointly optimizing these perturbations. This unified approach ensures minimal yet effective changes required to flip a model's prediction, resulting in realistic and interpretable counterfactuals. Additionally, COMBINEX seamlessly handles both continuous and discrete node features, enhancing its versatility across diverse datasets and GNN architectures. Extensive experiments on real-world datasets and various GNN architectures demonstrate the effectiveness and robustness of our approach over existing baselines.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2502.10111

Country:

North America > United States (0.68)
Europe (0.46)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.68)
Health & Medicine > Therapeutic Area > Immunology (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)


Generalizability through Explainability: Countering Overfitting with Counterfactual Examples

Giorgi, Flavio, Veglianti, Fabiano, Silvestri, Fabrizio, Tolomei, Gabriele

arXiv.org Artificial Intelligence, Feb-13-2025

Overfitting is a well-known issue in machine learning that occurs when a model struggles to generalize its predictions to new, unseen data beyond the scope of its training set. Traditional techniques to mitigate overfitting include early stopping, data augmentation, and regularization. In this work, we demonstrate that the degree of overfitting of a trained model is correlated with the ability to generate counterfactual examples. The higher the overfitting, the easier it will be to find a valid counterfactual example for a randomly chosen input data point. Therefore, we introduce CF-Reg, a novel regularization term in the training loss that controls overfitting by ensuring enough margin between each instance and its corresponding counterfactual. Experiments conducted across multiple datasets and models show that our counterfactual regularizer generally outperforms existing regularization techniques.

artificial intelligence, counterfactual example, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2502.09193

Country: North America > United States (0.95)

Genre: Research Report > New Finding (0.69)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)


Task Singular Vectors: Reducing Task Interference in Model Merging

Gargiulo, Antonio Andrea, Crisostomi, Donato, Bucarelli, Maria Sofia, Scardapane, Simone, Silvestri, Fabrizio, Rodolà, Emanuele

arXiv.org Machine Learning, Jan-2-2025

Task Arithmetic has emerged as a simple yet effective method to merge models without additional training. However, by treating entire networks as flat parameter vectors, it overlooks key structural information and is susceptible to task interference. In this paper, we study task vectors at the layer level, focusing on task layer matrices and their singular value decomposition. In particular, we concentrate on the resulting singular vectors, which we refer to as Task Singular Vectors (TSV). Recognizing that layer task matrices are often low-rank, we propose TSV-Compress (TSV-C), a simple procedure that compresses them to 10% of their original size while retaining 99% of accuracy. We further leverage this low-rank space to define a new measure of task interference based on the interaction of singular vectors from different tasks. Building on these findings, we introduce TSV-Merge (TSV-M), a novel model merging approach that combines compression with interference reduction, significantly outperforming existing methods.
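For readers unfamiliar with the baseline the abstract builds on, here is a minimal sketch of task arithmetic on flat parameter vectors, as commonly described: a task vector is the difference between fine-tuned and pretrained weights, and merging adds scaled task vectors back to the pretrained weights. The function names, the scaling factor `alpha`, and the toy weights are assumptions for illustration; the sketch does not implement TSV-C or TSV-M.

```python
def task_vector(pretrained, finetuned):
    """Difference between fine-tuned and pretrained weights (flat lists)."""
    return [f - p for p, f in zip(pretrained, finetuned)]


def merge(pretrained, finetuned_models, alpha=1.0):
    """Add the scaled task vector of every fine-tuned model to the base weights."""
    merged = list(pretrained)
    for finetuned in finetuned_models:
        tv = task_vector(pretrained, finetuned)
        merged = [m + alpha * t for m, t in zip(merged, tv)]
    return merged


# Hypothetical three-parameter "models".
theta_0 = [0.0, 1.0, 2.0]  # pretrained
theta_a = [0.5, 1.0, 2.0]  # fine-tuned on task A
theta_b = [0.0, 1.0, 1.0]  # fine-tuned on task B
merged = merge(theta_0, [theta_a, theta_b], alpha=1.0)
```

Because every layer is flattened into one list here, any structure within a layer's weight matrix is invisible to the merge, which is the limitation the abstract points to.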

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Machine Learning

2412.00081

Country:

North America > United States (0.28)
North America > Canada > Ontario (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


Robustness of Graph Classification: failure modes, causes, and noise-resistant loss in Graph Neural Networks

Wani, Farooq Ahmad, Bucarelli, Maria Sofia, Di Francesco, Andrea Giuseppe, Pryymak, Oleksandr, Silvestri, Fabrizio

arXiv.org Artificial Intelligence, Dec-11-2024

Graph Neural Networks (GNNs) are powerful at solving graph classification tasks, yet applied problems often contain noisy labels. In this work, we study GNN robustness to label noise and demonstrate GNN failure modes that arise when models struggle to generalise on low-order graphs, under low label coverage, or when a model is over-parameterized. We establish both empirical and theoretical links between GNN robustness and the reduction of the total Dirichlet Energy of learned node representations, which encapsulates the hypothesized GNN smoothness inductive bias. Finally, we introduce two training strategies to enhance GNN robustness: (1) by incorporating a novel inductive bias in the weight matrices through the removal of negative eigenvalues, connected to Dirichlet Energy minimization; (2) by extending to GNNs a loss penalty that promotes learned smoothness. Importantly, neither approach negatively impacts performance in noise-free settings, supporting our hypothesis that the source of GNNs' robustness is their smoothness inductive bias.
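As a rough illustration of the smoothness measure mentioned in the abstract, the sketch below computes an unnormalized Dirichlet energy: the sum, over undirected edges, of the squared Euclidean distance between the two endpoint representations. The paper may use a degree-normalized or otherwise scaled variant; the function name and toy graph are assumptions.

```python
def dirichlet_energy(features, edges):
    """Sum of squared distances between node features across each edge.

    features: list of per-node feature lists.
    edges: list of (i, j) index pairs, each undirected edge listed once.
    """
    energy = 0.0
    for i, j in edges:
        energy += sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
    return energy


# Hypothetical path graph 0 - 1 - 2 with two-dimensional node features.
edges = [(0, 1), (1, 2)]
smooth = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]   # identical neighbours
rough = [[1.0, 0.0], [-1.0, 0.0], [1.0, 0.0]]   # neighbours disagree

smooth_energy = dirichlet_energy(smooth, edges)
rough_energy = dirichlet_energy(rough, edges)
```

Low energy means neighbouring nodes carry similar representations, which is the sense of "smoothness" used above.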

artificial intelligence, machine learning, noise, (14 more...)

arXiv.org Artificial Intelligence

2412.08419

Country: Europe > United Kingdom > England (0.28)

Genre: Research Report > New Finding (0.93)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)


ATM: Improving Model Merging by Alternating Tuning and Merging

Zhou, Luca, Solombrino, Daniele, Crisostomi, Donato, Bucarelli, Maria Sofia, Silvestri, Fabrizio, Rodolà, Emanuele

arXiv.org Artificial Intelligence, Nov-6-2024

Model merging has recently emerged as a cost-efficient paradigm for multi-task learning. Among current approaches, task arithmetic stands out for its simplicity and effectiveness. In this paper, we motivate the effectiveness of task vectors by linking them to multi-task gradients. We show that in a single-epoch scenario, task vectors are mathematically equivalent to the gradients obtained via gradient descent in a multi-task setting, and still approximate these gradients in subsequent epochs. Furthermore, we show that task vectors perform optimally when equality is maintained, and their effectiveness is largely driven by the first epoch's gradient. Building on this insight, we propose viewing model merging as a single step in an iterative process that Alternates between Tuning and Merging (ATM). This method acts as a bridge between model merging and multi-task gradient descent, achieving state-of-the-art results with the same data and computational requirements. We extensively evaluate ATM across diverse settings, achieving up to 20% higher accuracy in computer vision and NLP tasks, compared to the best baselines. Finally, we provide both empirical and theoretical support for its effectiveness, demonstrating increased orthogonality between task vectors and proving that ATM minimizes an upper bound on the loss obtained by jointly finetuning all tasks.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2411.03055

Country: North America > United States > Maryland (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)


Beyond position: how rotary embeddings shape representations and memory in autoregressive transformers

Ruscio, Valeria, Silvestri, Fabrizio

arXiv.org Artificial Intelligence, Oct-25-2024

Rotary Positional Embeddings (RoPE) enhance positional encoding in Transformer models, yet their full impact on model dynamics remains underexplored. This paper studies how RoPE introduces position-dependent rotations, causing phase shifts in token embeddings that influence higher-frequency components within the model's internal representations. Through spectral analysis, we demonstrate that RoPE's rotation matrices induce oscillatory behaviors in embeddings, affecting information retention across layers and shaping temporal modeling capabilities. We show that activation functions in feed-forward networks interact with RoPE-modulated embeddings to generate harmonics, leading to constructive or destructive interference based on phase alignment. Our findings reveal that phase alignment amplifies activations and sharpens attention, while misalignment weakens activations and disrupts focus on positional patterns. This study underscores the importance of frequency components as intrinsic elements of model behavior, offering new insights beyond traditional analyses.
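The position-dependent rotation the abstract refers to can be sketched as follows. This is a minimal illustration of the standard RoPE formulation, not the paper's analysis code: adjacent dimensions are paired and each pair is rotated by an angle proportional to the token position. The pairing convention, the default `base` of 10000, and the function names are assumptions (some implementations pair the first half of the vector with the second half instead).

```python
import math


def rope(vec, position, base=10000.0):
    """Rotate each adjacent pair of dimensions by a position-dependent angle."""
    d = len(vec)
    if d % 2:
        raise ValueError("dimension must be even")
    out = []
    for k in range(d // 2):
        # Pair k rotates at frequency base^(-2k/d): early pairs rotate fastest.
        angle = position * base ** (-2 * k / d)
        c, s = math.cos(angle), math.sin(angle)
        x, y = vec[2 * k], vec[2 * k + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


q = [1.0, 2.0, 3.0, 4.0]
k = [0.5, -1.0, 2.0, 0.0]

# The attention score between rotated vectors depends only on the offset (here 2).
score_a = dot(rope(q, 5), rope(k, 3))
score_b = dot(rope(q, 2), rope(k, 0))
```

The two scores agree up to floating-point error, which is the relative-position property that makes the rotations act as phase shifts between tokens.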

information, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2410.18067

Genre: Research Report > New Finding (0.49)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)


Eco-Aware Graph Neural Networks for Sustainable Recommendations

Purificato, Antonio, Silvestri, Fabrizio

arXiv.org Artificial Intelligence, Oct-12-2024

Recommender systems play a crucial role in alleviating information overload by providing personalized recommendations tailored to users' preferences and interests. Recently, Graph Neural Networks (GNNs) have emerged as a promising approach for recommender systems, leveraging their ability to effectively capture complex relationships and dependencies between users and items by representing them as nodes in a graph structure. In this study, we investigate the environmental impact of GNN-based recommender systems, an aspect that has been largely overlooked in the literature. Specifically, we conduct a comprehensive analysis of the carbon emissions associated with training and deploying GNN models for recommendation tasks. We evaluate the energy consumption and carbon footprint of different GNN architectures and configurations, considering factors such as model complexity, training duration, hardware specifications and embedding size. By addressing the environmental impact of resource-intensive algorithms in recommender systems, this study contributes to the ongoing efforts towards sustainable and responsible artificial intelligence, promoting the development of eco-friendly recommendation technologies that balance performance and environmental considerations.

artificial intelligence, dataset, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2410.09514

Country:

Europe (1.00)
Asia (0.69)
North America > United States (0.33)

Genre: Research Report > New Finding (1.00)

Industry:

Law > Environmental Law (0.58)
Energy > Oil & Gas (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)


Natural Language Counterfactual Explanations for Graphs Using Large Language Models

Giorgi, Flavio, Campagnano, Cesare, Silvestri, Fabrizio, Tolomei, Gabriele

arXiv.org Artificial Intelligence, Oct-11-2024

Explainable Artificial Intelligence (XAI) has emerged as a critical area of research to unravel the opaque inner logic of (deep) machine learning models. Among the various XAI techniques proposed in the literature, counterfactual explanations stand out as one of the most promising approaches. However, these "what-if" explanations are frequently complex and technical, making them difficult for non-experts to understand and, more broadly, challenging for humans to interpret. To bridge this gap, in this work, we exploit the power of open-source Large Language Models to generate natural language explanations when prompted with valid counterfactual instances produced by state-of-the-art explainers for graph-based models. Experiments across several graph datasets and counterfactual explainers show that our approach effectively produces accurate natural language representations of counterfactual instances, as demonstrated by key performance metrics.

explanation, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2410.09295

Country: Europe (0.69)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Industry:

Government (1.00)
Information Technology > Security & Privacy (0.95)
Law (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)


A Tale of Trust and Accuracy: Base vs. Instruct LLMs in RAG Systems

Cuconasu, Florin, Trappolini, Giovanni, Tonellotto, Nicola, Silvestri, Fabrizio

arXiv.org Artificial Intelligence, Jun-21-2024

Retrieval Augmented Generation (RAG) represents a significant advancement in artificial intelligence combining a retrieval phase with a generative phase, with the latter typically being powered by large language models (LLMs). The current common practices in RAG involve using "instructed" LLMs, which are fine-tuned with supervised training to enhance their ability to follow instructions and are aligned with human preferences using state-of-the-art techniques. Contrary to popular belief, our study demonstrates that base models outperform their instructed counterparts in RAG tasks by 20% on average under our experimental settings. This finding challenges the prevailing assumptions about the superiority of instructed LLMs in RAG applications. Further investigations reveal a more nuanced situation, questioning fundamental aspects of RAG and suggesting the need for broader discussions on the topic; or, as Fromm would have it, "Seldom is a glance at the statistics enough to understand the meaning of the figures".

large language model, llama 2, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2406.14972

Country:

Asia (0.68)
North America > United States > New York (0.14)

Genre: Research Report > Promising Solution (0.34)

Industry:

Leisure & Entertainment (1.00)
Media > Film (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
