
 Bacciu, Andrea


Monte Carlo Temperature: a robust sampling strategy for LLM's uncertainty quantification methods

arXiv.org Artificial Intelligence

Uncertainty quantification (UQ) in Large Language Models (LLMs) is essential for their safe and reliable deployment, particularly in critical applications where incorrect outputs can have serious consequences. Current UQ methods typically rely on querying the model multiple times using non-zero temperature sampling to generate diverse outputs for uncertainty estimation. However, the impact of selecting a given temperature parameter is understudied, and our analysis reveals that temperature plays a fundamental role in the quality of uncertainty estimates. The conventional approach of identifying optimal temperature values requires expensive hyperparameter optimization (HPO) that must be repeated for each new model-dataset combination. We propose Monte Carlo Temperature (MCT), a robust sampling strategy that eliminates the need for temperature calibration. We show that: 1) MCT provides more robust uncertainty estimates across a wide range of temperatures, 2) MCT improves the performance of UQ methods by replacing fixed-temperature strategies that do not rely on HPO, and 3) MCT achieves statistical parity with oracle temperatures, which represent the ideal outcome of a well-tuned but computationally expensive HPO process. These findings demonstrate that effective UQ can be achieved without the computational burden of temperature parameter calibration.
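
The core of MCT can be sketched in a few lines: instead of committing to one tuned temperature, each query draws its own. Below is a minimal Python sketch under stated assumptions; the hypothetical generate(prompt, temperature) wrapper, the uniform sampling range, and the entropy-based uncertainty score are illustrative choices, not the paper's exact recipe.

```python
import math
import random
from collections import Counter

def mct_uncertainty(generate, prompt, n_samples=10, t_low=0.1, t_high=2.0, seed=0):
    """Estimate predictive uncertainty by sampling the temperature itself.

    `generate(prompt, temperature)` is a hypothetical wrapper around any
    LLM sampling API; it returns one decoded answer string. Instead of a
    single tuned temperature, each draw uses a fresh Monte Carlo sample
    from [t_low, t_high].
    """
    rng = random.Random(seed)
    answers = []
    for _ in range(n_samples):
        t = rng.uniform(t_low, t_high)  # Monte Carlo draw over temperature
        answers.append(generate(prompt, temperature=t))

    # Simple uncertainty score: normalized entropy of the answer distribution
    # (one of several scores a UQ method could compute over these samples).
    counts = Counter(answers)
    probs = [c / n_samples for c in counts.values()]
    entropy = -sum(p * math.log(p) for p in probs)
    max_entropy = math.log(n_samples)
    return entropy / max_entropy if max_entropy > 0 else 0.0
```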


The Majority Vote Paradigm Shift: When Popular Meets Optimal

arXiv.org Machine Learning

Reliably labelling data typically requires annotations from multiple human workers. However, humans are far from perfect. Hence, it is common practice to aggregate labels gathered from multiple annotators to make a more confident estimate of the true label. Among many aggregation methods, the simple and well-known Majority Vote (MV) selects the class label that polls the highest number of votes. However, despite its importance, the optimality of MV's label aggregation has not been extensively studied. We address this gap by characterising the conditions under which MV achieves the theoretically optimal lower bound on label estimation error. Our results capture the tolerable limits on annotation noise under which MV can optimally recover labels for a given class distribution. This certificate of optimality provides a more principled approach to model selection for label aggregation, as an alternative to otherwise inefficient practices (e.g., resorting to senior experts or gold labels) that are marred by the same human uncertainty despite large time and monetary costs. Experiments on both synthetic and real-world data corroborate our theoretical findings.
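
For concreteness, MV aggregation itself fits in a few lines over annotator votes; the minimal sketch below breaks ties by first occurrence, an arbitrary but deterministic rule not prescribed by the paper.

```python
from collections import Counter

def majority_vote(annotations):
    """Aggregate noisy labels from multiple annotators for one item.

    `annotations` is a list of class labels, one per annotator. Ties are
    broken by first occurrence (Counter preserves insertion order among
    equal counts in Python 3.7+).
    """
    label, _ = Counter(annotations).most_common(1)[0]
    return label

# Example: three annotators agree, one disagrees -> MV recovers the majority label.
print(majority_vote(["cat", "cat", "dog", "cat"]))  # cat
```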


Handling Ontology Gaps in Semantic Parsing

arXiv.org Artificial Intelligence

The majority of Neural Semantic Parsing (NSP) models are developed under the assumption that there are no concepts outside the ones such models can represent with their target symbols (closed-world assumption). This assumption leads such models to generate hallucinated outputs rather than admit their lack of knowledge. Hallucinations can lead to wrong or potentially offensive responses to users. Hence, a mechanism to prevent this behavior is crucial to building trusted NSP-based Question Answering agents. To that end, we propose the Hallucination Simulation Framework (HSF), a general setting for stimulating and analyzing NSP model hallucinations. The framework can be applied to any NSP task with a closed ontology. Using the proposed framework and KQA Pro as the benchmark dataset, we assess state-of-the-art techniques for hallucination detection. We then present a novel hallucination detection strategy that exploits the computational graph of the NSP model to detect hallucinations in the presence of ontology gaps and out-of-domain utterances, and to recognize NSP errors, improving the F1-Score by ~21%, ~24%, and ~1%, respectively. This is the first work in closed-ontology NSP that addresses the problem of recognizing ontology gaps. We release our code and checkpoints at https://github.com/amazon-science/handling-ontology-gaps-in-semantic-parsing.
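
The paper's detector operates on the NSP model's computational graph; as a simplified stand-in, the sketch below flags hallucinations from decoder confidence signals alone. The feature set, the toy training data, and the logistic-regression detector are all illustrative assumptions, not the paper's method.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def confidence_features(token_logprobs):
    """Summarize how confident the parser was while decoding a logical form.

    `token_logprobs` is the per-token log-probability sequence of the
    predicted parse; hallucinated parses tend to be decoded with lower
    and more erratic confidence.
    """
    lp = np.asarray(token_logprobs)
    return np.array([lp.mean(), lp.min(), lp.std()])

# Hypothetical training data: parses labeled 1 = hallucination
# (concept missing from the ontology), 0 = valid parse.
X = np.stack([
    confidence_features([-0.1, -0.2, -0.1]),  # confident decode -> likely valid
    confidence_features([-2.5, -4.0, -1.8]),  # erratic, low confidence -> likely hallucination
])
y = np.array([0, 1])

detector = LogisticRegression().fit(X, y)
print(detector.predict(confidence_features([-0.3, -0.5, -0.2]).reshape(1, -1)))
```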


RRAML: Reinforced Retrieval Augmented Machine Learning

arXiv.org Artificial Intelligence

The emergence of large language models (LLMs) has revolutionized machine learning and related fields, showcasing remarkable abilities in comprehending, generating, and manipulating human language. However, their conventional usage through API-based text prompt submissions imposes certain limitations in terms of context constraints and external source availability. To address these challenges, we propose a novel framework called Reinforced Retrieval Augmented Machine Learning (RRAML). RRAML integrates the reasoning capabilities of LLMs with supporting information retrieved by a purpose-built retriever from a vast user-provided database. By leveraging recent advancements in reinforcement learning, our method effectively addresses several critical challenges. First, it circumvents the need for access to LLM gradients. Second, it alleviates the burden of retraining LLMs for specific tasks, which is often impractical or impossible due to restricted access to the model and the computational cost involved. Additionally, we seamlessly link the retriever's task with the reasoner, mitigating hallucinations and reducing the retrieval of irrelevant and potentially damaging documents. We believe that the research agenda outlined in this paper has the potential to profoundly impact the field of AI, democratizing access to and utilization of LLMs for a wide range of entities.
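
Because the LLM is only reachable through an API, the retriever cannot be trained by backpropagating through it; a policy-gradient estimator sidesteps this. The sketch below shows a REINFORCE-style update under assumed components: the linear scorer, the precomputed document embeddings, and the placeholder reward_fn standing in for LLM feedback are illustrative, not the paper's exact setup.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
num_docs, dim = 100, 32
doc_embs = torch.randn(num_docs, dim)   # assumed precomputed document embeddings
scorer = torch.nn.Linear(dim, dim)      # maps a query embedding into document space
optimizer = torch.optim.Adam(scorer.parameters(), lr=1e-3)

def reward_fn(doc_idx: int) -> float:
    # Placeholder: in an RRAML-style loop this would prompt the LLM with the
    # retrieved document and score the generated answer (e.g., exact match).
    return 1.0 if doc_idx % 7 == 0 else 0.0

query_emb = torch.randn(dim)
for step in range(100):
    logits = doc_embs @ scorer(query_emb)        # retrieval scores for all docs
    dist = torch.distributions.Categorical(F.softmax(logits, dim=-1))
    idx = dist.sample()                          # sample a document to retrieve
    reward = reward_fn(idx.item())               # black-box LLM feedback
    loss = -dist.log_prob(idx) * reward          # REINFORCE: no gradient through the LLM
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```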


Fauno: The Italian Large Language Model that will leave you senza parole!

arXiv.org Artificial Intelligence

This paper presents Fauno, the first and largest open-source Italian conversational Large Language Model (LLM). Our goal with Fauno is to democratize the study of LLMs in Italian, demonstrating that obtaining a fine-tuned conversational bot with a single GPU is possible. In addition, we release a collection of datasets for conversational AI in Italian. The datasets on which we fine-tuned Fauno cover various topics, including general question answering, computer science, and medical questions.
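
One common way to fit conversational fine-tuning on a single GPU is low-rank adaptation; the sketch below uses the Hugging Face peft library. The abstract does not state Fauno's exact recipe, so the base checkpoint and LoRA hyperparameters here are assumptions for illustration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "huggyllama/llama-7b"  # illustrative base checkpoint, not necessarily Fauno's
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base, torch_dtype=torch.float16, device_map="auto"
)

# Train only small low-rank adapters; the frozen half-precision base model
# keeps the memory footprint within a single GPU.
lora = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% of all parameters
```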


Integrating Item Relevance in Training Loss for Sequential Recommender Systems

arXiv.org Artificial Intelligence

Sequential Recommender Systems (SRSs) are a popular type of recommender system that learns from a user's history to predict the next item they are likely to interact with. However, user interactions can be affected by noise stemming from account sharing, inconsistent preferences, or accidental clicks. To address this issue, we (i) propose a new evaluation protocol that takes multiple future items into account and (ii) introduce a novel relevance-aware loss function that trains an SRS on multiple future items to make it more robust to noise. Our relevance-aware models obtain improvements of ~1.2% in NDCG@10 and ~0.88% under the traditional evaluation protocol, while under the new evaluation protocol the improvement is ~1.63% in NDCG@10 and ~1.5% in HR w.r.t. the best-performing models.
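
A relevance-aware loss of this kind can be written as a weighted cross-entropy over the next k items; the sketch below uses geometrically decaying weights as one plausible relevance scheme (the paper's exact weighting may differ).

```python
import torch
import torch.nn.functional as F

def relevance_aware_loss(logits, future_items, relevance):
    """Cross-entropy over several future items, weighted by relevance.

    logits:       (batch, num_items) next-item scores from the SRS
    future_items: (batch, k) the next k items the user interacted with
    relevance:    (batch, k) weights, e.g. decaying with temporal distance,
                  so nearer (more relevant) future items dominate the loss
    """
    log_probs = F.log_softmax(logits, dim=-1)
    picked = log_probs.gather(1, future_items)          # (batch, k)
    weights = relevance / relevance.sum(dim=1, keepdim=True)
    return -(weights * picked).sum(dim=1).mean()

# Toy usage: k=3 future items, weights decaying geometrically.
logits = torch.randn(4, 1000)
future = torch.randint(0, 1000, (4, 3))
rel = torch.tensor([1.0, 0.5, 0.25]).expand(4, 3)
print(relevance_aware_loss(logits, future, rel))
```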


Renormalized Graph Neural Networks

arXiv.org Artificial Intelligence

Graph Neural Networks (GNNs) have become essential for studying complex data, particularly when represented as graphs. Their value is underpinned by their ability to reflect the intricacies of numerous areas, ranging from social to biological networks. GNNs can grapple with non-linear behaviors, emerging patterns, and complex connections; these are also typical characteristics of complex systems. Renormalization group (RG) theory has emerged as the preferred lens through which to study complex systems, offering a framework that can untangle their intricate dynamics. Despite the clear benefits of integrating RG theory with GNNs, no existing methods have ventured into this promising territory. This paper proposes a new approach that applies RG theory to devise a novel graph rewiring that improves GNNs' performance on graph-related tasks. We support our proposal with extensive experiments on standard benchmarks and baselines. The results demonstrate the effectiveness of our method and its potential to remedy the current limitations of GNNs. Finally, this paper marks the beginning of a new research direction that combines the theoretical foundations of RG, the magnifying glass of complex systems, with the structural capabilities of GNNs. By doing so, we aim to enhance the potential of GNNs in modeling and unraveling the complexities inherent in diverse systems.
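
As a rough illustration of RG-inspired rewiring, the sketch below coarse-grains a graph into communities (greedy modularity detection as a stand-in for a proper RG coarse-graining step) and densifies edges within each block so message passing can act at the coarser scale. This is a toy analogue for intuition, not the paper's method.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

def rg_inspired_rewiring(G, resolution=1.0):
    """Toy RG-flavored rewiring: group nodes into coarse-grained blocks,
    then connect all node pairs inside each block.
    """
    communities = greedy_modularity_communities(G, resolution=resolution)
    H = G.copy()
    for block in communities:
        block = list(block)
        # densify connections inside a coarse-grained block
        for i, u in enumerate(block):
            for v in block[i + 1:]:
                H.add_edge(u, v)
    return H

G = nx.karate_club_graph()
H = rg_inspired_rewiring(G)
print(G.number_of_edges(), "->", H.number_of_edges())
```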