Tonellotto, Nicola
Exploring the Effectiveness of Multi-stage Fine-tuning for Cross-encoder Re-rankers
Pezzuti, Francesca, MacAvaney, Sean, Tonellotto, Nicola
State-of-the-art cross-encoders can be fine-tuned to be highly effective in passage re-ranking. The typical fine-tuning process of cross-encoders as re-rankers requires large amounts of manually labelled data, a contrastive learning objective, and a set of heuristically sampled negatives. A recent alternative approach instead fine-tunes the model to mimic the rankings of a highly effective large language model using a distillation objective. These fine-tuning strategies can be applied either individually or in sequence. In this work, we systematically investigate the effectiveness of point-wise cross-encoders when fine-tuned independently in a single stage, or sequentially in two stages. Our experiments show that the effectiveness of point-wise cross-encoders fine-tuned using contrastive learning is indeed on par with that of models fine-tuned with multi-stage approaches. Code is available for reproduction at https://github.com/fpezzuti/multistage-finetuning.
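To make the distinction concrete, the following PyTorch sketch contrasts the two fine-tuning objectives mentioned in the abstract: an InfoNCE-style contrastive loss over one relevant passage and a set of sampled negatives, and a KL-divergence distillation loss towards a teacher ranking. The function names, the single-positive setup, and the use of raw cross-encoder logits are illustrative assumptions, not the paper's exact training recipe.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(pos_score, neg_scores):
    """InfoNCE-style contrastive objective: one relevant passage against
    heuristically sampled negatives; scores are raw cross-encoder logits."""
    logits = torch.cat([pos_score.view(1), neg_scores.view(-1)])
    target = torch.zeros(1, dtype=torch.long)  # the positive is at index 0
    return F.cross_entropy(logits.unsqueeze(0), target)

def distillation_loss(student_scores, teacher_scores):
    """Distillation objective: the cross-encoder mimics the ranking distribution
    induced by a stronger teacher (e.g. an LLM ranker).
    Both score tensors have shape (batch, n_candidates)."""
    return F.kl_div(
        F.log_softmax(student_scores, dim=-1),
        F.softmax(teacher_scores, dim=-1),
        reduction="batchmean",
    )
```

In a multi-stage setup the model is first trained with one objective and then further fine-tuned with the other; in a single-stage setup only one of the two is used.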
"Maybe you are looking for CroQS": Cross-modal Query Suggestion for Text-to-Image Retrieval
Pacini, Giacomo, Carrara, Fabio, Messina, Nicola, Tonellotto, Nicola, Amato, Giuseppe, Falchi, Fabrizio
Query suggestion, a technique widely adopted in information retrieval, enhances system interactivity and the browsing experience of document collections. In cross-modal retrieval, many works have focused on retrieving relevant items from natural language queries, while few have explored query suggestion solutions. In this work, we address query suggestion in cross-modal retrieval, introducing a novel task that focuses on suggesting the minimal textual modifications needed to explore visually consistent subsets of the collection, following the premise of "Maybe you are looking for". To facilitate the evaluation and development of methods, we present a tailored benchmark named CroQS. This dataset comprises initial queries, grouped result sets, and human-defined suggested queries for each group. We establish dedicated metrics to rigorously evaluate the performance of various methods on this task, measuring representativeness, cluster specificity, and similarity of the suggested queries to the original ones. Baseline methods from related fields, such as image captioning and content summarization, are adapted for this task to provide reference performance scores. Our experiments reveal that, although still relatively far from human performance, both LLM-based and captioning-based methods achieve competitive results on CroQS, improving the recall on cluster specificity by more than 115% and the representativeness mAP by more than 52% with respect to the initial query. The dataset, the implementation of the baseline methods, and the notebooks containing our experiments are available here: paciosoft.com/CroQS-benchmark/
A Tale of Trust and Accuracy: Base vs. Instruct LLMs in RAG Systems
Cuconasu, Florin, Trappolini, Giovanni, Tonellotto, Nicola, Silvestri, Fabrizio
Retrieval Augmented Generation (RAG) represents a significant advancement in artificial intelligence, combining a retrieval phase with a generative phase, with the latter typically being powered by large language models (LLMs). The current common practice in RAG is to use "instructed" LLMs, which are fine-tuned with supervised training to enhance their ability to follow instructions and are aligned with human preferences using state-of-the-art techniques. Contrary to popular belief, our study demonstrates that base models outperform their instructed counterparts in RAG tasks by 20% on average under our experimental settings. This finding challenges the prevailing assumptions about the superiority of instructed LLMs in RAG applications. Further investigations reveal a more nuanced situation, questioning fundamental aspects of RAG and suggesting the need for broader discussions on the topic; or, as Fromm would have it, "Seldom is a glance at the statistics enough to understand the meaning of the figures".
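As a concrete illustration of the practical difference being studied, the sketch below queries a base checkpoint with a plain completion-style RAG prompt and an instruct checkpoint through its chat template. The checkpoint names are placeholders for any base/instruct pair; the prompt wording and generation settings are illustrative, not the paper's experimental protocol.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoints: any base / instruct pair of the same family would do.
BASE, INSTRUCT = "meta-llama/Llama-2-7b-hf", "meta-llama/Llama-2-7b-chat-hf"

def rag_prompt(question, passages):
    context = "\n".join(f"Document [{i}]: {p}" for i, p in enumerate(passages, 1))
    return f"{context}\n\nQuestion: {question}\nAnswer:"

def answer(model_name, question, passages, instruct=False):
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    prompt = rag_prompt(question, passages)
    if instruct:
        # Instruct models are queried through their chat template;
        # base models receive the raw completion-style prompt.
        prompt = tok.apply_chat_template(
            [{"role": "user", "content": prompt}],
            tokenize=False, add_generation_prompt=True,
        )
    inputs = tok(prompt, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=64)
    return tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
```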
A Reproducibility Study of PLAID
MacAvaney, Sean, Tonellotto, Nicola
The PLAID (Performance-optimized Late Interaction Driver) algorithm for ColBERTv2 uses clustered term representations to retrieve and progressively prune documents for final (exact) document scoring. In this paper, we reproduce the original work and fill in gaps it left open. By studying the parameters PLAID introduces, we find that its Pareto frontier is formed by a careful balance among its three parameters; deviations beyond the suggested settings can substantially increase latency without necessarily improving effectiveness. We then compare PLAID with an important baseline missing from the original paper: re-ranking a lexical system. We find that applying ColBERTv2 as a re-ranker atop an initial pool of BM25 results provides better efficiency-effectiveness trade-offs in low-latency settings. However, re-ranking cannot reach peak effectiveness at higher latency settings, due to limitations in the recall of lexical matching, and it provides a poor approximation of an exhaustive ColBERTv2 search. We find that recently proposed modifications to re-ranking that pull in the neighbors of top-scoring documents overcome this limitation, providing a Pareto frontier across all operational points for ColBERTv2 when evaluated using a well-annotated dataset. Curious about why re-ranking methods are highly competitive with PLAID, we analyze the token representation clusters PLAID uses for retrieval and find that most clusters are predominantly aligned with a single token, and vice versa. Given the competitive trade-offs that re-ranking baselines exhibit, this work highlights the importance of carefully selecting pertinent baselines when evaluating the efficiency of retrieval engines.
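The re-ranking baseline central to the comparison can be sketched as follows: an initial BM25 pool is exactly re-scored with ColBERT-style late interaction (MaxSim). The tensor shapes, the assumption of pre-computed L2-normalised token embeddings, and the pool size are illustrative; the study itself builds on the existing ColBERTv2/PLAID implementations.

```python
import torch

def maxsim_score(q_emb, d_emb):
    """ColBERT-style late interaction: for each query token, take the maximum
    similarity over document tokens, then sum over query tokens.
    q_emb: (n_query_tokens, dim), d_emb: (n_doc_tokens, dim), L2-normalised."""
    return (q_emb @ d_emb.T).max(dim=1).values.sum().item()

def rerank(q_emb, candidates):
    """Exactly re-score an initial lexical pool (e.g. top-1000 BM25 results).
    `candidates` maps a document id to its pre-computed token embeddings."""
    scored = [(docno, maxsim_score(q_emb, d_emb)) for docno, d_emb in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```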
The Power of Noise: Redefining Retrieval for RAG Systems
Cuconasu, Florin, Trappolini, Giovanni, Siciliano, Federico, Filice, Simone, Campagnano, Cesare, Maarek, Yoelle, Tonellotto, Nicola, Silvestri, Fabrizio
Retrieval-Augmented Generation (RAG) systems represent a significant advancement over traditional Large Language Models (LLMs). RAG systems enhance their generation ability by incorporating external data retrieved through an Information Retrieval (IR) phase, overcoming the limitations of standard LLMs, which are restricted to their pre-trained knowledge and a limited context window. Most research in this area has predominantly concentrated on the generative aspect of LLMs within RAG systems. Our study fills this gap by thoroughly and critically analyzing the influence of IR components on RAG systems. This paper analyzes which characteristics a retriever should possess for effective prompt formulation in a RAG system, focusing on the type of documents that should be retrieved. We evaluate various elements, such as the relevance of the documents to the prompt, their position, and the number included in the context. Our findings reveal, among other insights, that including irrelevant documents can unexpectedly enhance performance by more than 30% in accuracy, contradicting our initial assumption of diminished quality. These results underscore the need for developing specialized strategies to integrate retrieval with language generation models, thereby laying the groundwork for future research in this field.
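The kind of manipulation studied here can be sketched as a prompt-construction step that controls how many non-relevant documents accompany the gold document and where the gold document sits in the context. The function names, the three position options, and the prompt wording are illustrative assumptions rather than the paper's exact setup.

```python
import random

def build_context(gold_docs, distractor_docs, n_distractors=2, gold_position="near"):
    """Compose the retrieved context handed to the generator, varying the number
    of distractor (irrelevant or merely related) documents and the position of
    the gold document relative to the question."""
    docs = list(random.sample(distractor_docs, n_distractors))
    if gold_position == "near":        # gold document closest to the question
        docs = docs + list(gold_docs)
    elif gold_position == "far":       # gold document farthest from the question
        docs = list(gold_docs) + docs
    else:                              # gold document in a random position
        docs = docs + list(gold_docs)
        random.shuffle(docs)
    return "\n".join(f"Document [{i}]: {d}" for i, d in enumerate(docs, 1))

def rag_prompt(question, context):
    return (f"{context}\n\nAnswer the question using the documents above.\n"
            f"Question: {question}\nAnswer:")
```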
RRAML: Reinforced Retrieval Augmented Machine Learning
Bacciu, Andrea, Cuconasu, Florin, Siciliano, Federico, Silvestri, Fabrizio, Tonellotto, Nicola, Trappolini, Giovanni
The emergence of large language models (LLMs) has revolutionized machine learning and related fields, showcasing remarkable abilities in comprehending, generating, and manipulating human language. However, their conventional usage through API-based text prompt submissions imposes certain limitations in terms of context constraints and the availability of external sources. To address these challenges, we propose a novel framework called Reinforced Retrieval Augmented Machine Learning (RRAML). RRAML integrates the reasoning capabilities of LLMs with supporting information retrieved by a purpose-built retriever from a vast user-provided database. By leveraging recent advancements in reinforcement learning, our method effectively addresses several critical challenges. Firstly, it circumvents the need to access LLM gradients. Secondly, it alleviates the burden of retraining LLMs for specific tasks, which is often impractical or impossible due to restricted access to the model and the computational intensity involved. Additionally, we seamlessly link the retriever's task with that of the reasoner, mitigating hallucinations and reducing the retrieval of irrelevant and potentially damaging documents. We believe that the research agenda outlined in this paper has the potential to profoundly impact the field of AI, democratizing access to and utilization of LLMs for a wide range of entities.
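One way to read the framework is as a policy-gradient loop in which only the retriever is updated, using the quality of the LLM's final answer as the reward signal. The sketch below uses a plain REINFORCE update with placeholder `retriever` and `llm_reward` callables; it is an interpretation under those assumptions, not the paper's published algorithm.

```python
import torch

def reinforce_step(retriever, optimizer, query, corpus, llm_reward, k=5):
    """One policy-gradient step: sample k documents from the retriever's
    distribution, score the resulting LLM answer, and update the retriever.
    No gradients from the (black-box, API-accessed) LLM are required."""
    scores = retriever(query, corpus)                     # (n_docs,) retrieval logits
    probs = torch.softmax(scores, dim=-1)
    idx = torch.multinomial(probs, k, replacement=False)  # sampled document indices
    reward = llm_reward(query, [corpus[i] for i in idx])  # scalar, e.g. answer F1
    log_prob = torch.log(probs[idx]).sum()
    loss = -reward * log_prob                             # REINFORCE objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return reward
```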
Integrating Item Relevance in Training Loss for Sequential Recommender Systems
Bacciu, Andrea, Siciliano, Federico, Tonellotto, Nicola, Silvestri, Fabrizio
Sequential Recommender Systems (SRSs) are a popular type of recommender system that learns from a user's history to predict the next item they are likely to interact with. However, user interactions can be affected by noise stemming from account sharing, inconsistent preferences, or accidental clicks. To address this issue, we (i) propose a new evaluation protocol that takes multiple future items into account and (ii) introduce a novel relevance-aware loss function that trains an SRS with multiple future items to make it more robust to noise. Our relevance-aware models obtain an improvement of ~1.2% in NDCG@10 and 0.88% under the traditional evaluation protocol, while under the new evaluation protocol the improvement is ~1.63% in NDCG@10 and ~1.5% in HR w.r.t. the best performing models.
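A minimal sketch of the idea behind a relevance-aware objective is given below: the usual next-item cross-entropy is extended to several future items, with weights that decay as the interaction lies further in the future. The decay schedule and tensor shapes are illustrative assumptions; the paper's exact weighting may differ.

```python
import torch
import torch.nn.functional as F

def relevance_aware_loss(logits, future_items, decay=0.8):
    """Cross-entropy over several future items instead of only the next one,
    weighted so that nearer future interactions contribute more.
    logits: (n_items,) scores over the catalogue; future_items: (m,) item ids."""
    weights = torch.tensor([decay ** t for t in range(len(future_items))])
    weights = weights / weights.sum()
    losses = torch.stack([
        F.cross_entropy(logits.unsqueeze(0), item.view(1))
        for item in future_items
    ])
    return (weights * losses).sum()
```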
A Federated Channel Modeling System using Generative Neural Networks
Bano, Saira, Cassarà, Pietro, Tonellotto, Nicola, Gotta, Alberto
The paper proposes a data-driven approach to air-to-ground channel estimation in a millimeter-wave wireless network on an unmanned aerial vehicle. Unlike traditional centralized learning methods, which are specific to certain geographical areas and inappropriate for others, we propose a generalized model that uses Federated Learning (FL) for channel estimation and can predict the air-to-ground path loss between a low-altitude platform and a terrestrial terminal. To this end, our proposed FL-based Generative Adversarial Network (FL-GAN) is designed to function as a generative data model that can learn different types of data distributions and generate realistic patterns from the same distributions without requiring prior data analysis before the training phase. To evaluate the effectiveness of the proposed model, we measure its performance using the Kullback-Leibler (KL) divergence and the Wasserstein distance between the synthetic data distribution generated by the model and the actual data distribution. We also compare the proposed technique with other generative models, such as FL-Variational Autoencoder (FL-VAE) and stand-alone VAE and GAN models. The results of the study show that the synthetic data generated by FL-GAN has the highest distributional similarity to the real data, demonstrating the effectiveness of the proposed approach in generating data-driven channel models that can be used in different regions.
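The evaluation step described above can be reproduced with standard tooling: the sketch below bins real and synthetic path-loss samples into histograms for the KL divergence and compares the raw samples with the one-dimensional Wasserstein distance. The number of bins and the smoothing constant are illustrative choices.

```python
import numpy as np
from scipy.stats import entropy, wasserstein_distance

def distribution_similarity(real, synthetic, n_bins=50):
    """Compare real and generated path-loss samples: KL divergence between
    binned empirical distributions, plus the 1-D Wasserstein distance."""
    lo = min(real.min(), synthetic.min())
    hi = max(real.max(), synthetic.max())
    bins = np.linspace(lo, hi, n_bins + 1)
    p, _ = np.histogram(real, bins=bins, density=True)
    q, _ = np.histogram(synthetic, bins=bins, density=True)
    eps = 1e-12                      # avoid zero bins in the KL computation
    kl = entropy(p + eps, q + eps)   # scipy normalises both inputs internally
    wd = wasserstein_distance(real, synthetic)
    return kl, wd
```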