 serie


Efficient Uncertainty Estimation for LLM-based Entity Linking in Tabular Data

Bono, Carlo, Belotti, Federico, Palmonari, Matteo

arXiv.org Machine Learning

Linking textual values in tabular data to their corresponding entities in a Knowledge Base is a core task across a variety of data integration and enrichment applications. Although Large Language Models (LLMs) have shown state-of-the-art performance in Entity Linking (EL) tasks, their deployment in real-world scenarios requires not only accurate predictions but also reliable uncertainty estimates. Such estimates usually require resource-demanding multi-shot inference, which seriously limits their practical applicability. As a more efficient alternative, we investigate a self-supervised approach for estimating uncertainty from single-shot LLM outputs using token-level features, reducing the need for multiple generations. Evaluation is performed on an EL task on tabular data across multiple LLMs, showing that the resulting uncertainty estimates are highly effective in detecting low-accuracy outputs. This is achieved at a fraction of the computational cost, ultimately supporting a cost-effective integration of uncertainty measures into LLM-based EL workflows. The method offers a practical way to incorporate uncertainty estimation into EL workflows with limited computational overhead.
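As a rough illustration of what "token-level features" from a single generation could look like, the sketch below summarises the per-token log-probabilities of one answer. The abstract does not list the features the authors use, so the function name `token_features` and the four features chosen here are assumptions, and the self-supervised model that would consume them is not reproduced.

```python
import math
from typing import Dict, List


def token_features(logprobs: List[float]) -> Dict[str, float]:
    """Summarise the per-token log-probabilities of ONE generated answer.

    Hypothetical feature set: the source abstract does not say which
    token-level features are actually used.
    """
    if not logprobs:
        raise ValueError("need at least one token log-probability")
    mean_lp = sum(logprobs) / len(logprobs)
    return {
        "mean_logprob": mean_lp,                 # average token confidence
        "min_logprob": min(logprobs),            # weakest single token
        "perplexity": math.exp(-mean_lp),        # higher means less confident
        "seq_prob": math.exp(sum(logprobs)),     # probability of the whole answer
    }


# Illustrative values only: a confident answer versus a hesitant one.
confident = token_features([-0.01, -0.02, -0.01])
hesitant = token_features([-0.9, -2.3, -1.4])
print(confident["perplexity"] < hesitant["perplexity"])
```

Features like these come from a single forward pass, which is why they are cheaper than sampling several generations and comparing them.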


A model for efficient dynamical ranking in networks

Della Vecchia, Andrea, Neocosmos, Kibidi, Larremore, Daniel B., Moore, Cristopher, De Bacco, Caterina

arXiv.org Artificial Intelligence

We present a physics-inspired method for inferring dynamic rankings in directed temporal networks - networks in which each directed and timestamped edge reflects the outcome and timing of a pairwise interaction. The inferred ranking of each node is real-valued and varies in time as each new edge, encoding an outcome like a win or loss, raises or lowers the node's estimated strength or prestige, as is often observed in real scenarios including sequences of games, tournaments, or interactions in animal hierarchies. Our method works by solving a linear system of equations and requires only one parameter to be tuned. As a result, the corresponding algorithm is scalable and efficient. We test our method by evaluating its ability to predict interactions (edges' existence) and their outcomes (edges' directions) in a variety of applications, including both synthetic and real data. Our analysis shows that in many cases our method's performance is better than existing methods for predicting dynamic rankings and interaction outcomes.


Bisecting for selecting: using a Laplacian eigenmaps clustering approach to create the new European football Super League

Bond, A. J., Beggs, C. B.

arXiv.org Machine Learning

We use European football performance data to select teams to form the proposed European football Super League, using only unsupervised techniques. We first used random forest regression to select important variables predicting goal difference, which we used to calculate the Euclidean distances between teams. Creating a Laplacian eigenmap, we bisected the Fiedler vector to identify the five major European football leagues' natural clusters. Our results showed how an unsupervised approach could successfully identify four clusters based on five basic performance metrics: shots, shots on target, shots conceded, possession, and pass success. The top two clusters identify those teams who dominate their respective leagues and are the best candidates to create the most competitive elite super league.

Keywords: OR in sports; Selection; Unsupervised; Spectral clustering; Laplacian Eigenmap; Machine Learning

1. Introduction

Operational research (OR) has a long history of using sport to explore operational insights and methodologies (see Wright, 2009 for a review).
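To make the idea of bisecting the Fiedler vector concrete, here is a minimal pure-Python sketch of a single bisection. It assumes a Gaussian similarity built from Euclidean distances and finds the Fiedler vector by power iteration; the function name `fiedler_bisect`, the `sigma` parameter, and the toy points are illustrative assumptions, not the authors' pipeline (which also involves random forest variable selection and yields four clusters).

```python
import math
from typing import List, Sequence


def fiedler_bisect(points: Sequence[Sequence[float]],
                   sigma: float = 1.0, iters: int = 500) -> List[int]:
    """Split points into two groups by the sign of the Fiedler vector."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    # Gaussian similarity matrix W from squared Euclidean distances.
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                W[i][j] = math.exp(-d2 / (2 * sigma ** 2))
    deg = [sum(row) for row in W]
    # Graph Laplacian L = D - W; its eigenvalues are at most 2 * max degree.
    c = 2 * max(deg)
    # Power iteration on M = c*I - L, kept orthogonal to the constant
    # vector, converges to the eigenvector of the second-smallest
    # eigenvalue of L (the Fiedler vector).
    v = [float(i) for i in range(n)]  # deterministic starting vector
    for _ in range(iters):
        mean = sum(v) / n
        v = [x - mean for x in v]
        Mv = [(c - deg[i]) * v[i] + sum(W[i][j] * v[j] for j in range(n))
              for i in range(n)]
        norm = math.sqrt(sum(x * x for x in Mv))
        v = [x / norm for x in Mv]
    mean = sum(v) / n
    return [1 if x - mean > 0 else 0 for x in v]


# Toy data: two well-separated groups of "teams" in a 2-D feature space.
teams = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
print(fiedler_bisect(teams))
```

Applying the same split again inside each half is one way to obtain more than two clusters, though whether that matches the paper's exact procedure is not stated in the abstract.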


Part #1: A statistical analysis on Serie A

#artificialintelligence

A concentrate of passion, hope and romanticism. Every year thousands and thousands of teams compete in their leagues with different purposes. Some of them are built to win the title. Others just want to not be relegated. But the answer to their hopes always relies on the same thing: numbers.


How Recommender systems works (Python code -- example film Recommender)

#artificialintelligence

The term "recommender systems" comes up very often nowadays, mainly because companies use them for many purposes: to increase sales (item suggestions while purchasing on Amazon: "users who bought this also bought this"), to give customers a better experience (film suggestions on Netflix), or to target advertising at the right people based on similar preferences. Recommender systems are basically systems that recommend things to people based on what everybody else did. Here is an example of film suggestion taken from an online course. I want to thank Frank Kane for this very useful course on Data Science and Machine Learning with Python. Here is the course's link in case you would like to go deeper into Data Science.
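The idea of recommending "based on what everybody else did" can be sketched with a tiny item-based recommender. This is not the course's code: the function names, the cosine similarity measure, and the toy ratings below are all assumptions chosen to keep the example self-contained.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

Ratings = Dict[str, Dict[str, float]]  # user -> {film: rating}


def item_similarity(ratings: Ratings, a: str, b: str) -> float:
    """Cosine similarity of films a and b over users who rated both."""
    common = [u for u, r in ratings.items() if a in r and b in r]
    if not common:
        return 0.0
    dot = sum(ratings[u][a] * ratings[u][b] for u in common)
    norm_a = math.sqrt(sum(ratings[u][a] ** 2 for u in common))
    norm_b = math.sqrt(sum(ratings[u][b] ** 2 for u in common))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(ratings: Ratings, user: str,
              top_n: int = 3) -> List[Tuple[str, float]]:
    """Score unseen films by similarity to the films the user rated."""
    seen = ratings[user]
    all_films = {film for r in ratings.values() for film in r}
    scores: Dict[str, float] = defaultdict(float)
    for candidate in all_films - set(seen):
        for film, rating in seen.items():
            # Films similar to highly rated ones get the largest boost.
            scores[candidate] += item_similarity(ratings, candidate, film) * rating
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


# Hypothetical toy ratings (1 to 5 stars), not real data.
toy = {
    "ann": {"Alien": 5, "Aliens": 5, "Amelie": 1},
    "bob": {"Alien": 5, "Aliens": 4, "Amelie": 2},
    "cat": {"Alien": 1, "Aliens": 1, "Amelie": 5},
    "dan": {"Alien": 5},
}
print(recommend(toy, "dan"))
```

On real data one would typically also require a minimum number of common raters before trusting a similarity, since a pair of films rated together by a single user always gets a perfect cosine score.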