Pappalardo, Luca
A linguistic analysis of undesirable outcomes in the era of generative AI
Gambetta, Daniele, Gezici, Gizem, Giannotti, Fosca, Pedreschi, Dino, Knott, Alistair, Pappalardo, Luca
Recent research has focused on the medium and long-term impacts of generative AI, which pose scientific and societal challenges mainly related to the detection and reliability of machine-generated information, which is projected to make up most of the content on the Web soon. Prior studies show that LLMs exhibit lower performance in generation tasks (model collapse) as they are fine-tuned across multiple generations on their own generated content (self-consuming loop). In this paper, we present a comprehensive simulation framework built upon the chat version of LLama2, focusing particularly on the linguistic aspects of the generated content, which have not been fully examined in existing studies. Our results show that the model produces less lexically rich content across generations, reducing diversity. Lexical richness is measured using entropy and the type-token ratio (TTR), as well as the frequency of part-of-speech (POS) tags. The generated content is also examined with an $n$-gram analysis, which takes word order into account, and with semantic networks, which consider the relations between different words. These findings suggest that model collapse occurs not only by decreasing content diversity but also by distorting the underlying linguistic patterns of the generated text. Both effects highlight the critical importance of carefully choosing and curating the initial input text, which can alleviate the model collapse problem. Furthermore, we conduct a qualitative analysis of the fine-tuned models of the pipeline to compare their performance on generic NLP tasks with that of the original model. We find that autophagy transforms the initial model into a more creative, doubtful and confused one, which might provide inaccurate answers and include conspiracy theories in its responses, spreading false and biased information on the Web.
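The two lexical richness measures named in the abstract above can be illustrated with a minimal sketch. It assumes naive lowercase whitespace tokenization and the standard textbook definitions of TTR and Shannon entropy; the function names are hypothetical and the paper's actual preprocessing is not specified here.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: naive lowercase whitespace tokenization, for illustration only.
    return text.lower().split()


def type_token_ratio(tokens):
    """Distinct tokens divided by total tokens (0.0 for empty input)."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def shannon_entropy(tokens):
    """Shannon entropy, in bits, of the empirical token distribution."""
    total = len(tokens)
    counts = Counter(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    sample = tokenize("the cat sat on the mat and the dog sat too")
    print(type_token_ratio(sample), shannon_entropy(sample))
```

Under these definitions, a text that repeats a few words scores lower on both measures than a text of the same length with a more varied vocabulary, which is the sense in which diversity can shrink across generations.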
Geospatial Road Cycling Race Results Data Set
Janssens, Bram, Pappalardo, Luca, De Bock, Jelle, Bogaert, Matthias, Verstockt, Steven
The field of cycling analytics has only recently started to develop due to limited access to open data sources. Accordingly, research and data sources are very divergent, with large differences in the information used across studies. To improve this and facilitate further research in the field, we propose the publication of a data set which links thousands of professional race results from the period 2017-2023 to detailed geographic information about the courses, an essential aspect in road cycling analytics. Initial use cases are proposed, showcasing the usefulness of linking these two data sources.
A survey on the impact of AI-based recommenders on human behaviours: methodologies, outcomes and future directions
Pappalardo, Luca, Ferragina, Emanuele, Citraro, Salvatore, Cornacchia, Giuliano, Nanni, Mirco, Rossetti, Giulio, Gezici, Gizem, Giannotti, Fosca, Lalli, Margherita, Gambetta, Daniele, Mauro, Giovanni, Morini, Virginia, Pansanella, Valentina, Pedreschi, Dino
Recommendation systems and assistants (from now on, recommenders) - algorithms suggesting items or providing solutions based on users' preferences or requests [99, 105, 141, 166] - influence most actions of our day-to-day life through online platforms. For example, recommendations on social media suggest new social connections, those on online retail platforms guide users' product choices, navigation services offer routes to desired destinations, and generative AI platforms produce content based on users' requests. Unlike other AI tools, such as medical diagnostic support systems, robotic vision systems, or autonomous driving, which assist in specific tasks or functions, recommenders are ubiquitous in online platforms, shaping our decisions and interactions instantly and profoundly. The influence recommenders exert on users' behaviour may generate long-lasting and often unintended effects on human-AI ecosystems [131], such as amplifying political radicalisation processes [82], increasing CO2 emissions [36] and amplifying inequality, bias and discrimination [120]. The interaction between humans and recommenders has been examined in various fields using different nomenclatures, research methods and datasets, often producing incongruent findings.
Popularity-based Alternative Routing
Cornacchia, Giuliano, Lemma, Ludovico, Pappalardo, Luca
Alternative routing is crucial to minimize the environmental impact of urban transportation while enhancing road network efficiency and reducing traffic congestion. Existing methods neglect information about road popularity, possibly leading to unintended consequences such as increasing emissions and congestion. This paper introduces Polaris, an alternative routing algorithm that exploits road popularity to optimize traffic distribution and reduce CO2 emissions. Polaris leverages the novel concept of K-road layers, which mitigates the feedback loop effect where redirecting vehicles to less popular roads could increase their popularity in the future. We conduct experiments in three cities to evaluate Polaris against state-of-the-art alternative routing algorithms. Our results demonstrate that Polaris significantly reduces the overuse of highly popular road edges and traversed regulated intersections, showcasing its ability to generate efficient routes and distribute traffic more evenly. Furthermore, Polaris achieves substantial CO2 reductions, outperforming existing alternative routing strategies. Finally, we compare Polaris to an algorithm that coordinates vehicles centrally to distribute them more evenly on the road network. Our findings reveal that Polaris performs comparably well, even with much less information, highlighting its potential as an efficient and sustainable solution for urban traffic management.
Social AI and the Challenges of the Human-AI Ecosystem
Pedreschi, Dino, Pappalardo, Luca, Baeza-Yates, Ricardo, Barabasi, Albert-Laszlo, Dignum, Frank, Dignum, Virginia, Eliassi-Rad, Tina, Giannotti, Fosca, Kertesz, Janos, Knott, Alistair, Ioannidis, Yannis, Lukowicz, Paul, Passarella, Andrea, Pentland, Alex Sandy, Shawe-Taylor, John, Vespignani, Alessandro
The rise of large-scale socio-technical systems in which humans interact with artificial intelligence (AI) systems (including assistants and recommenders, in short AIs) multiplies the opportunity for the emergence of collective phenomena and tipping points, with unexpected, possibly unintended, consequences. For example, navigation systems' suggestions may create chaos if too many drivers are directed onto the same route, and personalised recommendations on social media may amplify polarisation, filter bubbles, and radicalisation. On the other hand, we may learn how to foster the "wisdom of crowds" and collective action effects to face social and environmental challenges. In order to understand the impact of AI on socio-technical systems and design next-generation AIs that team with humans to help overcome societal problems rather than exacerbate them, we propose to build the foundations of Social AI at the intersection of Complex Systems, Network Science and AI. In this perspective paper, we discuss the main open questions in Social AI, outlining possible technical and scientific challenges and suggesting research avenues.
One-Shot Traffic Assignment with Forward-Looking Penalization
Cornacchia, Giuliano, Nanni, Mirco, Pappalardo, Luca
Traffic assignment (TA) is crucial in optimizing transportation systems and consists in efficiently assigning routes to a collection of trips. Existing TA algorithms often do not adequately consider real-time traffic conditions, resulting in inefficient route assignments. This paper introduces METIS, a cooperative, one-shot TA algorithm that combines alternative routing with edge penalization and informed route scoring. We conduct experiments in several cities to evaluate the performance of METIS against state-of-the-art one-shot methods. Compared to the best baseline, METIS significantly reduces CO2 emissions by 18% in Milan, 28% in Florence, and 46% in Rome, improving trip distribution considerably while still having low computational time. Our study proposes METIS as a promising solution for optimizing TA and urban transportation systems.
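As a rough illustration of how edge penalization can produce alternative routes, the sketch below repeatedly computes a shortest path and multiplies the weights of its edges by a penalty factor, so that later searches tend to avoid them. This is a generic iterative penalty scheme and not the METIS algorithm; the graph representation, function names, and the `penalty` parameter are assumptions made for this example.

```python
import heapq


def shortest_path(graph, source, target):
    """Dijkstra on a dict-of-dicts graph; returns a node list, or None if unreachable."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == target:
            break
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


def penalized_alternatives(graph, source, target, k=3, penalty=1.5):
    """Collect up to k distinct routes by penalizing edges of each route found."""
    # Work on a copy so the caller's graph is left unchanged.
    weights = {u: dict(nbrs) for u, nbrs in graph.items()}
    routes = []
    for _ in range(k):
        path = shortest_path(weights, source, target)
        if path is None:
            break
        if path not in routes:
            routes.append(path)
        # Hypothetical penalty rule: scale every edge on the route just found.
        for u, v in zip(path, path[1:]):
            weights[u][v] *= penalty
    return routes
```

For example, on a graph with two parallel routes of cost 2 and 3, a penalty of 2.0 makes the second search return the costlier route, since the first route's edges now sum to 4.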
Generating Synthetic Mobility Networks with Generative Adversarial Networks
Mauro, Giovanni, Luca, Massimiliano, Longa, Antonio, Lepri, Bruno, Pappalardo, Luca
The increasingly crucial role of human displacements in complex societal phenomena, such as traffic congestion, segregation, and the diffusion of epidemics, is attracting the interest of scientists from several disciplines. In this article, we address mobility network generation, i.e., generating a city's entire mobility network, a weighted directed graph in which nodes are geographic locations and weighted edges represent people's movements between those locations, thus describing the entire set of mobility flows within a city. Our solution is MoGAN, a model based on Generative Adversarial Networks (GANs) to generate realistic mobility networks. We conduct extensive experiments on public datasets of bike and taxi rides to show that MoGAN outperforms the classical Gravity and Radiation models regarding the realism of the generated networks. Our model can be used for data augmentation and for performing simulations and what-if analyses.
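For context on the Gravity baseline mentioned above, here is a minimal sketch of a singly constrained gravity model with a power-law distance decay: the flow from location i to location j is proportional to the attractiveness of j divided by the distance raised to an exponent, normalized so that each row sums to the total outflow of i. The exact variant and parameters used in the paper are not stated in the abstract; the function name, arguments, and the default `beta` are assumptions.

```python
def gravity_flows(outflows, masses, distances, beta=2.0):
    """Return a matrix of flows where row i distributes outflows[i] over destinations.

    outflows[i]     : total number of trips leaving location i
    masses[j]       : attractiveness of location j (e.g. population), an assumed proxy
    distances[i][j] : distance between locations i and j (diagonal is ignored)
    beta            : distance-decay exponent (hypothetical default)
    """
    n = len(masses)
    flows = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Attraction of each destination as seen from i; self-flows are excluded.
        attract = [
            0.0 if j == i else masses[j] / distances[i][j] ** beta
            for j in range(n)
        ]
        total = sum(attract)
        if total == 0:
            continue
        for j in range(n):
            flows[i][j] = outflows[i] * attract[j] / total
    return flows
```

The resulting matrix can be read as the weighted directed graph described in the abstract, with entry (i, j) giving the edge weight from location i to location j.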
How Routing Strategies Impact Urban Emissions
Cornacchia, Giuliano, Böhm, Matteo, Mauro, Giovanni, Nanni, Mirco, Pedreschi, Dino, Pappalardo, Luca
Navigation apps use routing algorithms to suggest the best path to reach a user's desired destination. Although undoubtedly useful, navigation apps' impact on the urban environment (e.g., carbon dioxide emissions and population exposure to pollution) is still largely unclear. In this work, we design a simulation framework to assess the impact of routing algorithms on carbon dioxide emissions within an urban environment. Using APIs from TomTom and OpenStreetMap, we find that settings in which either all vehicles or none of them follow a navigation app's suggestion lead to the worst impact in terms of CO2 emissions. In contrast, when just a portion (around half) of vehicles follow these suggestions, and some degree of randomness is added to the remaining vehicles' paths, we observe a reduction in the overall CO2 emissions over the road network. Our work is a first step towards designing next-generation routing principles that may increase urban well-being while satisfying individual needs.
Coach2vec: autoencoding the playing style of soccer coaches
Cintia, Paolo, Pappalardo, Luca
Capturing the playing style of professional soccer coaches is a complex, yet barely explored, task in sports analytics. Nowadays, the availability of digital data describing every relevant spatio-temporal aspect of soccer matches allows for capturing and analyzing the playing style of players, teams, and coaches in an automatic way. In this paper, we present coach2vec, a workflow to capture the playing style of professional coaches using match event streams and artificial intelligence. Coach2vec extracts ball possessions from each match, clusters them based on their similarity, and reconstructs the typical ball possessions of coaches. Then, it uses an autoencoder, a type of artificial neural network, to obtain a concise representation (encoding) of the playing style of each coach. Our experiments, conducted on soccer-logs describing the last four seasons of the Italian first division, reveal interesting similarities between prominent coaches, paving the way to the simulation of playing styles and the quantitative comparison of professional coaches.
Understanding peacefulness through the world news
Voukelatou, Vasiliki, Miliou, Ioanna, Giannotti, Fosca, Pappalardo, Luca
Peacefulness is a principal dimension of well-being for all humankind and is the way out of inequity and every form of violence. Thus, its measurement has lately drawn the attention of researchers and policy-makers. In recent years, novel digital data streams have drastically changed research in this field. In the current study, we exploit information extracted from the Global Database of Events, Language, and Tone (GDELT), a digital news database, to capture peacefulness through the Global Peace Index (GPI). Applying predictive machine learning models, we demonstrate that news media attention from GDELT can be used as a proxy for measuring GPI at a monthly level. Additionally, we use the SHAP methodology to identify the most important variables that drive the predictions. This analysis highlights each country's profile and provides explanations for the predictions overall, and particularly for the errors and the events that drive these errors. We believe that digital data exploited by Social Good researchers, policy-makers, and peace-builders, with data science tools as powerful as machine learning, could contribute to maximizing the societal benefits and minimizing the risks to peacefulness.