#AIES2025 social media round-up

AIHub

This week saw researchers gather in Madrid at the eighth AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society (AIES). As well as keynote talks, panels and poster sessions, the organisers experimented with a slightly different format for the contributed talks: all speakers in a session gave their talks, then took part in a joint discussion on common themes, before the floor was opened to questions from the audience. We cast an eye over social media platforms to find out what participants got up to at the event.


Stochastic Streets: A Walk Through Random LLM Address Generation in four European Cities

Fu, Tairan, Campo-Nazareno, David, Coronado-Blázquez, Javier, Conde, Javier, Reviriego, Pedro, Lombardi, Fabrizio

arXiv.org Artificial Intelligence

Northeastern University, Boston, USA. Abstract: Large Language Models (LLMs) are capable of solving complex math problems or answering difficult questions on almost any topic, but can they generate random street addresses for European cities? LLMs have shown impressive performance across a wide range of tasks, such as answering questions on virtually any topic. However, there remain areas in which their performance falls short, for example, seemingly simple tasks like counting the letters in a word. In this column, we explore another such challenge: generating random street addresses for four major European cities. Our results reveal that LLMs exhibit strong biases, repeatedly selecting a limited set of streets and, for some models, even specific street numbers. Surprisingly, some of the more prominent and iconic streets are not selected by the models, and the most frequent numbers in the responses lack any clear significance.
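The kind of repetition bias the abstract describes can be quantified very simply by counting how often each street appears across many sampled responses. The sketch below is a hypothetical toy metric, not the paper's methodology; the street names and samples are illustrative only.

```python
from collections import Counter

def street_bias(responses):
    """Measure repetition bias: the single most frequent street in the
    sampled responses and the fraction of all responses it accounts for.
    A truly random generator would spread mass over many streets."""
    counts = Counter(responses)
    top_street, top_count = counts.most_common(1)[0]
    return top_street, top_count / len(responses)

# Toy sample of model outputs for Madrid (illustrative, not real data).
samples = ["Gran Via", "Gran Via", "Gran Via", "Calle Mayor", "Gran Via"]
street, share = street_bias(samples)
```

Here a single street covers 80% of the samples, the kind of concentration the authors report for some models.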


Multimodal Proposal for an AI-Based Tool to Increase Cross-Assessment of Messages

Castro, Alejandro Álvarez, Ordieres-Meré, Joaquín

arXiv.org Artificial Intelligence

Earnings calls represent a uniquely rich and semi-structured source of financial communication, blending scripted managerial commentary with unscripted analyst dialogue. Although recent advances in financial sentiment analysis have integrated multi-modal signals, such as textual content and vocal tone, most systems rely on flat document-level or sentence-level models, failing to capture the layered discourse structure of these interactions. This paper introduces a novel multi-modal framework designed to generate semantically rich and structurally aware embeddings of earnings calls, by encoding them as hierarchical discourse trees. Each node, comprising either a monologue or a question-answer pair, is enriched with emotional signals derived from text, audio, and video, as well as structured metadata including coherence scores, topic labels, and answer coverage assessments. A two-stage transformer architecture is proposed: the first encodes multi-modal content and discourse metadata at the node level using contrastive learning, while the second synthesizes a global embedding for the entire conference. Experimental results reveal that the resulting embeddings form stable, semantically meaningful representations that reflect affective tone, structural logic, and thematic alignment. Beyond financial reporting, the proposed system generalizes to other high-stakes unscripted communicative domains such as tele-medicine, education, and political discourse, offering a robust and explainable approach to multi-modal discourse representation. This approach offers practical utility for downstream tasks such as financial forecasting and discourse evaluation, while also providing a generalizable method applicable to other domains involving high-stakes communication.
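The hierarchical discourse tree described above, with nodes carrying per-modality emotion signals and structural metadata, can be sketched as a simple recursive data structure. The field names below are illustrative assumptions, not the paper's actual schema.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class DiscourseNode:
    """One node of a hierarchical discourse tree: a monologue or a
    question-answer pair, enriched with per-modality emotion scores
    and structured metadata (field names are hypothetical)."""
    kind: str                      # "monologue" or "qa_pair"
    text_emotion: float            # emotion signal from text
    audio_emotion: float           # emotion signal from audio
    video_emotion: float           # emotion signal from video
    coherence: float               # structured metadata: coherence score
    topic: str                     # structured metadata: topic label
    children: List["DiscourseNode"] = field(default_factory=list)

def node_count(node):
    """Size of the tree rooted at this node."""
    return 1 + sum(node_count(c) for c in node.children)
```

A call-level encoder would then aggregate these node embeddings into one global representation, as in the paper's second transformer stage.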


Reducing Street Parking Search Time via Smart Assignment Strategies

Hemmatpour, Behafarid, Dogani, Javad, Laoutaris, Nikolaos

arXiv.org Artificial Intelligence

In dense metropolitan areas, searching for street parking adds to traffic congestion. As with many other such problems, real-time assistants based on mobile phones have been proposed, but their effectiveness is understudied. This work quantifies how varying levels of user coordination and information availability through such apps impact search time and the probability of finding street parking. Through a data-driven simulation of Madrid's street parking ecosystem, we analyze four distinct strategies: uncoordinated search (Unc-Agn), coordinated parking without awareness of non-users (Cord-Agn), an idealized oracle system that knows the positions of all non-users (Cord-Oracle), and our novel, practical Cord-Approx strategy that estimates non-users' behavior probabilistically. Instead of requiring knowledge of how close non-users are to a given spot in order to decide whether to navigate toward it, Cord-Approx uses past occupancy distributions to elongate the physical distances between system users and alternative parking spots, and then solves a Hungarian matching problem to dispatch accordingly. In high-fidelity simulations of Madrid's parking network with real traffic data, users of Cord-Approx averaged 6.69 minutes to find parking, compared to 19.98 minutes for non-users without an app. A zone-level snapshot shows that Cord-Approx reduces search time for system users by 72% (range = 67-76%) in central hubs, and up to 73% in residential areas, relative to non-users.
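The dispatch step described above is a minimum-cost one-to-one assignment of drivers to spots over the elongated distance matrix. A minimal sketch, assuming a tiny toy cost matrix, can solve it by brute force over permutations (a stand-in for the Hungarian algorithm, which scales to realistic instances):

```python
from itertools import permutations

def assign_drivers(cost):
    """Brute-force minimum-cost one-to-one assignment of drivers to
    spots. cost[i][j] is the elongated distance from driver i to spot j.
    Only feasible for tiny instances; the Hungarian algorithm does the
    same job in polynomial time."""
    n = len(cost)
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_cost:
            best_perm, best_cost = perm, total
    return best_perm, best_cost

# Two drivers, two spots (toy numbers): the optimal pairing avoids
# the expensive 9-unit trip even though driver 1's spot costs rise.
cost = [[4, 9],
        [3, 1]]
```

Running `assign_drivers(cost)` yields the pairing driver 0 → spot 0, driver 1 → spot 1, with total cost 5.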


LengClaro2023: A Dataset of Administrative Texts in Spanish with Plain Language adaptations

Agüera-Marco, Belén, Gonzalez-Dios, Itziar

arXiv.org Artificial Intelligence

In this work, we present LengClaro2023, a dataset of legal-administrative texts in Spanish. Based on the most frequently used procedures from the Spanish Social Security website, we have created for each text two simplified equivalents. The first version follows the recommendations provided by arText claro. The second version incorporates additional recommendations from plain language guidelines to explore further potential improvements in the system. The linguistic resource created in this work can be used for evaluating automatic text simplification (ATS) systems in Spanish.


Real-time Spatial Retrieval Augmented Generation for Urban Environments

Campo, David Nazareno, Conde, Javier, Alonso, Álvaro, Huecas, Gabriel, Salvachúa, Joaquín, Reviriego, Pedro

arXiv.org Artificial Intelligence

The proliferation of Generative Artificial Intelligence (AI), especially Large Language Models, presents transformative opportunities for urban applications through Urban Foundation Models. However, base models face limitations, as they only contain the knowledge available at the time of training, and updating them is both time-consuming and costly. Retrieval Augmented Generation (RAG) has emerged in the literature as the preferred approach for injecting contextual information into Foundation Models. It prevails over techniques such as fine-tuning, which are less effective in dynamic, real-time scenarios like those found in urban environments. However, traditional RAG architectures, based on semantic databases, knowledge graphs, structured data, or AI-powered web searches, do not fully meet the demands of urban contexts. Urban environments are complex systems characterized by large volumes of interconnected data, frequent updates, real-time processing requirements, security needs, and strong links to the physical world. This work proposes a real-time spatial RAG architecture that defines the necessary components for the effective integration of generative AI into cities, leveraging temporal and spatial filtering capabilities through linked data. The proposed architecture is implemented using FIWARE, an ecosystem of software components to develop smart city solutions and digital twins. The design and implementation are demonstrated through the use case of a tourism assistant in the city of Madrid. The use case serves to validate the correct integration of Foundation Models through the proposed RAG architecture.
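The core idea of spatio-temporal filtering in a RAG pipeline is to narrow the candidate context to items that are both near the user and recently updated, before any semantic retrieval or generation happens. A minimal sketch, assuming toy context records with `lat`, `lon`, and `updated` fields (illustrative names, not the FIWARE data model):

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    r = 6371.0  # Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def spatial_temporal_filter(items, query_lat, query_lon, now,
                            radius_km=1.0, max_age_s=3600):
    """Keep only context items close to the user and recently updated,
    before they are handed to the retriever/generator."""
    return [it for it in items
            if haversine_km(it["lat"], it["lon"], query_lat, query_lon) <= radius_km
            and now - it["updated"] <= max_age_s]

# Toy items around Madrid's Puerta del Sol (coordinates approximate).
items = [
    {"name": "near_fresh", "lat": 40.4168, "lon": -3.7038, "updated": 950},
    {"name": "far_fresh",  "lat": 40.5000, "lon": -3.7038, "updated": 950},
    {"name": "near_stale", "lat": 40.4168, "lon": -3.7038, "updated": 0},
]
```

With a query at Puerta del Sol at time 1000 and a 500-second freshness window, only `near_fresh` survives the filter: `far_fresh` is roughly 9 km away and `near_stale` is too old.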


Speed and Conversational Large Language Models: Not All Is About Tokens per Second

Conde, Javier, González, Miguel, Reviriego, Pedro, Gao, Zhen, Liu, Shanshan, Lombardi, Fabrizio

arXiv.org Artificial Intelligence

Unfortunately, these models are closed and can only be accessed through the user interfaces, tools or application programming interfaces provided by the companies that developed the models. Their parameters and implementation details are not publicly available and even if they were, their huge size would make their execution on commodity computing devices unfeasible. A different approach has been taken by some large companies such as Meta, i.e. the code as well as the parameters or weights of LLMs such as LLaMa


Can ChatGPT Learn to Count Letters?

Conde, Javier, Martínez, Gonzalo, Reviriego, Pedro, Gao, Zhen, Liu, Shanshan, Lombardi, Fabrizio

arXiv.org Artificial Intelligence

In this paper we explore if ChatGPT can learn to count letters. Since the introduction of ChatGPT two years ago, Large Language Model (LLM) based tools have shown impressive capabilities to solve mathematical problems or to answer questions on almost any topic [1]. In fact, evaluation benchmarks have to be revised frequently to make them harder as LLM performance improves continuously [2]. The development of LLMs has also been hectic with new models presented by large companies such as Google with Gemini or Gemma, Meta with Llama or x.AI with Grok. OpenAI has also released newer versions and improvements of their Generative Pre-trained Transformer (GPT) family such as GPT4 [3] and its variants GPT4o and GPT4o1. Those foundational models are then adapted to answer questions or interact with users and complemented with other functionalities to implement conversational tools like ChatGPT. Despite these astonishing results, there are some simple tasks that LLMs struggle with, for example arithmetic operations [4] or even counting the occurrences of a given letter in a word. For example, many LLMs failed to count the number of "r" in strawberry.
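The contrast the abstract draws is striking because, in code, letter counting is a one-liner. The sketch below illustrates the task itself, not anything about why token-based LLMs find it hard:

```python
def count_letter(word, letter):
    """Count occurrences of a letter in a word, case-insensitively.
    Trivially easy programmatically, yet a known stumbling block for
    LLMs, which see subword tokens rather than characters."""
    return word.lower().count(letter.lower())

# The canonical example: "strawberry" contains three occurrences of "r".
result = count_letter("strawberry", "r")
```

Here `result` is 3, the answer many models famously got wrong.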


Understanding the Impact of Artificial Intelligence in Academic Writing: Metadata to the Rescue

Conde, Javier, Reviriego, Pedro, Salvachúa, Joaquín, Martínez, Gonzalo, Hernández, José Alberto, Lombardi, Fabrizio

arXiv.org Artificial Intelligence

This enables the identification of the text for which AI assistance has been used. How AI was used: AI tools can be used for many different tasks: summarizing, translation, paraphrasing, finding related work and citations, etc. So, it is important to have information on how AI tools were used in the paper. For example, we can encode in the metadata that GPT-4 (the "which") was used to summarize (the "how") and write the abstract (the "where"). Let us now consider that we have a large corpus of papers and we want to know how many of them have used AI to summarize the abstract. Without metadata, all papers look the same (Figure 2, left), so we have to extract the text and either try to detect the use of AI in the abstract or find a disclosure by the authors that states the use of AI in the abstract. Instead, if the proposed metadata has been added to the paper, we can just look at the how (summarizing) and where (abstract) to find the papers. The papers are now marked and can be easily identified (Figure 2, right). The metadata can be used to analyze many aspects of the use of AI in academic writing; for example, we can analyze: 1) the adoption of the different AI tools and their variations over time.
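The which/how/where scheme described above lends itself to a simple structured query over a paper corpus. The sketch below uses hypothetical field names and records, not any standardized metadata format:

```python
# Hypothetical metadata records following the which/how/where scheme
# (field names and paper IDs are illustrative assumptions).
papers = [
    {"id": "p1", "ai_use": [{"which": "GPT-4", "how": "summarizing", "where": "abstract"}]},
    {"id": "p2", "ai_use": [{"which": "Llama", "how": "translation", "where": "introduction"}]},
    {"id": "p3", "ai_use": []},  # no declared AI assistance
]

def papers_with_ai(papers, how, where):
    """Select papers whose metadata declares a given AI use ("how")
    in a given part of the paper ("where")."""
    return [p["id"] for p in papers
            if any(u["how"] == how and u["where"] == where for u in p["ai_use"])]
```

Asking for papers that used AI to summarize the abstract, `papers_with_ai(papers, "summarizing", "abstract")`, returns only `["p1"]`, with no text extraction or AI detection needed.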


An MDP Model for Censoring in Harvesting Sensors: Optimal and Approximated Solutions

Fernandez-Bes, Jesus, Cid-Sueiro, Jesus, Marques, Antonio G.

arXiv.org Artificial Intelligence

In this paper, we propose a novel censoring policy for energy-efficient transmissions in energy-harvesting sensors. The problem is formulated as an infinite-horizon Markov Decision Process (MDP). The objective to be optimized is the expected sum of the importance (utility) of all transmitted messages. Assuming that such importance can be evaluated at the transmitting node, we show that, under certain conditions on the battery model, the optimal censoring policy is a threshold function on the importance value. Specifically, messages are transmitted only if their importance is above a threshold whose value depends on the battery level. Exploiting this property, we propose a model-based stochastic scheme that approximates the optimal solution, with less computational complexity and faster convergence speed than a conventional Q-learning algorithm. Numerical experiments in single-hop and multi-hop networks confirm the analytical advantages of the proposed scheme.
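The threshold structure the abstract proves, transmit only if a message's importance exceeds a battery-dependent cutoff, is simple to state in code. The threshold values below are toy assumptions for illustration, not values from the paper:

```python
def censoring_decision(importance, battery, thresholds):
    """Threshold censoring policy: transmit a message only if its
    importance exceeds the threshold for the current battery level.
    thresholds[b] is the cutoff at battery level b (toy values)."""
    return importance > thresholds[battery]

# Hypothetical thresholds: a fuller battery (higher level) tolerates
# less important messages, so its cutoff is lower.
thresholds = {0: 0.9, 1: 0.6, 2: 0.3}
```

With these numbers, a message of importance 0.5 is transmitted at battery level 2 but censored at level 0, exactly the battery-dependent behavior the optimal policy exhibits.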