Are LLMs Empathetic to All? Investigating the Influence of Multi-Demographic Personas on a Model's Empathy
Malik, Ananya, Sabri, Nazanin, Karnaze, Melissa, Elsherief, Mai
Large Language Models' (LLMs) ability to converse naturally is empowered by their ability to empathetically understand and respond to their users. However, emotional experiences are shaped by demographic and cultural contexts. This raises an important question: Can LLMs demonstrate equitable empathy across diverse user groups? We propose a framework to investigate how LLMs' cognitive and affective empathy vary across user personas defined by intersecting demographic attributes. Our study introduces a novel intersectional analysis spanning 315 unique personas, constructed from combinations of age, culture, and gender, across four LLMs. Results show that attributes profoundly shape a model's empathetic responses. Interestingly, we see that adding multiple attributes at once can attenuate and reverse expected empathy patterns. We show that the models' responses broadly reflect real-world empathetic trends, with notable misalignments for certain groups, such as those from Confucian culture. We complement our quantitative findings with qualitative insights to uncover model behaviour patterns across different demographic groups. Our findings highlight the importance of designing empathy-aware LLMs that account for demographic diversity to promote more inclusive and equitable model behaviour.
ProfileXAI: User-Adaptive Explainable AI
Corrales, Gilber A., Sánchez, Carlos Andrés Ferro, Tabares-Soto, Reinel, Sotelo, Jesús Alfonso López, Ruz, Gonzalo A., Durán, Johan Sebastian Piña
ProfileXAI is a model- and domain-agnostic framework that couples post-hoc explainers (SHAP, LIME, Anchor) with retrieval-augmented LLMs to produce explanations for different types of users. The system indexes a multimodal knowledge base, selects an explainer per instance via quantitative criteria, and generates grounded narratives with chat-enabled prompting. On Heart Disease and Thyroid Cancer datasets, we evaluate fidelity, robustness, parsimony, token use, and perceived quality. No explainer dominates: LIME achieves the best fidelity-robustness trade-off (Infidelity $\le 0.30$, $L<0.7$ on Heart Disease); Anchor yields the sparsest, low-token rules; SHAP attains the highest satisfaction ($\bar{x}=4.1$). Profile conditioning stabilizes tokens ($\sigma\le 13\%$) and maintains positive ratings across profiles ($\bar{x}\ge 3.7$, with domain experts at $3.77$), enabling efficient and trustworthy explanations.
A short methodological review on social robot navigation benchmarking
Chhetri, Pranup, Torrejon, Alejandro, Eslava, Sergio, Manso, Luis J.
Social Robot Navigation is the skill that allows robots to move efficiently in human-populated environments while ensuring safety, comfort, and trust. Unlike other areas of research, the scientific community has not yet reached agreement on how Social Robot Navigation should be benchmarked. This is notably important, as the lack of a de facto standard to benchmark Social Robot Navigation can hinder the progress of the field and may lead to contradictory conclusions. Motivated by this gap, we contribute a short review focused exclusively on benchmarking trends in the period from January 2020 to July 2025. Of the 130 papers identified by our search using IEEE Xplore, we analysed the 85 papers that met the criteria of the review. This review addresses the metrics used in the literature for benchmarking purposes, the algorithms employed in such benchmarks, the use of human surveys for benchmarking, and how conclusions are drawn from the benchmarking results, when applicable.
LSPRAG: LSP-Guided RAG for Language-Agnostic Real-Time Unit Test Generation
Go, Gwihwan, Zhang, Quan, Zhou, Chijin, Wei, Zhao, Jiang, Yu
Automated unit test generation is essential for robust software development, yet existing approaches struggle to generalize across multiple programming languages and operate within real-time development. While Large Language Models (LLMs) offer a promising solution, their ability to generate high-coverage test code depends on being prompted with a concise context for the focal method. Current solutions, such as Retrieval-Augmented Generation, either rely on imprecise similarity-based searches or demand the creation of costly, language-specific static analysis pipelines. To address this gap, we present LSPRAG, a framework for concise-context retrieval tailored for real-time, language-agnostic unit test generation. LSPRAG leverages off-the-shelf Language Server Protocol (LSP) back-ends to supply LLMs with precise symbol definitions and references in real time. By reusing mature LSP servers, LSPRAG provides an LLM with language-aware context retrieval, requiring minimal per-language engineering effort. We evaluated LSPRAG on open-source projects spanning Java, Go, and Python. Compared to the best performance of baselines, LSPRAG increased line coverage by up to 174.55% for Go, 213.31% for Java, and 31.57% for Python.
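To make the idea of retrieving symbol definitions and references through an LSP server more concrete, here is a minimal sketch of how such requests are framed on the wire. It is not LSPRAG's implementation: the helper names (`lsp_request`, `definition_request`, `references_request`) and the example file URI are assumptions made for illustration. Only the `Content-Length` framing and the `textDocument/definition` and `textDocument/references` methods come from the LSP specification.

```python
import json


def lsp_request(request_id, method, params):
    """Frame a JSON-RPC 2.0 request with the LSP base-protocol header.

    Hypothetical helper for illustration; not taken from LSPRAG.
    """
    body = json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    ).encode("utf-8")
    # The header carries the byte length of the JSON body, then a blank line.
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def definition_request(request_id, uri, line, character):
    """Ask the server where the symbol at (line, character) is defined."""
    return lsp_request(
        request_id,
        "textDocument/definition",
        {
            "textDocument": {"uri": uri},
            # LSP positions are zero-based.
            "position": {"line": line, "character": character},
        },
    )


def references_request(request_id, uri, line, character):
    """Ask the server for all references to the symbol at (line, character)."""
    return lsp_request(
        request_id,
        "textDocument/references",
        {
            "textDocument": {"uri": uri},
            "position": {"line": line, "character": character},
            "context": {"includeDeclaration": False},
        },
    )


if __name__ == "__main__":
    # Assumed example file and cursor position.
    message = definition_request(1, "file:///tmp/example.py", 10, 4)
    print(message.decode("utf-8"))
```

Because the same two requests work against any conforming language server, a tool built this way would only need a per-language server binary rather than a custom static analysis pipeline, which is the trade-off the abstract describes.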
Comparative Analysis of Object Detection Algorithms for Surface Defect Detection
This article compares the performance of six prominent object detection algorithms (YOLOv11, RetinaNet, Fast R-CNN, YOLOv8, RT-DETR, and DETR) on the NEU-DET surface defect detection dataset, which comprises images of various metal surface defects, a crucial application in industrial quality control. Each model's performance was assessed regarding detection accuracy, speed, and robustness across different defect types such as scratches, inclusions, and rolled-in scales. YOLOv11, a state-of-the-art real-time object detection algorithm, demonstrated superior performance compared to the other methods, achieving a remarkable 70% higher accuracy on average. This improvement can be attributed to YOLOv11's enhanced feature extraction capabilities and ability to process the entire image in a single forward pass, making it faster and more efficient in detecting smaller surface defects. Additionally, YOLOv11's architecture optimizations, such as improved anchor box generation and deeper convolutional layers, contributed to more precise localization of defects.
Cultivating Pluralism In Algorithmic Monoculture: The Community Alignment Dataset
Zhang, Lily Hong, Milli, Smitha, Jusko, Karen, Smith, Jonathan, Amos, Brandon, Bouaziz, Wassim, Revel, Manon, Kussman, Jack, Sheynin, Yasha, Titus, Lisa, Radharapu, Bhaktipriya, Yu, Jane, Sarma, Vidya, Rose, Kris, Nickel, Maximilian
How can large language models (LLMs) serve users with varying preferences that may conflict across cultural, political, or other dimensions? To make progress on this challenge, this paper establishes four key results. First, we demonstrate, through a large-scale multilingual human study with representative samples from five countries (N=15,000), that humans exhibit significantly more variation in preferences than the responses of 21 state-of-the-art LLMs. Second, we show that existing methods for preference dataset collection are insufficient for learning the diversity of human preferences even along two of the most salient dimensions of variability in global values, due to the underlying homogeneity of candidate responses. Third, we argue that this motivates the need for negatively-correlated sampling when generating candidate sets, and we show that simple prompt-based techniques for doing so significantly enhance the performance of alignment methods in learning heterogeneous preferences. Fourth, based on this novel candidate sampling approach, we collect and open-source Community Alignment, the largest and most representative multilingual and multi-turn preference dataset to date, featuring almost 200,000 comparisons from annotators spanning five countries. We hope that the Community Alignment dataset will be a valuable resource for improving the effectiveness of LLMs for a diverse global population.
ChatGPT shares data on how many users exhibit psychosis or suicidal thoughts
OpenAI has released new estimates of the number of ChatGPT users who exhibit possible signs of mental health emergencies, including mania, psychosis or suicidal thoughts. The company said that around 0.07% of ChatGPT users active in a given week exhibited such signs, adding that its artificial intelligence (AI) chatbot recognizes and responds to these sensitive conversations. While OpenAI maintains these cases are extremely rare, critics said even a small percentage may amount to hundreds of thousands of people, as ChatGPT recently reached 800 million weekly active users, per boss Sam Altman. As scrutiny mounts, the company said it built a network of experts around the world to advise it. Those experts include more than 170 psychiatrists, psychologists, and primary care physicians who have practiced in 60 countries, the company said. They have devised a series of responses in ChatGPT to encourage users to seek help in the real world, according to OpenAI.
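The critics' "hundreds of thousands" figure follows directly from the two numbers quoted above. The short sketch below works through that arithmetic; the function name `estimate_affected` is an assumption made for illustration, and the result is only as precise as the rounded 0.07% and 800 million figures it starts from.

```python
def estimate_affected(weekly_users: int, basis_points: int) -> int:
    """Return the number of users for a share given in basis points.

    One basis point is a hundredth of a percent, so 0.07% is 7 basis points.
    Integer arithmetic avoids floating-point rounding.
    """
    return weekly_users * basis_points // 10_000


weekly_users = 800_000_000  # weekly active users, as quoted in the article
affected = estimate_affected(weekly_users, 7)  # 0.07% of weekly users
print(f"{affected:,}")  # prints 560,000
```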
Hamas hands over remains of captive as Israeli drone strike kills two
Hamas has handed over the remains of another dead captive to Israel, hours after an Israeli drone attack in southern Gaza killed two Palestinians amid a fragile ceasefire. The Israeli military said on Monday that the Red Cross had taken custody of the coffin and was in the process of transporting it to the army's troops in Gaza. The remains of 16 had been handed over as of Monday.
UN slams Israel after attack on peacekeepers in Lebanon
The United Nations and France have condemned an Israeli attack that hit UN peacekeeping troops in southern Lebanon. UN spokesperson Stephane Dujarric said on Monday that the previous day's attack on UNIFIL troops, which he said involved an Israeli drone dropping a grenade in the vicinity of a patrol, as well as a tank opening fire on peacekeepers near the border town of Kfar Kila, was "very, very dangerous". Israel has violated the truce on a near-daily basis.
Dolphins may be getting an Alzheimer's-like disease due to this neurotoxin
The neurotoxins, found in algal blooms, primarily affect the body's nervous system. For marine biologists, dolphins are often viewed as sentinel species, or animals that shed light on the health of the ocean. Along with whales, porpoises, and other cetacean species, dolphins are one way that researchers know to sound the alarm about environmental hazards that might affect the ocean as a whole and potentially humans. In this context, researchers have connected neurotoxins from algal blooms to brain changes associated with an Alzheimer's-like disease in dolphins in Florida.