Enterprise Architecture as a Dynamic Capability for Scalable and Sustainable Generative AI adoption: Bridging Innovation and Governance in Large Organisations
Generative Artificial Intelligence is a powerful new technology with the potential to boost innovation and reshape governance in many industries. Nevertheless, organisations face major challenges in scaling GenAI, including technology complexity, governance gaps and resource misalignments. This study explores how Enterprise Architecture Management can meet the complex requirements of GenAI adoption within large enterprises. Based on a systematic literature review and the qualitative analysis of 16 semi-structured interviews with experts, it examines the relationships between EAM, dynamic capabilities and GenAI adoption. The review identified key limitations in existing EA frameworks, particularly their inability to fully address the unique requirements of GenAI. The interviews, analysed using the Gioia methodology, revealed critical enablers and barriers to GenAI adoption across industries. The findings indicate that EAM, when theorised as sensing, seizing and transforming dynamic capabilities, can enhance GenAI adoption by improving strategic alignment, governance frameworks and organisational agility. However, the study also highlights the need to tailor EA frameworks to GenAI-specific challenges, including low data governance maturity and the balance between innovation and compliance. Several conceptual frameworks are proposed to guide EA leaders in aligning GenAI maturity with organisational readiness. The work contributes to academic understanding and industry practice by clarifying the role of EA in bridging innovation and governance in disruptive technology environments.
An empathic GPT-based chatbot to talk about mental disorders with Spanish teenagers
Mármol-Romero, Alba María, García-Vega, Manuel, García-Cumbreras, Miguel Ángel, Montejo-Ráez, Arturo
This paper presents a chatbot-based system that uses a self-disclosure technique to raise awareness of certain mental disorders among young Spanish people. The study was carried out in a population of teenagers aged between 12 and 18 years. The dialogue engine mixes closed and open conversations: controlled messages are sent to focus the chat on a specific disorder, which changes over time. Once a set of trial questions is answered, the system initiates the conversation on the disorder in focus according to the user's sensitivity to that disorder, in an attempt to establish more empathetic communication. Then, an open conversation based on the GPT-3 language model is initiated, allowing the user to express themselves with more freedom. The results show that these systems are of interest to young people and could help them become aware of certain mental disorders.
2025 AI Index Report
AI performance on demanding benchmarks continues to improve. Performance of advanced AI systems on new benchmarks introduced in 2023 has increased sharply. AI systems also made major strides in generating high-quality video. AI is increasingly embedded in everyday life. In 2023, the FDA (in the US) approved 223 AI-enabled medical devices, up from just six in 2015.
Facilitating Video Story Interaction with Multi-Agent Collaborative System
Zhang, Yiwen, Hao, Jianing, Wang, Zhan, Sheng, Hongling, Zeng, Wei
Video story interaction enables viewers to engage with and explore narrative content for personalized experiences. However, existing methods are limited to user selection and specially designed narratives, and they lack customization. To address this, we propose an interactive system based on user intent. Our system uses a Vision Language Model (VLM) to enable machines to understand video stories, combining Retrieval-Augmented Generation (RAG) and a Multi-Agent System (MAS) to create evolving characters and scene experiences. It includes three stages: 1) Video story processing, utilizing VLM and prior knowledge to simulate human understanding of stories across three modalities. 2) Multi-space chat, creating growth-oriented characters through MAS interactions based on user queries and story stages. 3) Scene customization, expanding and visualizing various story scenes mentioned in dialogue. Applied to the Harry Potter series, our study shows the system effectively portrays emergent character social behavior and growth, enhancing the interactive experience in the video story world.
They Fell in Love Playing 'Minecraft.' Then the Game Became Their Wedding Venue
On a crisp Saturday in March, beneath a canopy of pixelated cherry blossoms, two avatars stood in front of a digital altar crafted from shimmering quartz blocks and flickering redstone torches. They were surrounded by a sprawling Minecraft village, complete with custom-coded NPCs reciting lore about the couple's decade-long digital courtship. Nearby, pixelated foxes darted between guests--each one logged in from across the world, dressed in custom skins as forest druids and rogue mages. After the vows (typed and read aloud on Discord), guests dispersed for side quests, scavenger hunts, and an enchanted maze culminating in a virtual fireworks show. This wasn't a rehearsal for an in-person wedding--this was the wedding.
How Russia and Ukraine Are Playing Trump's Blame Game
On May 9th, Vladimir Putin will oversee a parade in Moscow's Red Square, commemorating the Soviet Union's victory in the Second World War, an annual display of military bravado that, since Russia's full-scale invasion of Ukraine, in 2022, has taken on more explicit political undertones. The country's triumph over Nazism is presented as proof of its righteousness in the current war--and of its role as a global power. Last year, as intercontinental ballistic missiles capable of carrying nuclear warheads rolled across the square, Putin linked the "radiant memory" of those who gave up their lives in the Second World War with "our brothers-in-arms who have fallen in the struggle against neo-Nazism and in the righteous fight for Russia"--that is, Russian soldiers killed in the current war in Ukraine. This year, the celebrations in Moscow serve another purpose: a way for Putin to show that he is not geopolitically isolated--China's Xi Jinping and Brazil's Luiz Inácio Lula da Silva are expected to attend.
Ψ-Arena: Interactive Assessment and Optimization of LLM-based Psychological Counselors with Tripartite Feedback
Zhu, Shijing, Chen, Zhuang, Bi, Guanqun, Li, Binghang, Deng, Yaxi, Wan, Dazhen, Peng, Libiao, Xiao, Xiyao, Zhang, Rongsheng, Lv, Tangjie, Hu, Zhipeng, Li, FangFang, Huang, Minlie
Large language models (LLMs) have shown promise in providing scalable mental health support, while evaluating their counseling capability remains crucial to ensure both efficacy and safety. Existing evaluations are limited by the static assessment that focuses on knowledge tests, the single perspective that centers on user experience, and the open-loop framework that lacks actionable feedback. To address these issues, we propose Ψ-Arena, an interactive framework for comprehensive assessment and optimization of LLM-based counselors, featuring three key characteristics: (1) Realistic arena interactions that simulate real-world counseling through multi-stage dialogues with psychologically profiled NPC clients, (2) Tripartite evaluation that integrates assessments from the client, counselor, and supervisor perspectives, and (3) Closed-loop optimization that iteratively improves LLM counselors using diagnostic feedback. Experiments across eight state-of-the-art LLMs show significant performance variations in different real-world scenarios and evaluation perspectives. Moreover, reflection-based optimization results in up to a 141% improvement in counseling performance. We hope Ψ-Arena provides a foundational resource for advancing reliable and human-aligned LLM applications in mental healthcare.
Task-Oriented Semantic Communication in Large Multimodal Models-based Vehicle Networks
Du, Baoxia, Du, Hongyang, Niyato, Dusit, Li, Ruidong
Task-oriented semantic communication has emerged as a fundamental approach for enhancing performance in various communication scenarios. While recent advances in Generative Artificial Intelligence (GenAI), such as Large Language Models (LLMs), have been applied to semantic communication designs, the potential of Large Multimodal Models (LMMs) remains largely unexplored. In this paper, we investigate an LMM-based vehicle AI assistant using a Large Language and Vision Assistant (LLaVA) and propose a task-oriented semantic communication framework to facilitate efficient interaction between users and cloud servers. To reduce computational demands and shorten response time, we optimize LLaVA's image slicing to selectively focus on areas of utmost interest to users. Additionally, we assess the importance of image patches by combining objective and subjective user attention, adjusting energy usage for transmitting semantic information. This strategy optimizes resource utilization, ensuring precise transmission of critical information. We construct a Visual Question Answering (VQA) dataset for traffic scenarios to evaluate effectiveness. Experimental results show that our semantic communication framework significantly increases accuracy in answering questions under the same channel conditions, performing particularly well in environments with poor Signal-to-Noise Ratios (SNR). Accuracy can be improved by 13.4% at an SNR of 12 dB and by 33.1% at 10 dB.
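As a rough illustration of the patch-importance idea in the abstract above, the sketch below blends an objective and a subjective attention score per image patch and splits a fixed transmit-energy budget in proportion to the result. The abstract does not specify how the two scores are combined or how energy is assigned, so the weighted sum, the proportional rule, the function name `allocate_energy` and the parameter `alpha` are all assumptions made for illustration, not the authors' method.

```python
from typing import List


def allocate_energy(
    objective: List[float],
    subjective: List[float],
    total_energy: float,
    alpha: float = 0.5,
) -> List[float]:
    """Split total_energy across patches in proportion to blended importance.

    Hypothetical rule: importance = alpha * objective + (1 - alpha) * subjective.
    Scores are assumed to be non-negative.
    """
    if len(objective) != len(subjective):
        raise ValueError("score lists must have the same length")
    if not objective:
        return []

    # Blend the two attention scores for each patch (assumed weighted sum).
    importance = [alpha * o + (1.0 - alpha) * s for o, s in zip(objective, subjective)]
    total = sum(importance)

    # If no patch has any importance, fall back to a uniform split.
    if total == 0:
        return [total_energy / len(importance)] * len(importance)

    # Proportional allocation: more important patches get more energy.
    return [total_energy * w / total for w in importance]


if __name__ == "__main__":
    # Two patches: the second is more salient objectively, the first subjectively.
    print(allocate_energy([0.2, 0.8], [0.6, 0.4], total_energy=1.0))
```

A proportional split is only one option; a real system would presumably account for channel conditions as well, which this sketch ignores.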
AI Standardized Patient Improves Human Conversations in Advanced Cancer Care
Haut, Kurtis, Hasan, Masum, Carroll, Thomas, Epstein, Ronald, Sen, Taylan, Hoque, Ehsan
These are high-stakes conversations where clinicians must navigate weighty issues, where a poorly chosen word could have lasting consequences on a patient's final days and the memories their loved ones carry forward. Low-quality SIC has been associated with poor patient and family prognostic understanding [5], perceived lack of emotional support [6], lower quality healthcare outcomes and higher costs [7-13]. Communication with advanced-stage cancer patients specifically poses a variety of challenges, including the volume and complexity of medical information, often fast-paced office visits, and the emotional burden of these life-changing conversations for clinicians, patients, and their loved ones. Despite their extensive medical training, many physicians struggle to deliver difficult news effectively [14-16], often resulting in patient anxiety, misaligned treatment decisions, and reduced quality of care [17-19]. Poor communication is also costly in terms of expensive and potentially burdensome treatments as well as malpractice claims [20].
A Multimodal Framework for Explainable Evaluation of Soft Skills in Educational Environments
Guerrero-Sosa, Jared D. T., Romero, Francisco P., Menéndez-Domínguez, Víctor Hugo, Serrano-Guerrero, Jesus, Montoro-Montarroso, Andres, Olivas, Jose A.
In the rapidly evolving educational landscape, the unbiased assessment of soft skills is a significant challenge, particularly in higher education. This paper presents a fuzzy logic approach that employs a Granular Linguistic Model of Phenomena integrated with multimodal analysis to evaluate soft skills in undergraduate students. By leveraging computational perceptions, this approach enables a structured breakdown of complex soft skill expressions, capturing nuanced behaviours with high granularity and addressing their inherent uncertainties, thereby enhancing interpretability and reliability. Experiments were conducted with undergraduate students using a developed tool that assesses soft skills such as decision-making, communication, and creativity. This tool identifies and quantifies subtle aspects of human interaction, such as facial expressions and gesture recognition. The findings reveal that the framework effectively consolidates multiple data inputs to produce meaningful and consistent assessments of soft skills, showing that integrating multiple modalities into the evaluation process significantly improves the quality of soft skills scores, making the assessment work transparent and understandable to educational stakeholders.
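To make the fuzzy logic idea above more concrete, the following minimal sketch maps a normalised soft-skill score in [0, 1] to degrees of membership in three linguistic labels using triangular membership functions. The labels, their breakpoints and the names `triangular` and `fuzzify` are hypothetical; the abstract does not describe the actual Granular Linguistic Model of Phenomena or its membership functions.

```python
from typing import Dict, Tuple


def triangular(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership: 0 outside (a, c), rising to 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x == b:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical linguistic labels for a score in [0, 1].
# The outer triangles extend past the range so that 0 is fully "low"
# and 1 is fully "high".
LABELS: Dict[str, Tuple[float, float, float]] = {
    "low": (-0.5, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.5),
}


def fuzzify(score: float) -> Dict[str, float]:
    """Return the degree to which the score belongs to each label."""
    return {name: triangular(score, *abc) for name, abc in LABELS.items()}


if __name__ == "__main__":
    # A score of 0.75 is partly "medium" and partly "high".
    print(fuzzify(0.75))
```

Expressing a score as graded memberships rather than a single number is what lets a fuzzy model report an assessment in linguistic terms such as "mostly high, somewhat medium".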