Aesthetic Experience and Educational Value in Co-creating Art with Generative AI: Evidence from a Survey of Young Learners

arXiv.org Artificial Intelligence

This study investigates the aesthetic experience and educational value of collaborative artmaking with generative artificial intelligence (AI) among young learners and art students. Based on a survey of 112 participants, we examine how human creators renegotiate their roles, how conventional notions of originality are challenged, how the creative process is transformed, and how aesthetic judgment is formed in human-AI co-creation. Empirically, participants generally view AI as a partner that stimulates ideation and expands creative boundaries rather than a passive tool, while simultaneously voicing concerns about stylistic homogenization and the erosion of traditional authorship. Theoretically, we synthesize Dewey's aesthetics of experience, Ihde's postphenomenology, and actor-network theory (ANT) into a single analytical framework to unpack the dynamics between human creators and AI as a non-human actant. Findings indicate (i) a fluid subjectivity in which creators shift across multiple stances (director, dialogic partner, discoverer); (ii) an iterative, dialogic workflow (intent-generate-select-refine) that centers critical interpretation; and (iii) an educational value shift from technical skill training toward higher-order competencies such as critical judgment, cross-modal ideation, and reflexivity. We argue that arts education should cultivate a critical co-creation stance toward technology, guiding learners to collaborate with AI while preserving human distinctiveness in concept formation, judgment, and meaning-making.


Large Foundation Models for Trajectory Prediction in Autonomous Driving: A Comprehensive Survey

arXiv.org Artificial Intelligence

Trajectory prediction serves as a critical functionality in autonomous driving, enabling the anticipation of future motion paths for traffic participants such as vehicles and pedestrians, which is essential for driving safety. Although conventional deep learning methods have improved accuracy, they remain hindered by inherent limitations, including lack of interpretability, heavy reliance on large-scale annotated data, and weak generalization in long-tail scenarios. The rise of Large Foundation Models (LFMs) is transforming the research paradigm of trajectory prediction. This survey offers a systematic review of recent advances in LFMs, particularly Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs) for trajectory prediction. By integrating linguistic and scene semantics, LFMs facilitate interpretable contextual reasoning, significantly enhancing prediction safety and generalization in complex environments. The article highlights three core methodologies: trajectory-language mapping, multimodal fusion, and constraint-based reasoning. It covers prediction tasks for both vehicles and pedestrians, evaluation metrics, and dataset analyses. Key challenges such as computational latency, data scarcity, and real-world robustness are discussed, along with future research directions including low-latency inference, causality-aware modeling, and motion foundation models.


SOH-KLSTM: A Hybrid Kolmogorov-Arnold Network and LSTM Model for Enhanced Lithium-Ion Battery Health Monitoring

arXiv.org Artificial Intelligence

Lithium (Li) batteries have emerged as a dominant energy storage solution due to their exceptional energy density, prolonged cycle life, fast charging capability, and adaptability across diverse applications, including electric vehicles, renewable energy systems, and portable electronics [1, 2, 3]. However, their performance inevitably degrades over time, driven by repeated charge and discharge cycles, temperature fluctuations, and ageing effects [4, 5]. This degradation not only reduces battery efficiency and reliability but also poses significant safety risks, particularly in high-demand applications where performance consistency is critical [6, 7]. As a result, accurate estimation of the State of Health (SOH) is essential to ensure the longevity and safe operation of Li batteries. SOH is a key indicator of the remaining capacity and functional integrity of a battery relative to its initial state. It encompasses key variables such as voltage, current, temperature, and other factors that influence battery performance.
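As a rough illustration of SOH as "remaining capacity relative to the initial state", the sketch below uses the common capacity-ratio definition. This is an assumption for illustration only: the abstract does not state which SOH formula the paper uses, and the function and parameter names are hypothetical.

```python
def state_of_health(current_capacity_ah: float, initial_capacity_ah: float) -> float:
    """Capacity-based SOH in percent (assumed definition, not taken from the paper).

    current_capacity_ah: capacity measured now, in ampere-hours.
    initial_capacity_ah: capacity of the fresh cell, in ampere-hours.
    """
    if initial_capacity_ah <= 0:
        raise ValueError("initial capacity must be positive")
    if current_capacity_ah < 0:
        raise ValueError("current capacity cannot be negative")
    # Ratio of remaining capacity to initial capacity, expressed as a percentage.
    return 100.0 * current_capacity_ah / initial_capacity_ah


if __name__ == "__main__":
    # Hypothetical cell: 2.0 Ah when new, 1.6 Ah after ageing.
    print(f"SOH = {state_of_health(1.6, 2.0):.1f}%")
```

A model-based estimator would predict this ratio from measured signals such as voltage, current, and temperature instead of measuring capacity directly.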


The 1st International Workshop on Disentangled Representation Learning for Controllable Generation (DRL4Real): Methods and Results

arXiv.org Artificial Intelligence

This paper reviews the 1st International Workshop on Disentangled Representation Learning for Controllable Generation (DRL4Real), held in conjunction with ICCV 2025. The workshop aimed to bridge the gap between the theoretical promise of Disentangled Representation Learning (DRL) and its application in realistic scenarios, moving beyond synthetic benchmarks. DRL4Real focused on evaluating DRL methods in practical applications such as controllable generation, exploring advancements in model robustness, interpretability, and generalization. The workshop accepted 9 papers covering a broad range of topics, including the integration of novel inductive biases (e.g., language), the application of diffusion models to DRL, 3D-aware disentanglement, and the expansion of DRL into specialized domains like autonomous driving and EEG analysis. This summary details the workshop's objectives and the themes of the accepted papers, and provides an overview of the methodologies proposed by the authors.


Towards Understanding Visual Grounding in Visual Language Models

arXiv.org Artificial Intelligence

Visual grounding refers to the ability of a model to identify a region within some visual input that matches a textual description. Consequently, a model equipped with visual grounding capabilities can target a wide range of applications in various domains, including referring expression comprehension, answering questions pertinent to fine-grained details in images or videos, captioning visual context by explicitly referring to entities, as well as low- and high-level control in simulated and real environments. In this survey paper, we review representative works across the key areas of research on modern general-purpose vision language models (VLMs). We first outline the importance of grounding in VLMs, then delineate the core components of the contemporary paradigm for developing grounded models, and examine their practical applications, including benchmarks and evaluation metrics for grounded multimodal generation. We also discuss the multifaceted interrelations among visual grounding, multimodal chain-of-thought, and reasoning in VLMs. Finally, we analyse the challenges inherent to visual grounding and suggest promising directions for future research.
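To make "identifying a region that matches a textual description" concrete, the sketch below scores a predicted bounding box against a reference box with intersection-over-union (IoU). IoU is a widely used overlap measure, but the abstract does not name the metrics the survey covers, so its use here, the `(x1, y1, x2, y2)` box format, and the function name are all assumptions.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), with x1 <= x2 and y1 <= y2


def box_iou(pred: Box, target: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes (illustrative helper)."""
    # Corners of the overlapping rectangle, if any.
    ix1, iy1 = max(pred[0], target[0]), max(pred[1], target[1])
    ix2, iy2 = min(pred[2], target[2]), min(pred[3], target[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_pred = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_target = (target[2] - target[0]) * (target[3] - target[1])
    union = area_pred + area_target - inter
    # Two degenerate (zero-area) boxes have no meaningful overlap.
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical prediction for a phrase such as "the red cup on the left".
    print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

A grounded prediction is often counted as correct when its IoU with the reference box exceeds a chosen threshold; the threshold value is a benchmark-specific choice.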


Approaches to Responsible Governance of GenAI in Organizations

arXiv.org Artificial Intelligence

Peer-reviewed and accepted at IEEE ISTAS 2025. The rapid evolution of Generative AI (GenAI) has introduced unprecedented opportunities while presenting complex challenges around ethics, accountability, and societal impact. This paper draws on a literature review, established governance frameworks, and industry roundtable discussions to identify core principles for integrating responsible GenAI governance into diverse organizational structures. Our objective is to provide actionable recommendations for a balanced, risk-based governance approach that enables both innovation and oversight. Findings emphasize the need for adaptable risk assessment tools, continuous monitoring practices, and cross-sector collaboration to establish trustworthy GenAI. These insights provide a structured foundation and Responsible GenAI Guide (ResAI) for organizations to align GenAI initiatives with ethical, legal, and operational best practices.


Cybersecurity in The Arab World: Technological and Socio-Political Dimensions

Communications of the ACM

Interconnected systems have become the backbone of modern societies. However, the very same critical role played by these systems brings significant challenges: Securing interconnected systems is not merely a technological necessity, but a cornerstone for safeguarding the economic, political, and social stability of countries. While these challenges are global, the Arab World presents a unique landscape that warrants a nuanced exploration of both commonalities and peculiarities within the broader context of securing interconnected systems (see Figure for a brief summary of these challenges). Interconnected systems, including cyber-physical systems, often combine computational and physical processes. They include critical infrastructure such as power grids, transportation networks, and healthcare systems, alongside commercial and industrial applications.


State Algebra for Propositional Logic

arXiv.org Artificial Intelligence

This paper presents State Algebra, a novel framework designed to represent and manipulate propositional logic using algebraic methods. The framework is structured as a hierarchy of three representations: Set, Coordinate, and Row Decomposition. These representations anchor the system in well-known semantics while facilitating computation with a powerful algebraic engine. A key aspect of State Algebra is its flexibility in representation. We show that although the default reduction of a state vector is not canonical, a unique canonical form can be obtained by applying a fixed variable order during the reduction process. This highlights a trade-off: by forgoing guaranteed canonicity, the framework gains increased flexibility, potentially leading to more compact representations of certain classes of problems. We explore how this framework provides tools to articulate both search-based and knowledge compilation algorithms and discuss its natural extension to probabilistic logic and Weighted Model Counting.


Large Language Models Meet Legal Artificial Intelligence: A Survey

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have significantly advanced the development of Legal Artificial Intelligence (Legal AI) in recent years, enhancing the efficiency and accuracy of legal tasks. To advance research and applications of LLM-based approaches in the legal domain, this paper provides a comprehensive review of 16 legal LLM series and 47 LLM-based frameworks for legal tasks, and also gathers 15 benchmarks and 29 datasets to evaluate different legal capabilities. Additionally, we analyse the challenges and discuss future directions for LLM-based approaches in the legal domain. We hope this paper provides a systematic introduction for beginners and encourages future research in this field. Resources are available at https://github.com/ZhitianHou/LLMs4LegalAI.


Decentralising LLM Alignment: A Case for Context, Pluralism, and Participation

arXiv.org Artificial Intelligence

Large Language Model (LLM) alignment methods have been credited with the commercial success of products like ChatGPT, given their role in steering LLMs towards user-friendly outputs. However, current alignment techniques predominantly mirror the normative preferences of a narrow reference group, effectively imposing that group's values on a wide user base. Drawing on theories of the power/knowledge nexus, this work argues that current alignment practices centralise control over knowledge production and governance within already influential institutions. To counter this, we propose decentralising alignment through three characteristics: context, pluralism, and participation. Furthermore, this paper demonstrates the critical importance of delineating the context-of-use when shaping alignment practices by grounding each of these features in concrete use cases. This work makes the following contributions: (1) highlighting the role of context, pluralism, and participation in decentralising alignment; (2) providing concrete examples to illustrate these strategies; and (3) demonstrating the nuanced requirements associated with applying alignment across different contexts of use.