Expert Systems
AI's Euclid's Elements Moment: From Language Models to Computable Thought
Fang, Xinmin, Tao, Lingfeng, Li, Zhengxiong
This paper presents a comprehensive five-stage evolutionary framework for understanding the development of artificial intelligence, arguing that its trajectory mirrors the historical progression of human cognitive technologies. We posit that AI is advancing through distinct epochs, each defined by a revolutionary shift in its capacity for representation and reasoning, analogous to the inventions of cuneiform, the alphabet, grammar and logic, mathematical calculus, and formal logical systems. This "Geometry of Cognition" framework moves beyond mere metaphor to provide a systematic, cross-disciplinary model that not only explains AI's past architectural shifts-from expert systems to Transformers-but also charts a concrete and prescriptive path forward. Crucially, we demonstrate that this evolution is not merely linear but reflexive: as AI advances through these stages, the tools and insights it develops create a feedback loop that fundamentally reshapes its own underlying architecture. We are currently transitioning into a "Metalinguistic Moment," characterized by the emergence of self-reflective capabilities like Chain-of-Thought prompting and Constitutional AI. The subsequent stages, the "Mathematical Symbolism Moment" and the "Formal Logic System Moment," will be defined by the development of a computable calculus of thought, likely through neuro-symbolic architectures and program synthesis, culminating in provably aligned and reliable AI that reconstructs its own foundational representations. This work serves as the methodological capstone to our trilogy, which previously explored the economic drivers ("why") and cognitive nature ("what") of AI. Here, we address the "how," providing a theoretical foundation for future research and offering concrete, actionable strategies for startups and developers aiming to build the next generation of intelligent systems.
EditLord: Learning Code Transformation Rules for Code Editing
Li, Weichen, Jan, Albert, Ray, Baishakhi, Yang, Junfeng, Mao, Chengzhi, Pei, Kexin
Code editing is a foundational task in software development, where its effectiveness depends on whether it introduces desired code property changes without changing the original code's intended functionality. Existing approaches often formulate code editing as an implicit end-to-end task, omitting the fact that code-editing procedures inherently consist of discrete and explicit steps. Thus, they suffer from suboptimal performance and lack of robustness and generalization. We introduce EditLord, a code editing framework that makes the code transformation steps explicit. Our key insight is to employ a language model (LM) as an inductive learner to extract code editing rules from the training code pairs as concise meta-rule sets. Such rule sets will be manifested for each training sample to augment them for finetuning or assist in prompting- and iterative-based code editing. EditLord outperforms the state-of-the-art by an average of 22.7% in editing performance and 58.1% in robustness while achieving 20.2% higher functional correctness across critical software engineering and security applications, LM models, and editing modes.
Explainable Artificial Intelligence in Biomedical Image Analysis: A Comprehensive Survey
Dagnaw, Getamesay Haile, Zhu, Yanming, Maqsood, Muhammad Hassan, Yang, Wencheng, Dong, Xingshuai, Yin, Xuefei, Liew, Alan Wee-Chung
Explainable artificial intelligence (XAI) has become increasingly important in biomedical image analysis to promote transparency, trust, and clinical adoption of DL models. While several surveys have reviewed XAI techniques, they often lack a modality-aware perspective, overlook recent advances in multimodal and vision-language paradigms, and provide limited practical guidance. This survey addresses this gap through a comprehensive and structured synthesis of XAI methods tailored to biomedical image analysis.We systematically categorize XAI methods, analyzing their underlying principles, strengths, and limitations within biomedical contexts. A modality-centered taxonomy is proposed to align XAI methods with specific imaging types, highlighting the distinct interpretability challenges across modalities. We further examine the emerging role of multimodal learning and vision-language models in explainable biomedical AI, a topic largely underexplored in previous work. Our contributions also include a summary of widely used evaluation metrics and open-source frameworks, along with a critical discussion of persistent challenges and future directions. This survey offers a timely and in-depth foundation for advancing interpretable DL in biomedical image analysis.
Multi-level Mixture of Experts for Multimodal Entity Linking
Hu, Zhiwei, Gutiรฉrrez-Basulto, Vรญctor, Xiang, Zhiliang, Li, Ru, Pan, Jeff Z.
Multimodal Entity Linking (MEL) aims to link ambiguous mentions within multimodal contexts to associated entities in a multimodal knowledge base. Existing approaches to MEL introduce multimodal interaction and fusion mechanisms to bridge the modality gap and enable multi-grained semantic matching. However, they do not address two important problems: (i) mention ambiguity, i.e., the lack of semantic content caused by the brevity and omission of key information in the mention's textual context; (ii) dynamic selection of modal content, i.e., to dynamically distinguish the importance of different parts of modal information. To mitigate these issues, we propose a Multi-level Mixture of Experts (MMoE) model for MEL. MMoE has four components: (i) the description-aware mention enhancement module leverages large language models to identify the WikiData descriptions that best match a mention, considering the mention's textual context; (ii) the multimodal feature extraction module adopts multimodal feature encoders to obtain textual and visual embeddings for both mentions and entities; (iii)-(iv) the intra-level mixture of experts and inter-level mixture of experts modules apply a switch mixture of experts mechanism to dynamically and adaptively select features from relevant regions of information. Extensive experiments demonstrate the outstanding performance of MMoE compared to the state-of-the-art. MMoE's code is available at: https://github.com/zhiweihu1103/MEL-MMoE.
MixAssist: An Audio-Language Dataset for Co-Creative AI Assistance in Music Mixing
Clemens, Michael, Marasoviฤ, Ana
While AI presents significant potential for enhancing music mixing and mastering workflows, current research predominantly emphasizes end-to-end automation or generation, often overlooking the collaborative and instructional dimensions vital for co-creative processes. This gap leaves artists, particularly amateurs seeking to develop expertise, underserved. To bridge this, we introduce MixAssist, a novel audio-language dataset capturing the situated, multi-turn dialogue between expert and amateur music producers during collaborative mixing sessions. Comprising 431 audio-grounded conversational turns derived from 7 in-depth sessions involving 12 producers, MixAssist provides a unique resource for training and evaluating audio-language models that can comprehend and respond to the complexities of real-world music production dialogues. Our evaluations, including automated LLM-as-a-judge assessments and human expert comparisons, demonstrate that fine-tuning models such as Qwen-Audio on MixAssist can yield promising results, with Qwen significantly outperforming other tested models in generating helpful, contextually relevant mixing advice. By focusing on co-creative instruction grounded in audio context, MixAssist enables the development of intelligent AI assistants designed to support and augment the creative process in music mixing.
Enhancing the Interpretability of Rule-based Explanations through Information Retrieval
Umbrico, Alessandro, Bologna, Guido, Coraci, Luca, Fracasso, Francesca, Gola, Silvia, Cortellessa, Gabriella
The lack of transparency of data-driven Artificial Intelligence techniques limits their interpretability and acceptance into healthcare decision-making processes. We propose an attribution-based approach to improve the interpretability of Explainable AI-based predictions in the specific context of arm lymphedema's risk assessment after lymph nodal radiotherapy in breast cancer. The proposed method performs a statistical analysis of the attributes in the rule-based prediction model using standard metrics from Information Retrieval techniques. This analysis computes the relevance of each attribute to the prediction and provides users with interpretable information about the impact of risk factors. The results of a user study that compared the output generated by the proposed approach with the raw output of the Explainable AI model suggested higher levels of interpretability and usefulness in the context of predicting lymphedema risk.
Towards Unified Neurosymbolic Reasoning on Knowledge Graphs
Lin, Qika, Xu, Fangzhi, Lu, Hao, He, Kai, Mao, Rui, Liu, Jun, Cambria, Erik, Feng, Mengling
Knowledge Graph (KG) reasoning has received significant attention in the fields of artificial intelligence and knowledge engineering, owing to its ability to autonomously deduce new knowledge and consequently enhance the availability and precision of downstream applications. However, current methods predominantly concentrate on a single form of neural or symbolic reasoning, failing to effectively integrate the inherent strengths of both approaches. Furthermore, the current prevalent methods primarily focus on addressing a single reasoning scenario, presenting limitations in meeting the diverse demands of real-world reasoning tasks. Unifying the neural and symbolic methods, as well as diverse reasoning scenarios in one model is challenging as there is a natural representation gap between symbolic rules and neural networks, and diverse scenarios exhibit distinct knowledge structures and specific reasoning objectives. To address these issues, we propose a unified neurosymbolic reasoning framework, namely Tunsr, for KG reasoning. Tunsr first introduces a consistent structure of reasoning graph that starts from the query entity and constantly expands subsequent nodes by iteratively searching posterior neighbors. Based on it, a forward logic message-passing mechanism is proposed to update both the propositional representations and attentions, as well as first-order logic (FOL) representations and attentions of each node. In this way, Tunsr conducts the transformation of merging multiple rules by merging possible relations at each step. Finally, the FARI algorithm is proposed to induce FOL rules by constantly performing attention calculations over the reasoning graph. Extensive experimental results on 19 datasets of four reasoning scenarios (transductive, inductive, interpolation, and extrapolation) demonstrate the effectiveness of Tunsr.
SpiritRAG: A Q&A System for Religion and Spirituality in the United Nations Archive
Gao, Yingqiang, Winiger, Fabian, Montjourides, Patrick, Shaitarova, Anastassia, Gu, Nianlong, Peng-Keller, Simon, Schneider, Gerold
Religion and spirituality (R/S) are complex and highly domain-dependent concepts which have long confounded researchers and policymakers. Due to their context-specificity, R/S are difficult to operationalize in conventional archival search strategies, particularly when datasets are very large, poorly accessible, and marked by information noise. As a result, considerable time investments and specialist knowledge is often needed to extract actionable insights related to R/S from general archival sources, increasing reliance on published literature and manual desk reviews. To address this challenge, we present SpiritRAG, an interactive Question Answering (Q&A) system based on Retrieval-Augmented Generation (RAG). Built using 7,500 United Nations (UN) resolution documents related to R/S in the domains of health and education, SpiritRAG allows researchers and policymakers to conduct complex, context-sensitive database searches of very large datasets using an easily accessible, chat-based web interface. SpiritRAG is lightweight to deploy and leverages both UN documents and user provided documents as source material. A pilot test and evaluation with domain experts on 100 manually composed questions demonstrates the practical value and usefulness of SpiritRAG.
SEZ-HARN: Self-Explainable Zero-shot Human Activity Recognition Network
De Silva, Devin Y., Wickramanayake, Sandareka, Meedeniya, Dulani, Rasnayaka, Sanka
Human Activity Recognition (HAR), which uses data from Inertial Measurement Unit (IMU) sensors, has many practical applications in healthcare and assisted living environments. However, its use in real-world scenarios has been limited by the lack of comprehensive IMU-based HAR datasets that cover a wide range of activities and the lack of transparency in existing HAR models. Zero-shot HAR (ZS-HAR) overcomes the data limitations, but current models struggle to explain their decisions, making them less transparent. This paper introduces a novel IMU-based ZS-HAR model called the Self-Explainable Zero-shot Human Activity Recognition Network (SEZ-HARN). It can recognize activities not encountered during training and provide skeleton videos to explain its decision-making process. We evaluate the effectiveness of the proposed SEZ-HARN on four benchmark datasets PAMAP2, DaLiAc, HTD-MHAD and MHealth and compare its performance against three state-of-the-art black-box ZS-HAR models. The experiment results demonstrate that SEZ-HARN produces realistic and understandable explanations while achieving competitive Zero-shot recognition accuracy. SEZ-HARN achieves a Zero-shot prediction accuracy within 3\% of the best-performing black-box model on PAMAP2 while maintaining comparable performance on the other three datasets.
OpenAg: Democratizing Agricultural Intelligence
Thudumu, Srikanth, Fisher, Jason
Agriculture is undergoing a major transformation driven by artificial intelligence (AI), machine learning, and knowledge representation technologies. However, current agricultural intelligence systems often lack contextual understanding, explainability, and adaptability, especially for smallholder farmers with limited resources. General-purpose large language models (LLMs), while powerful, typically lack the domain-specific knowledge and contextual reasoning needed for practical decision support in farming. They tend to produce recommendations that are too generic or unrealistic for real-world applications. To address these challenges, we present OpenAg, a comprehensive framework designed to advance agricultural artificial general intelligence (AGI). OpenAg combines domain-specific foundation models, neural knowledge graphs, multi-agent reasoning, causal explainability, and adaptive transfer learning to deliver context-aware, explainable, and actionable insights. The system includes: (i) a unified agricultural knowledge base that integrates scientific literature, sensor data, and farmer-generated knowledge; (ii) a neural agricultural knowledge graph for structured reasoning and inference; (iii) an adaptive multi-agent reasoning system where AI agents specialize and collaborate across agricultural domains; and (iv) a causal transparency mechanism that ensures AI recommendations are interpretable, scientifically grounded, and aligned with real-world constraints. OpenAg aims to bridge the gap between scientific knowledge and the tacit expertise of experienced farmers to support scalable and locally relevant agricultural decision-making.