Goto

Collaborating Authors

 sas


Detecting Hallucinations in Graph Retrieval-Augmented Generation via Attention Patterns and Semantic Alignment

Li, Shanghao, Han, Jinda, Wang, Yibo, Zhu, Yuanjie, Song, Zihe, He, Langzhou, Alghythee, Kenan Kamel A, Yu, Philip S.

arXiv.org Artificial Intelligence

Graph-based Retrieval-Augmented Generation (GraphRAG) enhances Large Language Models (LLMs) by incorporating external knowledge from linearized subgraphs retrieved from knowledge graphs. However, LLMs struggle to interpret the relational and topological information in these inputs, resulting in hallucinations that are inconsistent with the retrieved knowledge. To analyze how LLMs attend to and retain structured knowledge during generation, we propose two lightweight interpretability metrics: Path Reliance Degree (PRD), which measures over-reliance on shortest-path triples, and Semantic Alignment Score (SAS), which assesses how well the model's internal representations align with the retrieved knowledge. Through empirical analysis on a knowledge-based QA task, we identify failure patterns associated with over-reliance on salient paths and weak semantic grounding, as indicated by high PRD and low SAS scores. We further develop a lightweight post-hoc hallucination detector, Graph Grounding and Alignment (GGA), which outperforms strong semantic and confidence-based baselines across AUC and F1. By grounding hallucination analysis in mechanistic interpretability, our work offers insights into how structural limitations in LLMs contribute to hallucinations, informing the design of more reliable GraphRAG systems in the future.


Decentralized Multi-Agent System with Trust-Aware Communication

Ding, Yepeng, Twabi, Ahmed, Yu, Junwei, Zhang, Lingfeng, Kondo, Tohru, Sato, Hiroyuki

arXiv.org Artificial Intelligence

Abstract--The emergence of Large Language Models (LLMs) is rapidly accelerating the development of autonomous multi-agent systems (MAS), paving the way for the Internet of Agents. However, traditional centralized MAS architectures present significant challenges, including single points of failure, vulnerability to censorship, inherent scalability limitations, and critical trust issues. We propose a novel Decentralized Multi-Agent System (DMAS) architecture designed to overcome these fundamental problems by enabling trust-aware, scalable, and censorship-resistant interactions among autonomous agents. Our DMAS features a decentralized agent runtime underpinned by a blockchain-based architecture. We formalize a trust-aware communication protocol that leverages cryptographic primitives and on-chain operations to provide security properties: verifiable interaction cycles, communication integrity, authenticity, non-repudiation, and conditional confidentiality, which we further substantiate through a comprehensive security analysis. The rapid advancements in Large Language Models (LLMs) [1]-[4] have opened unprecedented avenues for creating highly autonomous and intelligent agents. These LLM-augmented agents possess remarkable capabilities in understanding natural language, performing complex reasoning, planning intricate sequences of actions, and engaging in sophisticated communication.


SAS: Simulated Attention Score

Zheng, Chuanyang, Sun, Jiankai, Gao, Yihang, Wang, Yuehao, Wang, Peihao, Xiong, Jing, Ren, Liliang, Cheng, Hao, Kulkarni, Janardhan, Shen, Yelong, Wang, Atlas, Schwager, Mac, Schneider, Anderson, Liu, Xiaodong, Gao, Jianfeng

arXiv.org Artificial Intelligence

The attention mechanism is a core component of the Transformer architecture. Various methods have been developed to compute attention scores, including multi-head attention (MHA), multi-query attention, group-query attention and so on. We further analyze the MHA and observe that its performance improves as the number of attention heads increases, provided the hidden size per head remains sufficiently large. Therefore, increasing both the head count and hidden size per head with minimal parameter overhead can lead to significant performance gains at a low cost. Motivated by this insight, we introduce Simulated Attention Score (SAS), which maintains a compact model size while simulating a larger number of attention heads and hidden feature dimension per head. This is achieved by projecting a low-dimensional head representation into a higher-dimensional space, effectively increasing attention capacity without increasing parameter count. Beyond the head representations, we further extend the simulation approach to feature dimension of the key and query embeddings, enhancing expressiveness by mimicking the behavior of a larger model while preserving the original model size. To control the parameter cost, we also propose Parameter-Efficient Attention Aggregation (PEAA). Comprehensive experiments on a variety of datasets and tasks demonstrate the effectiveness of the proposed SAS method, achieving significant improvements over different attention variants.




Systematic Alias Sampling: an efficient and low-variance way to sample from a discrete distribution

Vallivaara, Ilari, Poikselkä, Katja, Rikula, Pauli, Röning, Juha

arXiv.org Artificial Intelligence

In this paper we combine the Alias method with the concept of systematic sampling, a method commonly used in particle filters for efficient low-variance resampling. The proposed method allows very fast sampling from a discrete distribution: drawing k samples is up to an order of magnitude faster than binary search from the cumulative distribution function (cdf) or inversion methods used in many libraries. The produced empirical distribution function is evaluated using a modified Cramér-Von Mises goodness-of-fit statistic, showing that the method compares very favourably to multinomial sampling. As continuous distributions can often be approximated with discrete ones, the proposed method can be used as a very general way to efficiently produce random samples for particle filter proposal distributions, e.g. for motion models in robotics.


Enhancing Clinical Decision-Making: Integrating Multi-Agent Systems with Ethical AI Governance

Chen, Ying-Jung, Albarqawi, Ahmad, Chen, Chi-Sheng

arXiv.org Artificial Intelligence

Abstract--Recent advances in the data-driven medicine approach, which integrates ethically managed and explainable artificial intelligence into clinical decision support systems (CDSS), are critical to ensure reliable and effective patient care. This paper focuses on comparing novel agent system designs that use modular agents to analyze laboratory results, vital signs, and clinical context, and to predict and validate results. We implement our agent system with the eICU database, including running lab analysis, vitals-only interpreters, and contextual reasoners agents first, then sharing the memory into the integration agent, prediction agent, transparency agent, and a validation agent. Our results suggest that the multi-agent system (MAS) performed better than the single-agent system (SAS) with mortality prediction accuracy (59%, 56%) and the mean error for length of stay (LOS)(4.37 days, 5.82 days), respectively. However, the transparency score for the SAS (86.21) is slightly better than the transparency score for MAS (85.5). Finally, this study suggests that our agent-based framework not only improves process transparency and prediction accuracy but also strengthens trustworthy AI-assisted decision support in an intensive care setting. Artificial intelligence (AI) has been widely adopt into healthcare [1]. Within medicine, it's proving valuable for sharpening diagnostic precision, supporting treatment planning, and helping clinicians take care of patients [2].


Towards Autonomous In-situ Soil Sampling and Mapping in Large-Scale Agricultural Environments

Nguyen, Thien Hoang, Muller, Erik, Rubin, Michael, Wang, Xiaofei, Sibona, Fiorella, McBratney, Alex, Sukkarieh, Salah

arXiv.org Artificial Intelligence

Abstract-- Traditional soil sampling and analysis methods are labor-intensive, time-consuming, and limited in spatial resolution, making them unsuitable for large-scale precision agriculture. T o address these limitations, we present a robotic solution for real-time sampling, analysis and mapping of key soil properties. Our system consists of two main sub-systems: a Sample Acquisition System (SAS) for precise, automated in-field soil sampling; and a Sample Analysis Lab (Lab) for real-time soil property analysis. The system's performance was validated through extensive field trials at a large-scale Australian farm. Experimental results show that the SAS can consistently acquire soil samples with a mass of 50g at a depth of 200mm, while the Lab can process each sample within 10 minutes to accurately measure pH and macronutrients. These results demonstrate the potential of the system to provide farmers with timely, data-driven insights for more efficient and sustainable soil management and fertilizer application. I. INTRODUCTION Achieving sustainable agricultural resource management requires accurate, high-resolution, and up-to-date data on soil properties such as pH and macronutrients [1], [2]. However, conventional soil sampling and testing methods fail to address this need at scale.


Quantifying Holistic Review: A Multi-Modal Approach to College Admissions Prediction

Zeng, Jun-Wei, Shen, Jerry

arXiv.org Artificial Intelligence

This paper introduces the Comprehensive Applicant Profile Score (CAPS), a novel multi-modal framework designed to quantitatively model and interpret holistic college admissions evaluations. CAPS decomposes applicant profiles into three interpretable components: academic performance (Standardized Academic Score, SAS), essay quality (Essay Quality Index, EQI), and extracurricular engagement (Extracurricular Impact Score, EIS). Leveraging transformer-based semantic embeddings, LLM scoring, and XGBoost regression, CAPS provides transparent and explainable evaluations aligned with human judgment. Experiments on a synthetic but realistic dataset demonstrate strong performance, achieving an EQI prediction R^2 of 0.80, classification accuracy over 75%, a macro F1 score of 0.69, and a weighted F1 score of 0.74. CAPS addresses key limitations in traditional holistic review -- particularly the opacity, inconsistency, and anxiety faced by applicants -- thus paving the way for more equitable and data-informed admissions practices.


Enabling Multi-Agent Systems as Learning Designers: Applying Learning Sciences to AI Instructional Design

Wang, Jiayi, Xiao, Ruiwei, Hou, Xinying, Stamper, John

arXiv.org Artificial Intelligence

K-12 educators are increasingly using Large Language Models (LLMs) to create instructional materials. These systems excel at producing fluent, coherent content, but often lack support for high-quality teaching. The reason is twofold: first, commercial LLMs, such as ChatGPT and Gemini which are among the most widely accessible to teachers, do not come preloaded with the depth of pedagogical theory needed to design truly effective activities; second, although sophisticated prompt engineering can bridge this gap, most teachers lack the time or expertise and find it difficult to encode such pedagogical nuance into their requests. This study shifts pedagogical expertise from the user's prompt to the LLM's internal architecture. We embed the well-established Knowledge-Learning-Instruction (KLI) framework into a Multi-Agent System (MAS) to act as a sophisticated instructional designer. We tested three systems for generating secondary Math and Science learning activities: a Single-Agent baseline simulating typical teacher prompts; a role-based MAS where agents work sequentially; and a collaborative MAS-CMD where agents co-construct activities through conquer and merge discussion. The generated materials were evaluated by 20 practicing teachers and a complementary LLM-as-a-judge system using the Quality Matters (QM) K-12 standards. While the rubric scores showed only small, often statistically insignificant differences between the systems, the qualitative feedback from educators painted a clear and compelling picture. Teachers strongly preferred the activities from the collaborative MAS-CMD, describing them as significantly more creative, contextually relevant, and classroom-ready. Our findings show that embedding pedagogical principles into LLM systems offers a scalable path for creating high-quality educational content.