Wang, Haoyu
Model Context Protocol (MCP): Landscape, Security Threats, and Future Research Directions
Hou, Xinyi, Zhao, Yanjie, Wang, Shenao, Wang, Haoyu
The Model Context Protocol (MCP) is a standardized interface designed to enable seamless interaction between AI models and external tools and resources, breaking down data silos and facilitating interoperability across diverse systems. This paper provides a comprehensive overview of MCP, focusing on its core components, workflow, and the lifecycle of MCP servers, which consists of three key phases: creation, operation, and update. We analyze the security and privacy risks associated with each phase and propose strategies to mitigate potential threats. The paper also examines the current MCP landscape, including its adoption by industry leaders and various use cases, as well as the tools and platforms supporting its integration. We explore future directions for MCP, highlighting the challenges and opportunities that will influence its adoption and evolution within the broader AI ecosystem. Finally, we offer recommendations for MCP stakeholders to ensure its secure and sustainable development as the AI landscape continues to evolve.
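To make the interaction concrete, the sketch below shows what a single MCP tool invocation looks like on the wire, assuming the JSON-RPC 2.0 framing the protocol is built on; the tool name and arguments are hypothetical stand-ins for whatever a real server advertises.

```python
import json

# Minimal sketch of an MCP tool invocation as a JSON-RPC 2.0 request.
# The tool name and arguments are hypothetical; real servers advertise
# their tools through a prior "tools/list" exchange.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "search_files",  # hypothetical tool exposed by a server
        "arguments": {"query": "quarterly report", "max_results": 5},
    },
}
print(json.dumps(request, indent=2))
```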
AgentSpec: Customizable Runtime Enforcement for Safe and Reliable LLM Agents
Wang, Haoyu, Poskitt, Christopher M., Sun, Jun
Agents built on LLMs are increasingly deployed across diverse domains, automating complex decision-making and task execution. However, their autonomy introduces safety risks, including security vulnerabilities, legal violations, and unintended harmful actions. Existing mitigation methods, such as model-based safeguards and early enforcement strategies, fall short in robustness, interpretability, and adaptability. To address these challenges, we propose AgentSpec, a lightweight domain-specific language for specifying and enforcing runtime constraints on LLM agents. With AgentSpec, users define structured rules that incorporate triggers, predicates, and enforcement mechanisms, ensuring agents operate within predefined safety boundaries. We implement AgentSpec across multiple domains, including code execution, embodied agents, and autonomous driving, demonstrating its adaptability and effectiveness. Our evaluation shows that AgentSpec successfully prevents unsafe executions in over 90% of code agent cases, eliminates all hazardous actions in embodied agent tasks, and enforces 100% compliance by autonomous vehicles (AVs). Despite its strong safety guarantees, AgentSpec remains computationally lightweight, with overheads in milliseconds. By combining interpretability, modularity, and efficiency, AgentSpec provides a practical and scalable solution for enforcing LLM agent safety across diverse applications. We also automate the generation of rules using LLMs and assess their effectiveness. Our evaluation shows that the rules generated by OpenAI o1 achieve a precision of 95.56% and recall of 70.96% for embodied agents, successfully identifying 87.26% of the risky code, and prevent AVs from breaking laws in 5 out of 8 scenarios.
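The abstract describes rules built from triggers, predicates, and enforcement mechanisms. The sketch below is an illustrative re-creation of that structure in plain Python, not the actual AgentSpec DSL syntax, with a hypothetical rule that blocks destructive shell commands.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    trigger: str                       # the action type the rule watches
    predicate: Callable[[dict], bool]  # condition over the action's context
    enforce: Callable[[dict], dict]    # what to do: block, warn, rewrite, ...

def check(action: dict, rules: list[Rule]) -> dict:
    """Apply the first matching rule before the agent executes an action."""
    for rule in rules:
        if action["type"] == rule.trigger and rule.predicate(action):
            return rule.enforce(action)
    return action  # no rule fired; the action passes through unchanged

# Hypothetical rule: block shell commands that delete files recursively.
block_rm = Rule(
    trigger="shell_command",
    predicate=lambda a: "rm -rf" in a["command"],
    enforce=lambda a: {**a, "blocked": True, "reason": "destructive command"},
)
print(check({"type": "shell_command", "command": "rm -rf /tmp/x"}, [block_rm]))
```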
Perplexity Trap: PLM-Based Retrievers Overrate Low Perplexity Documents
Wang, Haoyu, Dai, Sunhao, Zhao, Haiyuan, Pang, Liang, Zhang, Xiao, Wang, Gang, Dong, Zhenhua, Xu, Jun, Wen, Ji-Rong
Previous studies have found that PLM-based retrieval models exhibit a preference for LLM-generated content, assigning higher relevance scores to these documents even when their semantic quality is comparable to human-written ones. This phenomenon, known as source bias, threatens the sustainable development of the information access ecosystem. However, the underlying causes of source bias remain unexplored. In this paper, we explain the process of information retrieval with a causal graph and discover that PLM-based retrievers learn perplexity features for relevance estimation, causing source bias by ranking documents with low perplexity higher. Theoretical analysis further reveals that the phenomenon stems from the positive correlation between the gradients of the loss functions of the language modeling task and the retrieval task. Based on this analysis, we propose a causal-inspired inference-time debiasing method called Causal Diagnosis and Correction (CDC). CDC first diagnoses the bias effect of perplexity and then separates the bias effect from the overall estimated relevance score.

The rapid advancement of large language models (LLMs) has driven a significant increase in AI-generated content (AIGC), leading to information retrieval (IR) systems that now index both human-written and LLM-generated content (Cao et al., 2023; Dai et al., 2024b; 2025). However, recent studies (Dai et al., 2024a;c; Xu et al., 2024) have uncovered that Pretrained Language Model (PLM) based retrievers (Guo et al., 2022; Zhao et al., 2024) exhibit preferences for LLM-generated documents, ranking them higher even when their semantic quality is comparable to human-written content. This phenomenon, referred to as source bias, is prevalent among various popular PLM-based retrievers across different domains (Dai et al., 2024a). If the problem is not resolved promptly, human authors' willingness to create will be severely reduced, and the existing content ecosystem may collapse.
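As a rough illustration of the diagnose-then-correct idea, the sketch below approximates the bias effect of perplexity with a linear fit on held-out scores and subtracts it at inference time; the actual CDC estimator may differ, and all numbers are stand-ins.

```python
import numpy as np

def fit_bias(perplexities: np.ndarray, scores: np.ndarray) -> tuple[float, float]:
    """Diagnose: regress relevance scores on document perplexity."""
    slope, intercept = np.polyfit(perplexities, scores, deg=1)
    return slope, intercept

def debias(score: float, perplexity: float, slope: float) -> float:
    """Correct: remove the perplexity-attributable component at inference."""
    return score - slope * perplexity

ppl = np.array([12.0, 15.5, 30.2, 45.8])  # low perplexity often = LLM-generated
rel = np.array([0.82, 0.78, 0.64, 0.55])  # retriever's raw relevance scores
slope, _ = fit_bias(ppl, rel)
print(debias(0.80, 13.0, slope))          # perplexity-corrected score
```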
The Next Frontier of LLM Applications: Open Ecosystems and Hardware Synergy
Hou, Xinyi, Zhao, Yanjie, Wang, Haoyu
The second paradigm involves LLM agents developed using frameworks like LangChain [16], AutoGPT [11], Langroid [18], AutoGen [23], and LlamaIndex [22], which offer greater programmability and modularity, allowing developers to build sophisticated, multi-agent systems that integrate external tools and dynamic workflows [20]. Despite their advantages, both paradigms remain architecturally fragmented and lack standardized interoperability, leading to redundant development efforts and constrained scalability. From a software engineering (SE) perspective, current LLM application paradigms resemble traditional platform-centric software ecosystems, where applications are tightly coupled to proprietary APIs and execution environments. LLM app stores, while lowering the barrier to entry, impose constraints on extensibility and cross-platform interoperability, leading to vendor lock-in and duplicated development efforts across different ecosystems. In contrast, agent-based LLM frameworks provide modularity but lack standardized mechanisms for component reuse and integration, making it challenging to compose LLM applications that seamlessly operate across heterogeneous environments. This fragmentation mirrors historical challenges in SE, where monolithic architectures have given way to service-oriented and microservices-based designs to improve reusability, scalability, and maintainability. Another key limitation of existing LLM applications is inefficient hardware utilization.
LLMs Can Generate a Better Answer by Aggregating Their Own Responses
Li, Zichong, Feng, Xinyu, Cai, Yuheng, Zhang, Zixuan, Liu, Tianyi, Liang, Chen, Chen, Weizhu, Wang, Haoyu, Zhao, Tuo
Large Language Models (LLMs) have shown remarkable capabilities across tasks, yet they often require additional prompting techniques when facing complex problems. While approaches like self-correction and response selection have emerged as popular solutions, recent studies have shown these methods perform poorly when relying on the LLM itself to provide feedback or selection criteria. We argue this limitation stems from the fact that common LLM post-training procedures lack explicit supervision for discriminative judgment tasks. In this paper, we propose Generative Self-Aggregation (GSA), a novel prompting method that improves answer quality without requiring the model's discriminative capabilities. GSA first samples multiple diverse responses from the LLM, then aggregates them to obtain an improved solution. Unlike previous approaches, our method does not require the LLM to correct errors or compare response quality; instead, it leverages the model's generative abilities to synthesize a new response based on the context of multiple samples. While GSA shares similarities with the self-consistency (SC) approach for response aggregation, SC requires specific verifiable tokens to enable majority voting. In contrast, our approach is more general and can be applied to open-ended tasks. Empirical evaluation demonstrates that GSA effectively improves response quality across various tasks, including mathematical reasoning, knowledge-based problems, and open-ended generation tasks such as code synthesis and conversational responses.
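A minimal sketch of the two-step procedure the abstract describes, with `generate` standing in for any chat-completion call and prompt wording that is ours, not the paper's:

```python
def gsa(generate, question: str, n: int = 3) -> str:
    # Step 1: sample n diverse responses (temperature > 0 for diversity).
    samples = [generate(question, temperature=0.8) for _ in range(n)]

    # Step 2: aggregate generatively. No voting or self-judging: the model
    # writes a fresh answer conditioned on all samples as context.
    context = "\n\n".join(f"Response {i+1}:\n{s}" for i, s in enumerate(samples))
    prompt = (
        f"Question: {question}\n\n{context}\n\n"
        "Drawing on the responses above, write a single improved answer."
    )
    return generate(prompt, temperature=0.2)
```

Because the final step is ordinary generation rather than majority voting, this applies to open-ended outputs (code, conversation) where no single verifiable token exists to vote on.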
Unlocking Efficient, Scalable, and Continual Knowledge Editing with Basis-Level Representation Fine-Tuning
Liu, Tianci, Li, Ruirui, Qi, Yunzhe, Liu, Hui, Tang, Xianfeng, Zheng, Tianqi, Yin, Qingyu, Cheng, Monica Xiao, Huan, Jun, Wang, Haoyu, Gao, Jing
Large language models (LLMs) have achieved remarkable performance on various natural language tasks. However, they are trained on static corpora, and their knowledge can quickly become outdated in a fast-changing world. This motivates the development of knowledge editing methods designed to update certain knowledge in LLMs without changing unrelated knowledge. To make selective edits, previous efforts often sought to update a small number of parameters in specific layer(s) of an LLM. Nonetheless, in challenging scenarios they still fall short of simultaneously making successful edits and preserving knowledge irrelevant to the updates, resulting in a notable editing-locality trade-off. In this work, we ask whether this trade-off arises because parameter-based updates have a global effect, i.e., edited parameters affect all inputs indiscriminately. In light of this, we explore the feasibility of representation fine-tuning, which applies a linear update to a few representations in a learned subspace, for knowledge editing. While this approach is effective for enhancing an LLM's general ability, as demonstrated in prior work, we theoretically show that its linear update imposes a tension between editing and locality. We therefore propose BaFT to break the linearity. BaFT computes a weight for each basis that spans a dimension of the subspace, based on the input representation. This input-dependent weighting mechanism allows BaFT to manage different types of knowledge adaptively, thereby achieving a better editing-locality trade-off. Experiments on three LLMs with five editing benchmarks in diverse scenarios show the superiority of our method.
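The sketch below is one reading of the mechanism described above: a linear edit inside a learned low-rank subspace, gated by per-basis weights computed from the input representation. It is illustrative PyTorch, not the authors' implementation, and all dimensions are hypothetical.

```python
import torch
import torch.nn as nn

class BasisLevelEdit(nn.Module):
    def __init__(self, d_model: int = 768, rank: int = 8):
        super().__init__()
        self.R = nn.Parameter(torch.randn(rank, d_model) * 0.02)  # subspace basis
        self.W = nn.Linear(d_model, rank)   # target coordinates in the subspace
        self.gate = nn.Linear(d_model, rank)  # per-basis weights from the input

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        w = torch.sigmoid(self.gate(h))     # input-dependent basis weights
        delta = self.W(h) - h @ self.R.T    # edit expressed in subspace coordinates
        return h + (w * delta) @ self.R     # weighted edit mapped back to d_model

h = torch.randn(2, 10, 768)                 # (batch, seq, hidden)
print(BasisLevelEdit()(h).shape)            # torch.Size([2, 10, 768])
```

With `w` fixed to all-ones this collapses to a purely linear subspace update; the gate is what lets different inputs receive different edits, which is the intuition behind the improved editing-locality trade-off.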
Poster: Long PHP webshell files detection based on sliding window attention
Wang, Zhiqiang, Wang, Haoyu, Hao, Lu
A webshell is a type of backdoor, and web applications are widely exposed to webshell injection attacks, so studying webshell detection techniques is important. In this study, we propose a webshell detection method. We first convert PHP source code to opcodes and then extract Opcode Double-Tuples (ODTs). Next, we combine CodeBERT and FastText models for feature representation and classification. To address the difficulty deep learning methods have in detecting long webshell files, we introduce a sliding window attention mechanism, which effectively captures malicious behavior within long files. Experimental results show that our method achieves high accuracy in webshell detection, addressing the struggle of traditional methods with new webshell variants and anti-detection techniques.
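The sliding-window step can be illustrated as follows: a long opcode sequence is split into overlapping windows that fit an encoder's input limit, so malicious behavior deep in the file is still seen. Window and stride sizes here are illustrative, not the paper's settings.

```python
def sliding_windows(tokens: list[str], size: int = 512, stride: int = 256):
    """Yield overlapping windows; the overlap preserves cross-boundary context."""
    if len(tokens) <= size:
        yield tokens
        return
    for start in range(0, len(tokens) - size + stride, stride):
        yield tokens[start:start + size]

opcodes = [f"OP_{i}" for i in range(1300)]  # stand-in opcode sequence
windows = list(sliding_windows(opcodes))
print(len(windows), [len(w) for w in windows])
# Each window would then be encoded (e.g., with CodeBERT) and the window
# embeddings combined by an attention layer before classification.
```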
Bridging Information Gaps with Comprehensive Answers: Improving the Diversity and Informativeness of Follow-Up Questions
Liu, Zhe, Kang, Taekyu, Wang, Haoyu, Alavi, Seyed Hossein, Shwartz, Vered
Effective conversational systems are expected to dynamically generate contextual follow-up questions to elicit new information while maintaining the conversation flow. While humans excel at asking diverse and informative questions by intuitively assessing both obtained and missing information, existing models often fall short of human performance on this task. To mitigate this, we propose a method that generates diverse and informative questions by targeting unanswered information using a hypothetical LLM-generated "comprehensive answer". Our method is applied to augment an existing follow-up questions dataset. The experimental results demonstrate that language models fine-tuned on the augmented datasets produce follow-up questions of significantly higher quality and diversity. This promising approach could be adopted in future work to augment information-seeking dialogues, reducing ambiguities and improving the accuracy of LLM answers.
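A minimal sketch of the augmentation idea, with `generate` standing in for any LLM call and hypothetical prompt wording:

```python
def followup_question(generate, question: str, answer: str) -> str:
    # Step 1: elicit a hypothetical "comprehensive answer" to the question.
    comprehensive = generate(
        f"Give the most complete possible answer to: {question}"
    )
    # Step 2: target the gap between the actual answer and the
    # comprehensive one with a single follow-up question.
    return generate(
        f"Question: {question}\nAnswer given: {answer}\n"
        f"A more comprehensive answer would be: {comprehensive}\n"
        "Ask one follow-up question that elicits information present in the "
        "comprehensive answer but missing from the answer given."
    )
```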
Unshackling Context Length: An Efficient Selective Attention Approach through Query-Key Compression
Wang, Haoyu, Teng, Tong, Guo, Tianyu, Xiao, An, Tang, Duyu, Chen, Hanting, Wang, Yunhe
Handling long-context sequences efficiently remains a significant challenge in large language models (LLMs). Existing methods for token selection in sequence extrapolation either employ a permanent eviction strategy or select tokens by chunk, which may lead to the loss of critical information. We propose Efficient Selective Attention (ESA), a novel approach that extends context length by efficiently selecting the most critical tokens at the token level to compute attention. ESA reduces the computational complexity of token selection by compressing query and key vectors into lower-dimensional representations. We evaluate ESA on long-sequence benchmarks with maximum lengths up to 256k using open-source LLMs with context lengths of 8k and 32k. ESA outperforms other selective attention methods, especially in tasks requiring the retrieval of multiple pieces of information, and achieves performance comparable to full-attention extrapolation methods across various tasks, with superior results on certain tasks.
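The following sketch shows one way to realize token-level selection with query-key compression as we read the abstract: score all cached tokens cheaply in a compressed space, then run exact attention over only the top-k. The projections and dimensions are illustrative, not the paper's.

```python
import torch

d, d_low, k = 128, 16, 8
proj_q = torch.randn(d, d_low) / d ** 0.5   # learned projections in practice
proj_k = torch.randn(d, d_low) / d ** 0.5

q = torch.randn(1, d)                        # current query
K = torch.randn(1024, d)                     # long cache of keys
V = torch.randn(1024, d)                     # matching values

# Cheap selection in the compressed space (O(n * d_low), not O(n * d)).
scores_low = (q @ proj_q) @ (K @ proj_k).T   # (1, 1024)
top = scores_low.topk(k, dim=-1).indices.squeeze(0)

# Exact attention only over the k selected tokens.
attn = torch.softmax(q @ K[top].T / d ** 0.5, dim=-1)
out = attn @ V[top]
print(out.shape)                             # torch.Size([1, 128])
```

Because selection happens per token rather than per chunk, and nothing is permanently evicted, critical tokens scattered across the cache remain reachable, which matches the multi-retrieval tasks where the abstract reports the largest gains.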
Detecting LLM Fact-conflicting Hallucinations Enhanced by Temporal-logic-based Reasoning
Li, Ningke, Song, Yahui, Wang, Kailong, Li, Yuekang, Shi, Ling, Liu, Yi, Wang, Haoyu
Large language models (LLMs) face the challenge of hallucinations: outputs that seem coherent but are actually incorrect. A particularly damaging type is fact-conflicting hallucination (FCH), where generated content contradicts established facts. Addressing FCH presents three main challenges: 1) automatically constructing and maintaining large-scale benchmark datasets is difficult and resource-intensive; 2) generating complex and efficient test cases that the LLM has not been trained on, especially those involving intricate temporal features, is challenging yet crucial for eliciting hallucinations; and 3) validating the reasoning behind LLM outputs is inherently difficult, particularly with complex logical relationships, as it requires transparency in the model's decision-making process. To address these challenges, we leverage temporal-logic-based reasoning to construct test cases that embed intricate temporal relations. LLMs are tested using these cases through template-based prompts, which require them to generate both answers and reasoning steps. To validate the reasoning, we propose two semantic-aware oracles that compare the semantic structure of LLM outputs to the ground truths. Key insights reveal that LLMs struggle with out-of-distribution knowledge and logical reasoning. These findings highlight the importance of continued efforts to detect and mitigate hallucinations in LLMs.

Large Language Models (LLMs) have revolutionized language processing, demonstrating impressive text generation and comprehension capabilities with diverse applications. However, despite their growing use, LLMs face significant security and privacy challenges [1], [2], [3], [4], [5], which affect their overall effectiveness and reliability. A critical issue is the phenomenon of hallucination, where LLMs generate outputs that are coherent but factually incorrect or irrelevant. This tendency to produce misleading information compromises the safety and usability of LLM-based systems. This paper focuses on fact-conflicting hallucination (FCH), the most prominent form of hallucination in LLMs. FCH occurs when LLMs generate content that directly contradicts established facts. For instance, as illustrated in Figure 1, an LLM incorrectly asserts that "Haruki Murakami won the Nobel Prize in Literature in 2016", whereas the fact is that "Haruki Murakami has not won the Nobel Prize, though he has received numerous other literary awards". Such inaccuracies can lead to significant user confusion and undermine the trust and reliability that are crucial for LLM applications.

N. Li, K. Wang, and H. Wang are with Huazhong University of Science and Technology, China. Y. Song is with the National University of Singapore, Singapore. Y. Li is with the University of New South Wales, Australia.
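To illustrate how temporal logic can drive test-case construction, the sketch below derives a before/after relation from dated facts and uses the derived relation as the ground truth an oracle would check an LLM's answer against; the facts, dates, and wording are illustrative, not from the paper's benchmark.

```python
from datetime import date

# Stand-in knowledge base of dated facts (illustrative dates).
facts = {
    "Kafka on the Shore published": date(2002, 9, 12),
    "1Q84 published": date(2009, 5, 29),
}

def temporal_case(e1: str, e2: str) -> tuple[str, bool]:
    """Build a yes/no question whose ground truth follows from the dates."""
    truth = facts[e1] < facts[e2]  # temporal relation: e1 before e2
    return f"Did '{e1}' happen before '{e2}'?", truth

question, ground_truth = temporal_case(
    "Kafka on the Shore published", "1Q84 published"
)
print(question, ground_truth)  # an oracle compares the LLM's answer to this
```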