tool description
MPMA: Preference Manipulation Attack Against Model Context Protocol
Wang, Zihan, Zhang, Rui, Liu, Yu, Fan, Wenshu, Jiang, Wenbo, Zhao, Qingchuan, Li, Hongwei, Xu, Guowen
Model Context Protocol (MCP) standardizes interface mapping for large language models (LLMs) to access external data and tools, which revolutionizes the paradigm of tool selection and facilitates the rapid expansion of the LLM agent tool ecosystem. However, as the MCP is increasingly adopted, third-party customized versions of the MCP server expose potential security vulnerabilities. In this paper, we first introduce a novel security threat, which we term the MCP Preference Manipulation Attack (MPMA). An attacker deploys a customized MCP server to manipulate LLMs, causing them to prioritize it over other competing MCP servers. This can result in economic benefits for attackers, such as revenue from paid MCP services or advertising income generated from free servers. To achieve MPMA, we first design a Direct Preference Manipulation Attack (DPMA) that achieves significant effectiveness by inserting the manipulative word and phrases into the tool name and description. However, such a direct modification is obvious to users and lacks stealthiness. To address these limitations, we further propose Genetic-based Advertising Preference Manipulation Attack (GAPMA). GAPMA employs four commonly used strategies to initialize descriptions and integrates a Genetic Algorithm (GA) to enhance stealthiness. The experiment results demonstrate that GAPMA balances high effectiveness and stealthiness. Our study reveals a critical vulnerability of the MCP in open ecosystems, highlighting an urgent need for robust defense mechanisms to ensure the fairness of the MCP ecosystem.
Delegated Authorization for Agents Constrained to Semantic Task-to-Scope Matching
Helou, Majed El, Troiani, Chiara, Ryder, Benjamin, Diaconu, Jean, Muyal, Hervé, Yannuzzi, Marcelo
Authorizing Large Language Model driven agents to dynamically invoke tools and access protected resources introduces significant risks, since current methods for delegating authorization grant overly broad permissions and give access to tools allowing agents to operate beyond the intended task scope. We introduce and assess a delegated authorization model enabling authorization servers to semantically inspect access requests to protected resources, and issue access tokens constrained to the minimal set of scopes necessary for the agents' assigned tasks. Given the unavailability of datasets centered on delegated authorization flows, particularly including both semantically appropriate and inappropriate scope requests for a given task, we introduce ASTRA, a dataset and data generation pipeline for benchmarking semantic matching between tasks and scopes. Our experiments show both the potential and current limitations of model-based matching, particularly as the number of scopes needed for task completion increases. Our results highlight the need for further research into semantic matching techniques enabling intent-aware authorization for multi-agent and tool-augmented applications, including fine-grained control, such as Task-Based Access Control (TBAC).
- Africa > Tanzania > Zanzibar (0.04)
- Africa > Tanzania > Mjini Magharibi Region > Zanzibar (0.04)
- Europe > Switzerland (0.04)
ToolDreamer: Instilling LLM Reasoning Into Tool Retrievers
Sengupta, Saptarshi, Zhou, Zhengyu, Araki, Jun, Wang, Xingbo, Wang, Bingqing, Wang, Suhang, Feng, Zhe
Tool calling has become increasingly popular for Large Language Models (LLMs). However, for large tool sets, the resulting tokens would exceed the LLM's context window limit, making it impossible to include every tool. Hence, an external retriever is used to provide LLMs with the most relevant tools for a query. Existing retrieval models rank tools based on the similarity between a user query and a tool description (TD). This leads to suboptimal retrieval as user requests are often poorly aligned with the language of TD. To remedy the issue, we propose ToolDreamer, a framework to condition retriever models to fetch tools based on hypothetical (synthetic) TD generated using an LLM, i.e., description of tools that the LLM feels will be potentially useful for the query. The framework enables a more natural alignment between queries and tools within the language space of TD's. We apply ToolDreamer on the ToolRet dataset and show that our method improves the performance of sparse and dense retrievers with and without training, thus showcasing its flexibility. Through our proposed framework, our aim is to offload a portion of the reasoning burden to the retriever so that the LLM may effectively handle a large collection of tools without inundating its context window.
- Europe > Austria > Vienna (0.14)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- (8 more...)
Toward Understanding Security Issues in the Model Context Protocol Ecosystem
The Model Context Protocol (MCP) is an emerging open standard that enables AI-powered applications to interact with external tools through structured metadata. A rapidly growing ecosystem has formed around MCP, including a wide range of MCP hosts (i.e., Cursor, Windsurf, Claude Desktop, and Cline), MCP registries (i.e., mcp.so, MCP Market, MCP Store, Pulse MCP, Smithery, and npm), and thousands of community-contributed MCP servers. Although the MCP ecosystem is gaining traction, there has been little systematic study of its architecture and associated security risks. In this paper, we present the first comprehensive security analysis of the MCP ecosystem. We decompose MCP ecosystem into three core components: hosts, registries, and servers, and study the interactions and trust relationships among them. Users search for servers on registries and configure them in the host, which translates LLM-generated output into external tool invocations provided by the servers and executes them. Our qualitative analysis reveals that hosts lack output verification mechanisms for LLM-generated outputs, enabling malicious servers to manipulate model behavior and induce a variety of security threats, including but not limited to sensitive data exfiltration. We uncover a wide range of vulnerabilities that enable attackers to hijack servers, due to the lack of a vetted server submission process in registries. To support our analysis, we collect and analyze a dataset of 67,057 servers from six public registries. Our quantitative analysis demonstrates that a substantial number of servers can be hijacked by attackers. Finally, we propose practical defense strategies for MCP hosts, registries, and users. We responsibly disclosed our findings to affected hosts and registries.
- North America > United States > California > San Diego County > San Diego (0.04)
- Europe > Portugal > Coimbra > Coimbra (0.04)
- Asia (0.04)
TRAJECT-Bench:A Trajectory-Aware Benchmark for Evaluating Agentic Tool Use
He, Pengfei, Dai, Zhenwei, He, Bing, Liu, Hui, Tang, Xianfeng, Lu, Hanqing, Li, Juanhui, Ding, Jiayuan, Mukherjee, Subhabrata, Wang, Suhang, Xing, Yue, Tang, Jiliang, Dumoulin, Benoit
Large language model (LLM)-based agents increasingly rely on tool use to complete real-world tasks. While existing works evaluate the LLMs' tool use capability, they largely focus on the final answers yet overlook the detailed tool usage trajectory, i.e., whether tools are selected, parameterized, and ordered correctly. We introduce TRAJECT-Bench, a trajectory-aware benchmark to comprehensively evaluate LLMs' tool use capability through diverse tasks with fine-grained evaluation metrics. TRAJECT-Bench pairs high-fidelity, executable tools across practical domains with tasks grounded in production-style APIs, and synthesizes trajectories that vary in breadth (parallel calls) and depth (interdependent chains). Besides final accuracy, TRAJECT-Bench also reports trajectory-level diagnostics, including tool selection and argument correctness, and dependency/order satisfaction. Analyses reveal failure modes such as similar tool confusion and parameter-blind selection, and scaling behavior with tool diversity and trajectory length where the bottleneck of transiting from short to mid-length trajectories is revealed, offering actionable guidance for LLMs' tool use.
- Europe > Austria > Vienna (0.14)
- Asia > Philippines > Luzon > National Capital Region > City of Manila (0.14)
- Europe > France (0.04)
- (35 more...)
- Leisure & Entertainment (1.00)
- Consumer Products & Services > Travel (1.00)
- Media > Music (0.96)
- (2 more...)
ToolTweak: An Attack on Tool Selection in LLM-based Agents
Sneh, Jonathan, Yan, Ruomei, Yu, Jialin, Torr, Philip, Gal, Yarin, Sengupta, Sunando, Sommerlade, Eric, Paren, Alasdair, Bibi, Adel
As LLMs increasingly power agents that interact with external tools, tool use has become an essential mechanism for extending their capabilities. These agents typically select tools from growing databases or marketplaces to solve user tasks, creating implicit competition among tool providers and developers for visibility and usage. In this paper, we show that this selection process harbors a critical vulnerability: by iteratively manipulating tool names and descriptions, adversaries can systematically bias agents toward selecting specific tools, gaining unfair advantage over equally capable alternatives. We present ToolTweak, a lightweight automatic attack that increases selection rates from a baseline of around 20% to as high as 81%, with strong transferability between open-source and closed-source models. Beyond individual tools, we show that such attacks cause distributional shifts in tool usage, revealing risks to fairness, competition, and security in emerging tool ecosystems. To mitigate these risks, we evaluate two defenses: paraphrasing and perplexity filtering, which reduce bias and lead agents to select functionally similar tools more equally. All code will be open-sourced upon acceptance.
- Information Technology > Security & Privacy (0.68)
- Government > Military (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
A Study on the MCP x A2A Framework for Enhancing Interoperability of LLM-based Autonomous Agents
This paper provides an in-depth technical analysis and implementation methodology of the open-source Agent-to-Agent (A2A) protocol developed by Google and the Model Context Protocol (MCP) introduced by Anthropic. While the evolution of LLM-based autonomous agents is rapidly accelerating, efficient interactions among these agents and their integration with external systems remain significant challenges. In modern AI systems, collaboration between autonomous agents and integration with external tools have become essential elements for building practical AI applications. A2A offers a standardized communication method that enables agents developed in heterogeneous environments to collaborate effectively, while MCP provides a structured I/O framework for agents to connect with external tools and resources. Prior studies have focused primarily on the features and applications of either A2A or MCP individually. In contrast, this study takes an integrated approach, exploring how the two protocols can complement each other to address interoperability issues and facilitate efficient collaboration within complex agent ecosystems.
- North America > United States (0.04)
- Asia > South Korea > Seoul > Seoul (0.04)
- Research Report (1.00)
- Workflow (0.69)
- Information Technology > Security & Privacy (1.00)
- Banking & Finance (1.00)
Automatic Red Teaming LLM-based Agents with Model Context Protocol Tools
He, Ping, Li, Changjiang, Zhao, Binbin, Du, Tianyu, Ji, Shouling
Abstract--The remarkable capability of large language models (LLMs) has led to the wide application of LLM-based agents in various domains. T o standardize interactions between LLMbased agents and their environments, model context protocol (MCP) tools have become the de facto standard and are now widely integrated into these agents. However, the incorporation of MCP tools introduces the risk of tool poisoning attacks, which can manipulate the behavior of LLM-based agents. Although previous studies have identified such vulnerabilities, their red teaming approaches have largely remained at the proof-of-concept stage, leaving the automatic and systematic red teaming of LLMbased agents under the MCP tool poisoning paradigm an open question. T o bridge this gap, we propose AutoMalTool, an automated red teaming framework for LLM-based agents by generating malicious MCP tools. Our extensive evaluation shows that AutoMalTool effectively generates malicious MCP tools capable of manipulating the behavior of mainstream LLM-based agents while evading current detection mechanisms, thereby revealing new security risks in these agents. I. Introduction The recent advancements in large language models (LLMs) have facilitated the rapid development of LLM-based agents capable of executing complex tasks across a wide range of domains, e.g., finance [1]-[3], software development [4], [5], scientific research [6], [7], etc. Within these agents, tools play a crucial role in enhancing problem-solving capabilities by enabling interaction with external resources and facilitating actions beyond the language token generation [8]. Nevertheless, tool usage among LLM-based agents remains fragmented due to the diversity of operational environments and varying tool usage patterns. T o address this challenge, the Model Context Protocol (MCP) [9] has been proposed and has emerged as the de facto standard for standardizing interactions between LLM-based agents and external resources. The MCP server delivers context to LLM-based agents, enabling them to access relevant information and tools in a unified manner. Ping He is with the College of Computer Science and T echnology, Zhejiang University (e-mail: gnip@zju.edu.cn). Changjiang Li is with Palo Alto Networks (e-mail: meet.cjli@gmail.com). Shouling Ji is with the College of Computer Science and T echnology, Zhejiang University (e-mail: sji@zju.edu.cn). In a tool poisoning attack, the adversary injects malicious instructions, commonly through prompt injection, into the metadata of MCP tools, such as their descriptions, thereby generating malicious MCP tools. LLM-based agent developers may inadvertently install these malicious packages, thereby altering agent behaviors and resulting in an open-source software supply chain poisoning attack [15].
- Information Technology > Security & Privacy (1.00)
- Banking & Finance > Trading (0.93)
Tool Preferences in Agentic LLMs are Unreliable
Faghih, Kazem, Wang, Wenxiao, Cheng, Yize, Bharti, Siddhant, Sriramanan, Gaurang, Balasubramanian, Sriram, Hosseini, Parsa, Feizi, Soheil
Large language models (LLMs) can now access a wide range of external tools, thanks to the Model Context Protocol (MCP). This greatly expands their abilities as various agents. However, LLMs rely entirely on the text descriptions of tools to decide which ones to use--a process that is surprisingly fragile. In this work, we expose a vulnerability in prevalent tool/function-calling protocols by investigating a series of edits to tool descriptions, some of which can drastically increase a tool's usage from LLMs when competing with alternatives. Through controlled experiments, we show that tools with properly edited descriptions receive over 10 times more usage from GPT-4.1 and Qwen2.5-7B than tools with original descriptions. We further evaluate how various edits to tool descriptions perform when competing directly with one another and how these trends generalize or differ across a broader set of 17 different models. These phenomena, while giving developers a powerful way to promote their tools, underscore the need for a more reliable foundation for agentic LLMs to select and utilize tools and resources. Our code is publicly available at https://github.com/kazemf78/llm-unreliable-tool-preferences.
- North America > United States > Maryland (0.04)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Automated Creation and Enrichment Framework for Improved Invocation of Enterprise APIs as Tools
Agarwal, Prerna, Gupta, Himanshu, Soni, Soujanya, Vallam, Rohith, Sindhgatta, Renuka, Mehta, Sameep
Recent advancements in Large Language Models (LLMs) has lead to the development of agents capable of complex reasoning and interaction with external tools. In enterprise contexts, the effective use of such tools that are often enabled by application programming interfaces (APIs), is hindered by poor documentation, complex input or output schema, and large number of operations. These challenges make tool selection difficult and reduce the accuracy of payload formation by up to 25%. We propose ACE, an automated tool creation and enrichment framework that transforms enterprise APIs into LLM-compatible tools. ACE, (i) generates enriched tool specifications with parameter descriptions and examples to improve selection and invocation accuracy, and (ii) incorporates a dynamic shortlisting mechanism that filters relevant tools at runtime, reducing prompt complexity while maintaining scalability. We validate our framework on both proprietary and open-source APIs and demonstrate its integration with agentic frameworks. To the best of our knowledge, ACE is the first end-to-end framework that automates the creation, enrichment, and dynamic selection of enterprise API tools for LLM agents.
- North America > United States (0.04)
- Europe > Norway > Norwegian Sea (0.04)
- Asia > Singapore (0.04)
- (2 more...)