arxiv:2504.03767

MCP Safety Audit: LLMs with the Model Context Protocol Allow Major Security Exploits

Published on Apr 2, 2025

Submitted by John Halloran on Apr 15, 2025

Authors:

John Halloran

Abstract

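The Model Context Protocol (MCP) has security vulnerabilities that allow attacks such as malicious code execution, remote access control, and credential theft. The paper introduces an agentic tool, MCPSafetyScanner, that audits the security of an MCP server so that detected vulnerabilities can be addressed before deployment.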

AI-generated summary

To reduce development overhead and enable seamless integration between potential components comprising any given generative AI application, the Model Context Protocol (MCP) (Anthropic, 2024) has recently been released and subsequently widely adopted. The MCP is an open protocol that standardizes API calls to large language models (LLMs), data sources, and agentic tools. By connecting multiple MCP servers, each defined with a set of tools, resources, and prompts, users are able to define automated workflows fully driven by LLMs. However, we show that the current MCP design carries a wide range of security risks for end users. In particular, we demonstrate that industry-leading LLMs may be coerced into using MCP tools to compromise an AI developer's system through various attacks, such as malicious code execution, remote access control, and credential theft. To proactively mitigate these and related attacks, we introduce a safety auditing tool, MCPSafetyScanner, the first agentic tool to assess the security of an arbitrary MCP server. MCPSafetyScanner uses several agents to (a) automatically determine adversarial samples given an MCP server's tools and resources; (b) search for related vulnerabilities and remediations based on those samples; and (c) generate a security report detailing all findings. Our work highlights serious security issues with general-purpose agentic workflows while also providing a proactive tool to audit MCP server safety and address detected vulnerabilities before deployment. The described MCP server auditing tool, MCPSafetyScanner, is freely available at: https://github.com/johnhalloran321/mcpSafetyScanner


Community

johnhalloran

Paper author Paper submitter Apr 15, 2025

Abstract

To reduce development overhead and enable seamless integration between potential components comprising any given generative AI application, the Model Context Protocol (MCP) (Anthropic, 2025d) has recently been released and, subsequently, widely adopted. The MCP is an open protocol that standardizes API calls to large language models (LLMs), data sources, and agentic tools. Thus, by connecting multiple MCP servers, each defined with a set of tools, resources, and prompts, users are able to define automated workflows fully driven by LLMs. However, we show that the current MCP design carries a wide range of security risks for end users. In particular, we show that industry-leading LLMs may be coerced to use MCP tools and compromise an AI developer’s system through a wide range of attacks, e.g., malicious code execution, remote access control, and credential theft. To proactively mitigate the demonstrated (and related) attacks, we introduce a safety auditing tool, MCPSafetyScanner, the first such agentic tool to assess the security of an arbitrary MCP server. MCPSafetyScanner uses several agents to: (a) automatically determine adversarial samples given an MCP server’s tools and resources, (b) search for related vulnerabilities and remediations given such samples, and (c) generate a security report detailing all findings. Our work thus sheds light on serious security issues with general-purpose agentic workflows, while also providing a proactive tool to audit the safety of MCP servers and address detected vulnerabilities prior to deployment.

The described MCP server auditing tool, MCPSafetyScanner, is freely available at: https://github.com/johnhalloran321/mcpSafetyScanner.
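
The three stages named in the abstract, (a) adversarial samples, (b) vulnerability and remediation lookup, and (c) report generation, can be pictured with a small sketch. This is only an illustration of the data flow and not the MCPSafetyScanner implementation: the real tool uses LLM agents, whereas the sketch below substitutes a keyword table. `ToolSpec`, `RISK_HINTS`, the keyword-to-risk mapping, and all function names are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class ToolSpec:
    """Hypothetical description of one tool exposed by an MCP server."""
    name: str
    description: str


# Assumed keyword -> (risk, remediation) table. It stands in for the agent
# reasoning and vulnerability search described in the abstract.
RISK_HINTS = {
    "write": ("malicious code execution",
              "Restrict writable paths and require user confirmation."),
    "read": ("credential theft",
             "Deny access to key files and secrets directories."),
    "exec": ("remote access control",
             "Sandbox command execution and allowlist commands."),
}


def propose_adversarial_samples(tools):
    """Stage (a): flag tools whose name or description matches a risk keyword."""
    samples = []
    for tool in tools:
        text = f"{tool.name} {tool.description}".lower()
        for keyword, (risk, _) in RISK_HINTS.items():
            if keyword in text:
                samples.append({"tool": tool.name, "keyword": keyword, "risk": risk})
    return samples


def lookup_remediations(samples):
    """Stage (b): attach a remediation to each flagged sample."""
    return [dict(s, remediation=RISK_HINTS[s["keyword"]][1]) for s in samples]


def build_report(findings):
    """Stage (c): render the findings as a plain-text report."""
    lines = [f"Findings: {len(findings)}"]
    for f in findings:
        lines.append(f"- {f['tool']}: {f['risk']} -> {f['remediation']}")
    return "\n".join(lines)


def audit(tools):
    """Run the three stages in order and return the report text."""
    return build_report(lookup_remediations(propose_adversarial_samples(tools)))
```

Calling `audit([ToolSpec("write_file", "Write text to a path")])` would return a short report with one finding for `write_file`. In a real audit each stage would be far richer; the point here is only the ordering of the stages.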

armorerlabs

14 days ago

This is a useful framing for MCP security. One runtime angle I would add alongside server auditing is a per-invocation gate at the client/agent boundary.

Even if an MCP server passes an audit, the dangerous moment is often the specific tool call the agent is about to execute: tool name, arguments, source text that influenced it, and whether data is about to leave the trust boundary. Scanning that step with context catches a different class of failures than static server inspection.

We are experimenting with this in Armorer Guard as a fast local pre-tool-call classifier: retrieved/tool-output text is treated differently from actual tool-call args and outbound payloads. The goal is to make it cheap enough to run before every MCP/tool invocation, not only during periodic audits: https://huggingface.co/armorer-labs/armorer-guard-semantic-classifier
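
A minimal sketch of a per-invocation gate of the kind described above, assuming simple rules in place of a trained classifier. It is not Armorer Guard's implementation. `ToolCall`, `gate`, `outbound_hosts`, the phrase list, and the host allowlist are hypothetical names chosen for illustration. The sketch checks the two signals mentioned in the comment separately: whether the call's arguments point outside a trusted set of hosts, and whether the text that influenced the call looks like an injected instruction.

```python
import re
from dataclasses import dataclass
from urllib.parse import urlparse

# Matches http(s) URLs embedded in argument values.
URL_PATTERN = re.compile(r"https?://[^\s\"']+")

# Assumed examples of instruction-like phrases in retrieved or tool-output text.
SUSPICIOUS_PHRASES = ("ignore previous instructions", "send the contents")


@dataclass
class ToolCall:
    """Hypothetical record of the tool call an agent is about to execute."""
    tool_name: str
    args: dict
    source_text: str = ""  # retrieved/tool-output text that influenced the call


def outbound_hosts(args):
    """Collect hostnames of any URLs found in the call's argument values."""
    hosts = set()
    for value in args.values():
        for url in URL_PATTERN.findall(str(value)):
            host = urlparse(url).hostname
            if host:
                hosts.add(host)
    return hosts


def gate(call, trusted_hosts):
    """Return (allowed, reasons) for a single pending tool call."""
    reasons = []
    # Arguments and outbound payloads: is data about to leave the trust boundary?
    untrusted = outbound_hosts(call.args) - set(trusted_hosts)
    if untrusted:
        reasons.append("untrusted destination: " + ", ".join(sorted(untrusted)))
    # Source text is judged separately from the arguments themselves.
    lowered = call.source_text.lower()
    for phrase in SUSPICIOUS_PHRASES:
        if phrase in lowered:
            reasons.append("instruction-like source text: " + phrase)
    return (not reasons, reasons)
```

A client could call `gate(...)` immediately before dispatching each MCP tool invocation and refuse, or ask the user to confirm, whenever `allowed` is false.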
