MCP Security

Definition

MCP Security covers the controls required to safely operate AI agents that use the Model Context Protocol (MCP) — Anthropic’s open standard for giving LLM-based agents access to external tools, data sources, and services. As MCP adoption scales across enterprise estates, securing MCP servers and the traffic flowing through them becomes a distinct security domain.

Aliases / Variants

MCP proxy security
Agent integration security
A2A (Agent2Agent) security — the complementary Google-origin protocol for agent-to-agent communication

Threat Surface

MCP introduces several novel attack surfaces beyond traditional API security:

Threat	Description
Rogue MCP server	A malicious or compromised MCP server that injects instructions or data into an agent’s context, causing goal manipulation or data exfiltration
Agent Account Takeover (AATO)	An attacker hijacks an agent’s session or identity to use it as a proxy for malicious actions — analogous to ATTO (Account Takeover) for human accounts
Prompt injection via MCP	MCP server returns a response containing injected instructions that override the agent’s intended behavior
MCP scope creep	Agents granted overly broad MCP permissions over time; no governance equivalent to OAuth scope auditing
Data exfiltration through MCP	An agent (or compromised agent) sends sensitive data out via MCP calls to an attacker-controlled server

Required Controls

1. MCP Proxy Layer

An MCP proxy sits between the agent and MCP servers, providing:

Traffic inspection for AI-generated and AI-bound payloads (semantic context, not just protocol)
Allow-listing of approved MCP servers and endpoints
Policy enforcement (block calls to unapproved servers)
Logging for forensics and compliance

2. Behavioral Monitoring

Because MCP traffic is semantically rich (natural language + structured calls), monitoring requires agent-specific intelligence overlaid on network/API monitoring. Standard WAF or NGFW rules are insufficient.

3. Threat Intelligence for Agent Traffic

As internet traffic increasingly originates from bots and agents, identifying malicious-agent traffic requires:

Enrichment with agent/bot-specific threat intelligence feeds
Detection of AATO patterns (session anomalies, unusual call patterns)
Filtering of traffic from known-bad MCP servers

4. MCP Governance at Scale

Enterprise estates will accumulate many MCP server registrations. Required capabilities:

Inventory and discovery of all MCP servers in use
Per-server allow/deny policies
Periodic permission reviews (analogous to OAuth token auditing)
Incident response playbooks for rogue MCP server scenarios

Relationship to Existing Controls

MCP security is an extension of — but not a replacement for — API security, network monitoring, and AI firewall capabilities. The key differentiation is understanding agent intent: whether a particular MCP call is within the normal operating envelope of the agent, or represents anomalous or malicious behavior.

Production Detection: Sensor over Gateway

The proxy-layer control above is the gateway pattern. A counter-position has now shipped at scale: the ADR system deployed at Uber (ten months, 7,200+ hosts) explicitly evaluated and rejected an LLM/MCP gateway for observability, on the grounds that a gateway requires host changes, breaks on streaming responses, and omits environmental context. ADR instead reconstructs MCP activity from an endpoint sensor that parses the local caches of coding agents, capturing the full prompt → reasoning → tool-call → outcome chain.¹ Its two-tier detector then reasons over MCP context — querying tool source code, threat intelligence, and policy over dedicated MCP providers — to close the agent-intent gap this page names. The architectural fork is treated in depth in Inline Gateway vs Runtime Instrumentation; the practical reading is that a gateway is one valid PEP but not the only path to MCP observability.

OWASP Agentic AI Threats and Mitigations names this surface as Insecure Inter-Agent Protocol Abuse (T16): flaws in MCP and A2A such as consent-flow manipulation, MCP response injection, and tool-description exploitation. It is the first OWASP document to treat protocol-level abuse of MCP and A2A as a distinct threat rather than a general prompt-injection variant, and its tool-execution and authentication playbooks call for message authentication on inter-agent channels and signed agent cards.

Standards gap

As of Securing the Autonomous Future (Oct 2025), no established standard or framework specifically governed MCP security. OWASP’s Agentic AI guidance covers the general threat class; MCP-specific controls were being built by startups ahead of incumbent tools. By 2026 the gap is narrowing from the operations side rather than the standards side — production detection frameworks (ADR) and an MCP-native benchmark (ADR-Bench) now exist, but no normative MCP-security standard has landed.

Where It Appears

Securing the Autonomous Future: Trust, Safety, and Reliability of Agentic AI — primary source; introduces MCP proxy, AATO, and governance requirements
ADR — Agentic Detection for Enterprise AI — production MCP detection-and-response at Uber; sensor-over-gateway observability; ADR-Bench MCP-native benchmark
AI Agent Identity Architecture — agent identity is exercised at the MCP boundary
Agent Observability — MCP traffic is part of the full-stack observability requirement

AI Agent Identity Architecture — agents authenticate at MCP boundaries; SPIFFE/NHI applies
Agent Observability — MCP call logs feed behavioral monitoring
Non-Human Identity (NHI) — MCP access credentials must be governed as NHIs
AgentShield — open-source static scanner with 23 MCP-specific rules (high-risk server types, npx -y typosquat surface, hardcoded secrets in env config, remote SSE/HTTP transports, shell metacharacters in args, sensitive-file args, 0.0.0.0 binding, missing timeouts, autoApprove) and a --supply-chain[-online] provenance mode; provenance-aware runtimeConfidence separates mcp.json / .claude/mcp.json / .claude.json (active runtime) from mcp-configs/ / config/mcp/ (template-example catalogs)

Sources

MCP Security

§3.1 Observability: The ADR Sensor, arXiv:2605.17380: the rejected LLM/MCP gateway alternative (host changes, streaming incompatibility, partial context) and the cache-parsing sensor that reconstructs the full causal chain. ↩

Enterprise Security in the Agentic AI Era

Explorer

MCP Security

MCP Security

Definition

Aliases / Variants

Threat Surface

Required Controls

1. MCP Proxy Layer

2. Behavioral Monitoring

3. Threat Intelligence for Agent Traffic

4. MCP Governance at Scale

Relationship to Existing Controls

Production Detection: Sensor over Gateway

Where It Appears

Sources

Graph View

Table of Contents

Backlinks

Enterprise Security in the Agentic AI Era

Explorer

MCP Security

MCP Security

Definition

Aliases / Variants

Threat Surface

Required Controls

1. MCP Proxy Layer

2. Behavioral Monitoring

3. Threat Intelligence for Agent Traffic

4. MCP Governance at Scale

Relationship to Existing Controls

Production Detection: Sensor over Gateway

Where It Appears

Related Concepts

Sources

Footnotes

Graph View

Table of Contents

Backlinks