MCP Security
Definition
MCP Security covers the controls required to safely operate AI agents that use the Model Context Protocol (MCP) — Anthropic’s open standard for giving LLM-based agents access to external tools, data sources, and services. As MCP adoption scales across enterprise estates, securing MCP servers and the traffic flowing through them becomes a distinct security domain.
Aliases / Variants
- MCP proxy security
- Agent integration security
- A2A (Agent2Agent) security — the complementary Google-origin protocol for agent-to-agent communication
Threat Surface
MCP introduces several novel attack surfaces beyond traditional API security:
| Threat | Description |
|---|---|
| Rogue MCP server | A malicious or compromised MCP server that injects instructions or data into an agent’s context, causing goal manipulation or data exfiltration |
| Agent Account Takeover (AATO) | An attacker hijacks an agent’s session or identity and uses the agent as a proxy for malicious actions; analogous to account takeover (ATO) for human accounts |
| Prompt injection via MCP | MCP server returns a response containing injected instructions that override the agent’s intended behavior |
| MCP scope creep | Agents accumulate overly broad MCP permissions over time, with no governance process equivalent to OAuth scope auditing |
| Data exfiltration through MCP | A manipulated or compromised agent sends sensitive data out via MCP calls to an attacker-controlled server |
Required Controls
1. MCP Proxy Layer
An MCP proxy sits between the agent and MCP servers, providing:
- Traffic inspection for AI-generated and AI-bound payloads (semantic context, not just protocol)
- Allow-listing of approved MCP servers and endpoints
- Policy enforcement (block calls to unapproved servers)
- Logging for forensics and compliance
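The allow-listing, policy-enforcement, and logging functions listed above can be illustrated with a minimal sketch. Everything named here (`ALLOWED_SERVERS`, `check_call`, the hostnames, and the tool names) is hypothetical and chosen for illustration only; a real proxy would also inspect payload content, which this sketch does not attempt.

```python
import logging
from dataclasses import dataclass
from urllib.parse import urlparse

logger = logging.getLogger("mcp_proxy")

# Hypothetical allow-list: approved MCP server hosts mapped to the tools
# that agents are permitted to call on each of them.
ALLOWED_SERVERS = {
    "mcp.internal.example.com": {"search_docs", "read_ticket"},
    "files.internal.example.com": {"read_file"},
}


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str


def check_call(agent_id: str, server_url: str, tool: str) -> Decision:
    """Decide whether an outbound MCP tool call may pass, and log the decision."""
    host = urlparse(server_url).hostname or ""
    allowed_tools = ALLOWED_SERVERS.get(host)
    if allowed_tools is None:
        decision = Decision(False, f"server not on allow-list: {host or server_url}")
    elif tool not in allowed_tools:
        decision = Decision(False, f"tool not approved for {host}: {tool}")
    else:
        decision = Decision(True, "ok")
    # Every decision is logged, allowed or not, so the record supports forensics.
    logger.info(
        "agent=%s server=%s tool=%s allowed=%s reason=%s",
        agent_id, host, tool, decision.allowed, decision.reason,
    )
    return decision
```

The sketch denies by default: a server that is absent from the allow-list is blocked without needing an explicit deny rule.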
2. Behavioral Monitoring
Because MCP traffic is semantically rich (natural language plus structured calls), monitoring requires agent-specific intelligence overlaid on network and API monitoring. Standard WAF or NGFW rules, which match on protocol and signature patterns, are insufficient on their own.
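One simple form of agent-specific monitoring is to compare an agent's recent calls against a baseline of calls it normally makes. The sketch below is an assumption-laden illustration, not a method taken from the source: `build_baseline`, `flag_anomalies`, and the `rate_factor` threshold are hypothetical, and the baseline and the observation window are assumed to cover comparable time spans.

```python
from collections import Counter


def build_baseline(calls):
    """Count (server, tool) pairs observed during a trusted reference period."""
    return Counter(calls)


def flag_anomalies(baseline, window_calls, rate_factor=3.0):
    """Return (call, reason) pairs for calls outside the agent's usual envelope.

    A call is flagged if it never appeared in the baseline, or if its volume
    in the window exceeds rate_factor times its baseline volume.
    """
    findings = []
    observed = Counter(window_calls)
    for call, count in observed.items():
        expected = baseline.get(call, 0)
        if expected == 0:
            findings.append((call, "never seen in baseline"))
        elif count > rate_factor * expected:
            findings.append(
                (call, f"volume {count} exceeds {rate_factor}x baseline {expected}")
            )
    return findings
```

A count-based profile like this ignores the natural-language content of calls, so it would complement semantic inspection rather than replace it.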
3. Threat Intelligence for Agent Traffic
As internet traffic increasingly originates from bots and agents, identifying malicious-agent traffic requires:
- Enrichment with agent/bot-specific threat intelligence feeds
- Detection of AATO patterns (session anomalies, unusual call patterns)
- Filtering of traffic from known-bad MCP servers
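As a rough illustration of the enrichment and detection steps above, the following sketch checks MCP events against a set of known-bad server hosts and flags sessions seen from more than one source address, which is one possible AATO signal among many. The feed contents, the event field names, and the `triage` function are all hypothetical.

```python
from collections import defaultdict

# Hypothetical feed contents; a real deployment would load these from a
# threat-intelligence source rather than hard-coding them.
KNOWN_BAD_SERVERS = {"mcp.badhost.example"}


def triage(events, max_sources_per_session=1):
    """Return alert strings for a batch of MCP events.

    Each event is a dict with 'session_id', 'source_ip', and 'server_host'.
    """
    alerts = []
    sources = defaultdict(set)
    for event in events:
        if event["server_host"] in KNOWN_BAD_SERVERS:
            alerts.append(
                f"known-bad server contacted: {event['server_host']} "
                f"(session {event['session_id']})"
            )
        sources[event["session_id"]].add(event["source_ip"])
    # A session used from several source addresses may indicate a hijacked session.
    for session_id, ips in sources.items():
        if len(ips) > max_sources_per_session:
            alerts.append(
                f"possible AATO: session {session_id} seen from {len(ips)} source addresses"
            )
    return alerts
```

Legitimate agents can change source address (for example behind load balancers), so the threshold is a tunable assumption rather than a fixed rule.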
4. MCP Governance at Scale
Enterprise estates will accumulate many MCP server registrations. Required capabilities:
- Inventory and discovery of all MCP servers in use
- Per-server allow/deny policies
- Periodic permission reviews (analogous to OAuth scope auditing)
- Incident response playbooks for rogue MCP server scenarios
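The inventory and periodic-review capabilities above could be backed by a simple record per MCP server. The sketch below assumes a hypothetical `McpServerRecord` schema and a 90-day review interval; neither comes from the source, and the scope names are placeholders.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class McpServerRecord:
    name: str
    owner: str
    granted_scopes: set
    used_scopes: set      # scopes actually exercised, e.g. derived from proxy logs
    last_reviewed: date


def review_findings(inventory, today, max_age_days=90):
    """Return (server_name, finding) pairs for overdue reviews and unused scopes."""
    findings = []
    for record in inventory:
        if today - record.last_reviewed > timedelta(days=max_age_days):
            findings.append((record.name, "review overdue"))
        unused = record.granted_scopes - record.used_scopes
        if unused:
            # Granted-but-unused scopes are candidates for removal (scope creep).
            findings.append((record.name, f"unused scopes: {sorted(unused)}"))
    return findings
```

Comparing granted scopes with scopes actually used is one way to make scope creep visible during a review; it depends on call logs being complete enough to trust the `used_scopes` field.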
Relationship to Existing Controls
MCP security extends, but does not replace, API security, network monitoring, and AI firewall capabilities. The key differentiator is understanding agent intent: whether a particular MCP call falls within the agent’s normal operating envelope or represents anomalous or malicious behavior.
Standards Gap
As of the primary source’s publication (Oct 2025), no established standard or framework specifically governs MCP security. OWASP’s Agentic AI threats guidance covers the general threat class; startups are developing MCP-specific controls ahead of incumbent vendors.
Where It Appears
- Securing the Autonomous Future: Trust, Safety, and Reliability of Agentic AI — primary source; introduces MCP proxy, AATO, and governance requirements
- AI Agent Identity Architecture — agent identity is exercised at the MCP boundary
- Agent Observability — MCP traffic is part of the full-stack observability requirement
Related Concepts
- AI Agent Identity Architecture — agents authenticate at MCP boundaries; SPIFFE/NHI applies
- Agent Observability — MCP call logs feed behavioral monitoring
- Non-Human Identity (NHI) — MCP access credentials must be governed as NHIs