CaMeL Pattern (Compartmentalized Machine Learning)
CaMeL (Compartmentalized Machine Learning) is a research-stage architectural pattern from Google DeepMind (March 2025) for defending agentic systems against prompt injection. The core idea: replace the agent’s single LLM with two separate models that carry different trust levels, with explicit information-flow rules governing what may pass between them.
The split
┌─────────────────────────────┐
│       PRIVILEGED LLM        │ ← Receives user instructions (trusted)
│   (coordinates workflow)    │ ← Has access to high-trust tools
│   Sees: user intent only    │
└──────────────┬──────────────┘
               │ structured, stripped commands only
               ▼
┌─────────────────────────────┐
│       QUARANTINED LLM       │ ← Receives untrusted content (web, docs, email)
│ (executes retrieval tasks)  │ ← Can only return structured data, no free text
│   Sees: retrieved content   │
└─────────────────────────────┘
The quarantined LLM handles all interaction with untrusted content (web pages, emails, documents, RAG retrievals). Its output channel is restricted: it can only return structured data (JSON, typed values) back to the privileged LLM, not free-form text that could carry injected instructions. The privileged LLM never directly processes untrusted content, so injected instructions cannot reach it.
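To make the restricted channel concrete, here is a minimal Python sketch of what “structured data only” could look like in practice. It is illustrative rather than taken from the paper: the `QuarantineResult` schema, its field names, and the injected `quarantined_llm` callable are assumptions for the example. The load-bearing idea is that the only thing allowed back across the boundary is a value that parses and validates against a closed, typed schema; free-form prose fails validation and never reaches the privileged side.

```python
import json
from dataclasses import dataclass
from typing import Callable

MAX_FIELD_LEN = 256  # illustrative cap on string fields crossing the boundary

@dataclass(frozen=True)
class QuarantineResult:
    """The only shape of data the quarantined LLM is allowed to return."""
    sender: str
    date: str
    invoice_total: float

def extract_structured(
    untrusted_text: str,
    quarantined_llm: Callable[[str], str],  # any model client, injected by the caller
) -> QuarantineResult:
    """Run the quarantined LLM over untrusted content; return typed fields or fail."""
    reply = quarantined_llm(
        "Return ONLY a JSON object with keys sender, date, invoice_total.\n\n"
        + untrusted_text
    )
    data = json.loads(reply)  # free-form prose, including injected instructions, fails here

    # Closed-world validation: unexpected keys, wrong types, or oversized strings
    # are treated as policy violations and are never forwarded upstream.
    if set(data) != {"sender", "date", "invoice_total"}:
        raise ValueError("quarantine output violated schema: unexpected keys")
    for key in ("sender", "date"):
        if not isinstance(data[key], str) or len(data[key]) > MAX_FIELD_LEN:
            raise ValueError(f"quarantine output violated schema: {key}")
    total = data["invoice_total"]
    if isinstance(total, bool) or not isinstance(total, (int, float)):
        raise ValueError("quarantine output violated schema: invoice_total")

    return QuarantineResult(sender=data["sender"], date=data["date"], invoice_total=float(total))
```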
Why this matters for the Lethal Trifecta
The Lethal Trifecta requires: (1) private data access, (2) untrusted content ingest, (3) external communication — all in one agent. CaMeL’s split addresses leg 2 by preventing injected content from ever reaching the LLM that holds legs 1 and 3.
Compared to trifecta-splitting strategies (e.g., separate agents per trifecta leg), CaMeL:
- Keeps the capability in one logical system — the user gets the full agentic experience
- Enforces the split at the model level — not just at the tool-call or prompt-architecture level, which can be bypassed by clever injection
- Allows formal analysis — because information flows are typed and restricted, the quarantine boundary can be formally verified (the sketch after this list shows the typed hand-off in code)
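A companion sketch of the privileged side, under the same illustrative assumptions (the `RetrievalCommand` type, the hard-coded inbox query, and the placeholder-template convention are invented for the example; `QuarantineResult` is the type from the sketch above). The privileged LLM drafts its reply template before any retrieval happens, only a stripped command goes down, and the typed values that come back are substituted outside any prompt, so retrieved content never enters the privileged context.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class RetrievalCommand:
    """Structured, stripped command sent down to the quarantined side."""
    source: str   # e.g. "inbox" or "web"
    query: str    # search terms only, never raw retrieved content

def answer_with_quarantine(
    user_request: str,
    privileged_llm: Callable[[str], str],
    run_quarantined_retrieval: Callable[[RetrievalCommand], "QuarantineResult"],
) -> str:
    """Privileged-side flow: plan, issue a structured command, fill in typed values."""
    # 1. The privileged LLM sees only the user's intent and drafts a reply
    #    template up front, before any untrusted content is touched.
    template = privileged_llm(
        "User asked: " + repr(user_request) + ". Write a one-sentence reply "
        "template containing the placeholders {sender}, {date}, {invoice_total}."
    )

    # 2. Retrieval is delegated via a fixed, structured command; no free text
    #    flows down, and (per the schema sketch above) only typed fields flow back up.
    result = run_quarantined_retrieval(
        RetrievalCommand(source="inbox", query="latest invoice")
    )

    # 3. The typed values are substituted outside the LLM, so retrieved content
    #    never appears in any privileged prompt. (A real implementation would
    #    validate the template before formatting it.)
    return template.format(
        sender=result.sender,
        date=result.date,
        invoice_total=result.invoice_total,
    )
```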
Research status
CaMeL was published in March 2025 (arXiv 2503.12599) by Google DeepMind. As of May 2026:
- No shipped production implementation exists outside Google’s internal research environment
- CMM D4 L5 and D3 L5 reference it as a research-stage primitive that qualifies for L5 credit only when run as a documented pilot with exit criteria
- The pattern requires significant prompt engineering and data-flow enforcement work; “just use two LLMs” is not sufficient — the structured output channel is the load-bearing control (one possible data-flow enforcement approach is sketched after this list)
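One way to approach the data-flow enforcement side is to tag every value that crosses the quarantine boundary with its provenance and to check that tag at each sensitive sink. The sketch below is an assumption-laden toy, not the paper’s mechanism: the `Provenance` tags, the `Tagged` wrapper, and the email policy are invented for illustration.

```python
from dataclasses import dataclass
from enum import Enum

class Provenance(Enum):
    USER = "user"               # supplied by the user, trusted
    QUARANTINE = "quarantine"   # derived from untrusted content

@dataclass(frozen=True)
class Tagged:
    """A value plus a record of where it came from; the tag survives every hand-off."""
    value: object
    provenance: Provenance

def send_email(body: Tagged, recipient: Tagged) -> None:
    # Illustrative policy: quarantine-derived values may appear in the body,
    # but the recipient must be user-supplied, otherwise an injected address
    # could exfiltrate data (trifecta leg 3).
    if recipient.provenance is not Provenance.USER:
        raise PermissionError("recipient must be user-supplied, not quarantine-derived")
    print(f"sending to {recipient.value}: {body.value}")

# The first call is allowed; the second is blocked because the address was
# extracted from untrusted content by the quarantined LLM.
send_email(Tagged("Invoice summary ...", Provenance.QUARANTINE),
           Tagged("alice@example.com", Provenance.USER))
try:
    send_email(Tagged("Invoice summary ...", Provenance.QUARANTINE),
               Tagged("attacker@evil.example", Provenance.QUARANTINE))
except PermissionError as err:
    print(err)
```

The check runs in ordinary code, outside any model, which is what makes the boundary auditable and, as noted above, what separates a real deployment from “just use two LLMs”.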
Comparison to alternative containment approaches
| Approach | Where it breaks the trifecta | Shipped? |
|---|---|---|
| Trifecta splitting (per-agent role) | Removes legs from individual agents | Yes — architectural guidance |
| Egress filtering (Stripe approach) | Removes leg 3 (external comms) at network | Yes — Smokescreen |
| Capability warrants (Tenuo approach) | Constrains what legs 1+3 can do per task | Yes — Tenuo OSS |
| CaMeL (DeepMind) | Prevents injected content from reaching legs 1+3 | Research-stage only |
| HITL on sensitive actions | Inserts human into leg 3 path | Yes — architectural primitive |
In the RA / CMM
- RA Runtime Plane: CaMeL is listed as “Compartmentalized LLMs (CaMeL pattern)” — classified as Research.
- CMM D3 L5: “Compartmentalized LLM (CaMeL pattern) for trifecta-positive workloads” — evidence requires a deployed pilot with documented exit criteria.
- CMM D4 L5 (b): CaMeL-style privileged/quarantined LLM split is one of three research-stage primitives that can qualify for D4 L5.
See also
- Lethal Trifecta — the structural condition CaMeL defends against
- Indirect Prompt Injection — the attack that CaMeL constrains
- Agent Sandboxing — OS-level complement; Firecracker can isolate the quarantined LLM process
- Firecracker — the recommended OSS sandbox for isolating the quarantined LLM
- Agentic AI Security RA §Runtime plane