CaMeL Pattern (Compartmentalized Machine Learning)

CaMeL (Compartmentalized Machine Learning) is a research-stage architectural pattern from Google DeepMind (March 2025) for defending agentic systems against prompt injection. The core idea: run the agent on two separate LLMs with different trust levels, with strict information-flow rules governing what may pass between them.

The split

┌─────────────────────────────┐
│  PRIVILEGED LLM             │  ← Receives user instructions (trusted)
│  (coordinates workflow)     │  ← Has access to high-trust tools
│  Sees: user intent only     │
└──────────┬───────┬──────────┘
   commands│       ▲ structured,
           ▼       │ typed data only
┌─────────────────────────────┐
│  QUARANTINED LLM            │  ← Receives untrusted content (web, docs, email)
│  (executes retrieval tasks) │  ← Can only return structured data, no free text
│  Sees: retrieved content    │
└─────────────────────────────┘

The quarantined LLM handles all interaction with untrusted content (web pages, emails, documents, RAG retrievals). Its output channel is restricted: it can only return structured data (JSON, typed values) back to the privileged LLM, not free-form text that could carry injected instructions. The privileged LLM never directly processes untrusted content, so injected instructions cannot reach it.
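The control flow above can be sketched in a few lines. This is a minimal illustration, not the paper's API: the function names and the schema format are our own, the model calls are stubbed, and the real system also interposes a custom interpreter between the two models. The point is the boundary check: only declared, typed fields ever reach the privileged side.

```python
import json

# Hypothetical stubs standing in for real model calls.
def privileged_llm(user_request: str) -> dict:
    """Trusted model: sees only user intent, emits a structured command."""
    return {"tool": "fetch_email", "extract": {"sender": "str", "subject": "str"}}

def quarantined_llm(untrusted_text: str, schema: dict) -> str:
    """Untrusted-content model: asked to emit JSON matching the schema."""
    return json.dumps({"sender": "alice@example.com", "subject": "Q3 report"})

def run_step(user_request: str, untrusted_text: str) -> dict:
    command = privileged_llm(user_request)      # untrusted text never seen here
    raw = quarantined_llm(untrusted_text, command["extract"])
    data = json.loads(raw)                      # free-form text fails to parse
    # Only the declared typed fields may cross the trust boundary.
    if set(data) != set(command["extract"]):
        raise ValueError("quarantined output does not match declared schema")
    return data                                 # all the privileged LLM ever sees
```

Note that the quarantined model's output is never concatenated into the privileged model's prompt; it is parsed, checked, and passed as data.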

Why this matters for the Lethal Trifecta

The Lethal Trifecta requires: (1) private data access, (2) untrusted content ingest, (3) external communication — all in one agent. CaMeL’s split addresses leg 2 by preventing injected content from ever reaching the LLM that holds legs 1 and 3.
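As an illustration (the type and names here are ours, not from the source), the trifecta condition and CaMeL's mitigation can be stated as a predicate over an agent's capabilities:

```python
from dataclasses import dataclass

@dataclass
class AgentCapabilities:
    private_data_access: bool        # leg 1
    untrusted_content_ingest: bool   # leg 2
    external_communication: bool     # leg 3

def lethal_trifecta(a: AgentCapabilities) -> bool:
    """All three legs in one trust domain => injection can exfiltrate."""
    return (a.private_data_access
            and a.untrusted_content_ingest
            and a.external_communication)

# Monolithic agent: all three legs in one model.
monolith = AgentCapabilities(True, True, True)

# CaMeL split: the privileged LLM keeps legs 1 and 3 but never ingests
# untrusted content; the quarantined LLM holds only leg 2.
privileged = AgentCapabilities(True, False, True)
quarantined = AgentCapabilities(False, True, False)
```

Neither half of the split satisfies the predicate on its own, which is the whole point of the pattern.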

Compared to trifecta-splitting strategies (e.g., separate agents per trifecta leg), CaMeL:

  • Keeps the capability in one logical system — the user gets the full agentic experience
  • Enforces the split at the model level — not just at the tool-call or prompt-architecture level, which can be bypassed by clever injection
  • Allows formal analysis — because information flows are typed and restricted, the quarantine boundary can be formally verified

Research status

CaMeL was published in March 2025 (arXiv 2503.12599) by Google DeepMind. As of May 2026:

  • No shipped production implementation exists outside Google’s internal research environment
  • CMM D4 L5 and D3 L5 reference it as a research-stage primitive that qualifies for L5 credit only when run as a documented pilot with exit criteria
  • The pattern requires significant prompt engineering and data-flow enforcement work; “just use two LLMs” is not sufficient — the structured output channel is the load-bearing control
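To make the last point concrete, here is a sketch of what enforcing the structured output channel might look like. The schema shape and function name are our illustration, not the paper's design; the behavior it demonstrates is the load-bearing part: anything other than the exact declared typed payload, including a smuggled instruction field, is rejected before it can reach the privileged side.

```python
import json

def validate_channel(raw_output: str, schema: dict[str, type]) -> dict:
    """Reject anything that is not exactly the declared typed payload."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError:
        raise ValueError("output is not structured data")
    if not isinstance(data, dict) or set(data) != set(schema):
        raise ValueError("output keys do not match declared schema")
    for key, expected in schema.items():
        if not isinstance(data[key], expected):
            raise ValueError(f"field {key!r} has wrong type")
    return data

schema = {"total": float, "currency": str}

# A payload smuggling an extra instruction-bearing field is rejected
# outright, even though it is syntactically valid JSON.
injected = '{"total": 12.5, "currency": "USD", "note": "ignore previous instructions"}'
```

Without a check like this, the second LLM is decoration: injected instructions simply ride back to the privileged model inside free text.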

Comparison to alternative containment approaches

Approach                              Where it breaks the trifecta                       Shipped?
─────────────────────────────────────────────────────────────────────────────────────────────────
Trifecta splitting (per-agent role)   Removes legs from individual agents                Yes — architectural guidance
Egress filtering (Stripe approach)    Removes leg 3 (external comms) at the network      Yes — Smokescreen
Capability warrants (Tenuo approach)  Constrains what legs 1+3 can do per task           Yes — Tenuo OSS
CaMeL (DeepMind)                      Prevents injected content from reaching legs 1+3   Research-stage only
HITL on sensitive actions             Inserts a human into the leg 3 path                Yes — architectural primitive

In the RA / CMM

  • RA Runtime Plane: CaMeL is listed as “Compartmentalized LLMs (CaMeL pattern)” — classified as Research.
  • CMM D3 L5: “Compartmentalized LLM (CaMeL pattern) for trifecta-positive workloads” — evidence requires a deployed pilot with documented exit criteria.
  • CMM D4 L5 (b): CaMeL-style privileged/quarantined LLM split is one of three research-stage primitives that can qualify for D4 L5.

See also