Multi-Agent Runtime Security — Cascade Detection, Behavioral Baselines, Inter-Agent IR

The depth-companion to the wiki’s single-agent observability page, focused on what’s specific to multi-agent meshes: cascade-failure detection (OWASP ASI08), pairwise/aggregate behavioral baselines, and inter-agent incident response. Closes peer-review readiness §6 (multi-agent specifics under-architected) and the RA’s “Multi-agent failure containment” gap.

2026 is the academic-prototype era for these capabilities

Cascade detection at scale is research, not product. Graph-based monitors (SentinelAgent, TraceAegis, bi-level GAD) ship as papers, not platforms. Vendor-side coverage (Oktsec rate limits + ACLs, Aguara rule mappings, LangSmith observability) provides primitives, but no integrated cascade-detection product ships with documented thresholds. The wiki documents the maturity ladder honestly: today’s L1–L2 is achievable; L3+ is research-grade.

Threat surface — what’s distinct about multi-agent

Single-agent threat models cover prompt injection, tool misuse, and output safety. Multi-agent meshes add three structural threat shapes that have no single-agent analogue:

| Threat shape | OWASP anchor | What’s distinct |
|---|---|---|
| Cascading failures | ASI08 | One agent’s misbehavior or compromise propagates through the mesh — fan-out faster than human response |
| Insecure inter-agent communication | ASI07 | Authentication, integrity, replay, content scanning between peer agents (the A2A surface) |
| Multi-agent collusion | (no single ASI; Threat Class 3) | Two or more agents coordinate to bypass an oversight mechanism that would catch either alone |

The wiki’s Threat Classes 2026 §Class 3 covers collusion in detail. This page covers detection and response.

Cascade detection (ASI08)

Observable symptoms

Per OWASP ASI08 (Dec 2025) and Adversa’s ASI08 implementation guide (Jan 2026), cascade-failure symptoms cluster around six categories:

| Pattern | Signal |
|---|---|
| Rapid fan-out | Tool-call rate across the mesh spikes faster than the linear sum of per-agent rates |
| Cross-domain spread | An action taken in tenant A produces an action against tenant B’s data in <60 seconds |
| Oscillating retries | Agent X retries against agent Y in a loop, where neither’s per-call retry policy alone explains the rate |
| Downstream queue storms | A queue between two agents grows without bound while upstream throughput is steady |
| Repeated identical intents | Same goal/task hash recurs across agents that don’t normally coordinate |
| Cross-agent feedback loops | A’s output feeds B feeds C feeds A; either entropy of the message stream collapses or amplitude grows |

Adversa lists categories, not thresholds

The Adversa guide names what to watch but provides no concrete numeric thresholds. The wiki should not invent rates. An L3+ org must establish its own thresholds from a baseline period (e.g., 30-day rolling p99 + 3σ) and tag rules with ASI08 at minimum.
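
The "baseline period" rule above can be sketched concretely. A minimal threshold derivation, assuming per-minute mesh-wide tool-call rates sampled over the baseline window (function names and the nearest-rank p99 are this sketch's choices, not a published rule):

```python
import statistics

def cascade_threshold(baseline_rates, sigma_mult=3.0):
    """Derive an ASI08 alert threshold from a baseline window of
    per-minute mesh-wide tool-call rates: rolling p99 + sigma_mult * sigma.
    Illustrative; each org must pick its own window and multiplier."""
    ordered = sorted(baseline_rates)
    idx = max(0, int(0.99 * len(ordered)) - 1)  # nearest-rank p99
    p99 = ordered[idx]
    return p99 + sigma_mult * statistics.pstdev(baseline_rates)

def should_alert(current_rate, baseline_rates):
    """True when the current mesh-wide rate exceeds the derived threshold."""
    return current_rate > cascade_threshold(baseline_rates)
```

An alert produced this way should carry the ASI08 tag so downstream evidence collection can attribute it.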

Academic detection primitives

The 2025–2026 academic literature has converged on three approaches:

| Approach | Reference | Mechanism |
|---|---|---|
| Graph-walk anomaly | SentinelAgent (arXiv:2505.24201) | Models multi-agent execution as a dynamic graph; detects anomalies at node, edge, and path levels; covers prompt injection, multi-agent collusion, latent exploit paths; pluggable runtime monitor with policy-based intervention |
| Provenance-based | TraceAegis (arXiv:2510.11203) | Abstracts execution paths into hierarchical “stable execution units”; deviation from the unit = anomaly |
| Theme-based | Bi-Level GAD (arXiv:2512.18733) | Bi-level encoder over sentence- and token-level; flags agents drifting from the dialogue’s central theme — directly applicable to “agent A and B don’t normally talk” anomalies |

Productization status: none of these ship as enterprise products as of mid-2026. They are the reference primitives the wiki’s CMM L3+ ladder is designed to be ready for.
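
The provenance-based idea is the simplest to sketch. A toy monitor in the TraceAegis spirit (not the paper's implementation): abstract each execution into an ordered tuple of (agent, tool) steps and flag any path never seen during baselining.

```python
class PathMonitor:
    """Toy provenance-style anomaly detector: learned execution paths
    stand in for TraceAegis's hierarchical stable execution units."""

    def __init__(self):
        self.stable_units = set()

    def learn(self, path):
        """Record a known-good execution path: list of (agent, tool) steps."""
        self.stable_units.add(tuple(path))

    def is_anomalous(self, path):
        """A path outside the learned set is a deviation, hence an anomaly."""
        return tuple(path) not in self.stable_units
```

The real systems generalize over paths rather than exact-matching them; exact matching is where a DIY L3 effort would start.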

Vendor-side primitives that exist today

| Vendor / project | Primitive | What it covers |
|---|---|---|
| Oktsec (github.com/oktsec/oktsec, v0.15.2) | Per-agent sliding-window throttling + default-deny ACLs + Ed25519 message signing + tamper-evident audit chain v2 + 268 detection rules | Containment primitives — does not surface a “cascade detection” rollup |
| Aguara (aguarascan.com) | Public ASI-to-detection-rule mapping including ASI08 rules | Mapping is published; rule SQL/YAML is not |
| LangSmith / Langtrace / Traceloop | Cross-agent OTel gen_ai.* tracing | Observability layer that cascade rules can be written against; rules themselves not provided |

The honest read: L1 (per-agent rate limits + immutable logs) is shippable today; L2 (pairwise/triadic traffic baselines + ACL default-deny) is shippable; L3+ requires research-prototype-level work or DIY.
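
The L1 tier is generic enough to sketch. A minimal per-agent sliding-window throttle (illustrative only; this is not Oktsec's actual API):

```python
from collections import deque
import time

class SlidingWindowLimiter:
    """Per-agent sliding-window rate limit: at most max_calls per
    window_s seconds, evaluated against a monotonic clock."""

    def __init__(self, max_calls, window_s):
        self.max_calls = max_calls
        self.window_s = window_s
        self.calls = {}  # agent_id -> deque of call timestamps

    def allow(self, agent_id, now=None):
        """Record and permit a call unless the window is saturated."""
        now = time.monotonic() if now is None else now
        q = self.calls.setdefault(agent_id, deque())
        while q and now - q[0] > self.window_s:
            q.popleft()  # drop timestamps that fell out of the window
        if len(q) >= self.max_calls:
            return False
        q.append(now)
        return True
```

Denied calls are where the "no aggregate awareness" gap bites: each limiter sees one agent, so a mesh-wide fan-out can stay under every per-agent cap.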

Multi-agent behavioral baselines

Single-agent baselines are per-agent: tool-call distributions, latency profiles, output-length histograms. Multi-agent baselines need three additional shapes:

1. Aggregate-level invariants

Mesh-wide invariants that should hold regardless of which agent is responsible for satisfying them:

  • Total tool-call rate across the mesh ≤ X (capacity invariant)
  • No agent’s outbound message rate exceeds Y per agent-pair (rate-pair invariant)
  • Total cross-tenant data-flow rate ≤ Z (compartmentalization invariant)
  • p99 inter-agent latency stays bounded under load (resource-exhaustion invariant)
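
The four invariants above reduce to threshold checks over aggregated telemetry. A minimal evaluator (the stats/caps key names are this sketch's, not a standard schema):

```python
def check_mesh_invariants(stats, caps):
    """Return the list of violated mesh-wide invariants.
    stats: aggregated telemetry snapshot; caps: the X/Y/Z bounds above."""
    violations = []
    if stats["total_tool_calls_per_min"] > caps["X"]:
        violations.append("capacity")
    if max(stats["pair_msg_rates"].values(), default=0) > caps["Y"]:
        violations.append("rate-pair")
    if stats["cross_tenant_flow_per_min"] > caps["Z"]:
        violations.append("compartmentalization")
    if stats["p99_inter_agent_latency_ms"] > caps["latency_ms"]:
        violations.append("resource-exhaustion")
    return violations
```

The point of evaluating these mesh-wide is that no single agent owns any of them; a per-agent monitor cannot see a capacity or compartmentalization breach.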

2. Pairwise / triadic traffic baselines

Most agent pairs in a mesh never normally communicate. The graph of “who-talks-to-whom” is sparse. Baselining at the pair (or triadic, for relay attacks) level surfaces:

  • New edges in the agent communication graph (pair never seen before)
  • Edge intensity bursts (pair seen rarely; sudden burst)
  • Triadic pattern anomalies (A→B→C chains that never existed before)

The bi-level theme detector (Bi-Level GAD) is the academic primitive that operationalizes this for content-aware detection: “agents drifting from the dialogue’s central theme” maps directly onto pair- and triad-level departures from the communication baseline.
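
The structural (content-free) half of this is straightforward to baseline. A sketch that learns the sparse who-talks-to-whom graph and flags never-seen edges and A→B→C chains (class and field names are illustrative):

```python
class CommGraphBaseline:
    """Baseline the sparse agent communication graph at pair and
    triad granularity; new edges and chains are anomalies."""

    def __init__(self):
        self.pairs = set()   # (sender, receiver) edges seen in baseline
        self.triads = set()  # (a, b, c) relay chains seen in baseline

    def learn(self, messages):
        """messages: time-ordered list of (sender, receiver) tuples."""
        for s, r in messages:
            self.pairs.add((s, r))
        for (s1, r1), (s2, r2) in zip(messages, messages[1:]):
            if r1 == s2:  # consecutive hop through the same agent
                self.triads.add((s1, r1, r2))

    def new_edges(self, messages):
        """Edges in this window that the baseline has never seen."""
        return [(s, r) for s, r in messages if (s, r) not in self.pairs]
```

Edge-intensity bursts would extend this with per-pair counters; the set-membership version already catches the "pair never seen before" case.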

3. Cross-agent drift correlation

Single-agent drift detection alerts on per-agent baseline deviation. Cross-agent drift correlation looks for agents drifting in lockstep — a signal that they are responding to a shared upstream contamination (memory poisoning, RAG poisoning, model-version regression) rather than independent variation. This is the load-bearing primitive for catching the Threat Class 4 (model-version-degradation) and the slow-poisoning sub-case of Class 2 (APT campaigns).
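
Lockstep detection can be sketched as pairwise correlation over per-agent drift scores. Assumptions in this sketch: each agent emits a per-interval baseline-deviation score (e.g., a z-score), and a high positive correlation between two agents' series flags shared upstream contamination.

```python
import statistics

def lockstep_drift(drift_series, corr_threshold=0.9):
    """Return agent pairs whose drift scores move in lockstep.
    drift_series: {agent_id: [per-interval drift score, ...]}."""

    def pearson(x, y):
        mx, my = statistics.fmean(x), statistics.fmean(y)
        num = sum((a - mx) * (b - my) for a, b in zip(x, y))
        den = (sum((a - mx) ** 2 for a in x)
               * sum((b - my) ** 2 for b in y)) ** 0.5
        return num / den if den else 0.0

    agents = list(drift_series)
    return [(a, b)
            for i, a in enumerate(agents) for b in agents[i + 1:]
            if pearson(drift_series[a], drift_series[b]) >= corr_threshold]
```

Independent per-agent variation decorrelates; a poisoned shared memory store or a regressed model version pushes every consumer's drift in the same direction at the same time.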

Inter-agent incident response

The literature is heavy on “LLM agents doing IR” (CyberSleuth, MASIR, AIR) and light on “IR for compromised LLM-agent meshes.” This is a real research gap; the wiki documents the doctrine that follows from first principles plus the available primitives.

Containment doctrine — isolate vs stop-mesh

A first-principles decision tree for the on-call IR engineer:

```mermaid
flowchart TD
  Start([Cascade alert fires]) --> Q1{Is the propagation rate<br/>doubling per minute?}
  Q1 -- yes --> Stop[STOP-MESH:<br/>halt all agent inter-comm via gateway]
  Q1 -- no --> Q2{Can blast radius<br/>be bounded by isolation<br/>of N agents?}
  Q2 -- yes --> Iso[ISOLATE:<br/>quarantine N agents via ACL deny;<br/>traffic shadow-recorded for forensics]
  Q2 -- no --> Stop
  Iso --> Q3{After 5 min, is propagation<br/>still active outside the quarantined set?}
  Q3 -- yes --> Stop
  Q3 -- no --> Forensics[Begin cross-agent forensics]
  Stop --> Forensics
  Forensics --> Recovery[Selective rollback /<br/>rolling restart /<br/>mesh-wide quarantine]
```

Default fail-mode is stop-mesh. Cost of a stopped mesh is hours of degraded service; cost of an unbounded cascade is data exfiltration, regulatory exposure, and headline incident. CSA MAESTRO’s Layer-7 framing supports this default; OWASP ASI08 mitigations name “kill switches” without specifying granularity.
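
The decision tree collapses to a small pure function that a runbook or automation hook can call. The inputs are operator judgments (or upstream detections), not computed signals, and the fail-closed default falls out naturally:

```python
def containment_action(doubling_per_min, isolatable,
                       post_isolation_spread=False):
    """Encode the isolate-vs-stop-mesh decision tree.
    Every uncertain branch resolves to STOP-MESH (fail-closed default)."""
    if doubling_per_min:
        return "STOP-MESH"          # propagation outruns isolation
    if not isolatable:
        return "STOP-MESH"          # blast radius cannot be bounded
    if post_isolation_spread:
        return "STOP-MESH"          # quarantine failed the 5-minute check
    return "ISOLATE"
```

Encoding the tree as code is less about automation than about review: the fail-mode default is auditable rather than left to on-call judgment under pressure.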

Cross-agent forensics

Single-agent forensics traces one tool-call chain through one agent’s hooks and OTel spans. Cross-agent forensics requires:

  • Causal correlation across agent boundaries. OTel gen_ai.* spans must propagate trace IDs across A2A messages. Today this is implementation-specific; A2A v1.0 does not mandate it.
  • Tamper-evident audit chain across agents. Oktsec’s audit chain v2 is a reference primitive — single-vendor; no cross-vendor standard.
  • Reconstruction of the message-graph at incident time. Graph state must be queryable retroactively; LangSmith and Langtrace cover this for their stacks; cross-stack correlation is unsolved.
  • Provenance for content (RAG documents, prompts, tool definitions) referenced in the incident — so the IR can determine whether the cascade started from poisoned content vs adversarial input vs operator error.
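
The first two requirements can be combined in one message envelope: carry the trace ID across the agent boundary and hash-chain each envelope to its predecessor for tamper evidence. Field names below are this sketch's, not the A2A v1.0 spec's (which mandates neither):

```python
import hashlib
import json
import uuid

def wrap_a2a(payload, trace_id=None, prev_hash="0" * 64):
    """Wrap an inter-agent payload with a propagated trace ID and a
    hash-chain link; the hash covers everything except itself."""
    env = {
        "trace_id": trace_id or uuid.uuid4().hex,
        "payload": payload,
        "prev_hash": prev_hash,
    }
    env["hash"] = hashlib.sha256(
        json.dumps(env, sort_keys=True).encode()).hexdigest()
    return env

def verify_chain(envelopes):
    """True iff each envelope's prev_hash matches the prior envelope's hash."""
    for prev, cur in zip(envelopes, envelopes[1:]):
        if cur["prev_hash"] != prev["hash"]:
            return False
    return True
```

A per-agent chain like this is the single-vendor shape of Oktsec's audit chain v2; the unsolved part is agreeing on the envelope schema across vendors so chains can be stitched at incident time.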

Recovery

Three recovery shapes, in increasing severity:

| Shape | When | Mechanism |
|---|---|---|
| Selective rollback | One agent compromised; others healthy | Rotate that agent’s identity; rebuild from clean state; preserve audit |
| Rolling restart | Cascade contained; healthy fraction known | Restart agents in dependency order; verify per-agent invariants between batches |
| Mesh-wide quarantine | Cascade exceeded containment | All agents offline; full forensic snapshot; clean rebuild from version-pinned artifacts; release to canary subset before full mesh |
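
"Restart in dependency order" is a topological sort of the agent dependency graph, which the standard library already provides. A sketch (agent names illustrative):

```python
from graphlib import TopologicalSorter

def rolling_restart_order(deps):
    """Compute a rolling-restart order: each agent's dependencies come
    back up before the agent itself.
    deps: {agent: set of agents it depends on}."""
    return list(TopologicalSorter(deps).static_order())
```

Per-agent invariant checks slot between batches of this order; a cycle in deps raises graphlib.CycleError, which itself signals a mesh that cannot be restarted incrementally.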

Maturity ladder for multi-agent runtime security

Honest reflection of the gap between today’s primitives and full coverage:

| Tier | What’s in place |
|---|---|
| L1 | Per-agent rate limits + immutable per-agent logs (Oktsec-class today). Mesh has rate primitives but no aggregate awareness |
| L2 | Pairwise/triadic traffic baselines + ACL default-deny + cross-agent OTel tracing. Mesh has pair-level visibility |
| L3 | Graph-walk anomaly detection (SentinelAgent-class, prototype) + cross-agent drift correlation + documented stop-mesh-vs-isolate doctrine + ID-tagged ASI07/ASI08 evidence per detection |
| L4 | Activation-level collusion probes (NARCBench-class, research) + automated containment doctrine + multi-agent forensics with causal correlation + multi-tool red-team eval covering ASI07/ASI08/ASI10 |
| L5 | Formally specified mesh invariants + automated cross-agent IR with selective recovery + standards contributions (OWASP, CoSAI, A2A spec). Does not exist in production as of mid-2026 — this is the research target |

This ladder integrates with the wiki’s CMM D7 L3+ evidence requirements. L3 is where pairwise baselines + stop-mesh doctrine become assessable; L4 is where multi-tool red-team eval covering multi-agent surface becomes assessable; L5 remains aspirational.

Open issues

Where the field is thin

  1. No vendor cascade-detection rules with concrete numeric thresholds. Adversa and OWASP describe categories; rule SQL/YAML is not public.
  2. No published post-mortems on production multi-agent cascade incidents as of mid-2026 — the empirical anchor is missing.
  3. Containment doctrine (isolate-one vs stop-mesh) is unresolved in literature. The IR papers describe specialist agent teams doing IR, not IR doctrine for meshes.
  4. Cross-agent log correlation lacks a standard schema. OpenTelemetry gen_ai.* is the closest; cross-vendor correlation is implementation-specific.
  5. MITRE ATLAS multi-agent TTPs are thin compared to the prompt-injection coverage. ASI07/08/10 anchors exist; technique-level inventory does not.

Mapping to RA + CMM

| Concern | RA plane | CMM domain |
|---|---|---|
| Inter-agent message security | Egress (PEP) | D5 Egress & Network |
| Cascade detection | Observability (PIP feeding PDP) | D7 Observability & Detection |
| Multi-agent behavioral baselines | Observability | D7 |
| Containment doctrine | Control (PDP + PAP) | D3 Control & Least-Agency |
| Inter-agent IR runbook | Cross-cutting | D9 Operations & Human Factors |

See Also