Behavioral Anomaly Detection for Agents
Behavioral anomaly detection for agents is the practice of profiling what normal looks like for autonomous AI agents — at the agent level, the user level, and (in multi-tenant contexts) the organization level — and alerting on statistically significant deviations from that baseline. It is the operational form of the Agent Observability practice’s insider-threat framing, applied specifically to agentic AI workloads.
Why this differs from traditional UEBA
Classical User and Entity Behavior Analytics (UEBA) profiles humans and machines at the user level. Agentic AI introduces a structurally new identity axis: the agent itself. Agents have more constrained, predictable behavior than humans (their logic is typically custom-built per deployment), which means agent-level baselines yield lower false-positive rates than user-level baselines alone. Combining three levels — user / agent / org — into an ensemble model produces higher-fidelity signals than any single axis.
The three-level context model
First described in production by Matt Rittinghouse at [[unprompted-conference-march-2026|[un]prompted March 2026]], the model defines three identity axes for anomaly detection:
| Level | Entity | Key insight |
|---|---|---|
| 1 | User | Established UEBA baseline — what does this human normally do? |
| 2 | Agent | New axis — what does this specific agent normally do, given its deployment context? |
| 3 | Organization | Multi-tenant context — what is normal for agents in this tenant organization? |
Anomaly signals from each level are combined in an ensemble model, where correlated deviations across multiple axes substantially reduce noise. A single-axis anomaly may reflect legitimate edge-case behavior; a deviation that registers on all three axes at once is vanishingly unlikely to occur by chance.
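The noise-suppression property can be sketched concretely. In the toy combiner below, each axis's z-score is squashed into a probability and the three are merged with a geometric mean, so a single-axis spike is damped while correlated deviations compound. The logistic squash, the 3-sigma centre, and the geometric-mean combiner are illustrative assumptions, not the production ensemble:

```python
import math

def axis_prob(z: float, center: float = 3.0) -> float:
    """Map a per-axis z-score to an anomaly probability.

    Logistic squash centred at 3 sigma -- an illustrative calibration,
    not a production choice.
    """
    return 1.0 / (1.0 + math.exp(-(z - center)))

def ensemble_score(user_z: float, agent_z: float, org_z: float) -> float:
    """Geometric mean of per-axis probabilities.

    A deviation on one axis alone is damped toward a low score, while
    correlated deviations across user, agent, and org compound toward 1.0.
    """
    probs = [axis_prob(z) for z in (user_z, agent_z, org_z)]
    return math.prod(probs) ** (1.0 / 3.0)
```

In practice a learned, weighted combiner would replace the geometric mean; the sketch only shows why a correlated multi-axis deviation is a far stronger signal than any single-axis one.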
Feature design principles
Effective behavioral features for agentic AI detection focus on data access patterns along two dimensions — depth and breadth — plus a temporal axis for baselining:
Depth features (sensitive data targeting):
- Frequency of database calls per session
- Proportion of total data accessed vs. normal baseline
- Column/field-level access to PII-flagged or rarely-accessed attributes
Breadth features (exfiltration indicators):
- Number of distinct tables, columns, or records touched
- Ratio of accessed data to normal session scope
- Sensitivity scoring of accessed fields based on access rarity
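A minimal sketch of extracting the depth and breadth features above from one session's access log. The `Access` record shape and the PII catalogue are hypothetical stand-ins for whatever the data-classification layer actually provides:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Access:
    table: str
    column: str
    record_id: str

# Hypothetical PII catalogue; in production this would come from a
# data-classification service, not a hard-coded set.
PII_COLUMNS = {"customers.ssn", "customers.email"}

def session_features(accesses: list[Access]) -> dict[str, float]:
    """Depth and breadth features for one agent session."""
    tables = {a.table for a in accesses}
    columns = {f"{a.table}.{a.column}" for a in accesses}
    records = {(a.table, a.record_id) for a in accesses}
    pii_hits = sum(1 for a in accesses if f"{a.table}.{a.column}" in PII_COLUMNS)
    return {
        "query_count": float(len(accesses)),      # depth: call frequency
        "distinct_tables": float(len(tables)),    # breadth
        "distinct_columns": float(len(columns)),  # breadth
        "distinct_records": float(len(records)),  # breadth
        "pii_access_ratio": pii_hits / len(accesses) if accesses else 0.0,  # depth
    }
```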
Temporal features:
- Incremental historical profiling (rolling daily baselines)
- Standard-deviation-based scoring against the rolling baseline
- New-agent warm-up period (typically ~14 days) during which baselines are established
What does not work
Two documented failures from the Salesforce production deployment inform current best practice:
- Query complexity scoring — Intuitive but misleading in agentic contexts, because agents construct their own queries. Complexity reflects agent design choices, not malicious intent; measuring it confuses the agent-identity layer with the malicious-actor layer.
- Multi-table joins for feature context — Joining metadata from multiple tables to build features introduces expensive computational overhead. Refactoring to single-table operation reduced model training time by ~67% at Salesforce.
The meta-principle: features must measure what you think they are measuring. Validate feature predictive contribution (e.g., PCA-style feature importance analysis) before investing in implementation.
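As a dependency-free illustration of that validation step, the sketch below recovers each feature's loading on the first principal component via power iteration (standing in for a PCA library call); the data layout is a hypothetical rows-by-features matrix:

```python
def first_pc_loadings(X: list[list[float]], iters: int = 200) -> list[float]:
    """Absolute loadings of each feature on the first principal component,
    via power iteration on the covariance matrix."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    Xc = [[row[j] - means[j] for j in range(d)] for row in X]  # centre features
    # Covariance matrix C = Xc^T Xc / n
    C = [[sum(Xc[i][a] * Xc[i][b] for i in range(n)) / n for b in range(d)]
         for a in range(d)]
    v = [1.0] * d  # deterministic start vector
    for _ in range(iters):
        w = [sum(C[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in w) ** 0.5 or 1.0
        v = [x / norm for x in w]
    return [abs(x) for x in v]
```

A feature with a near-zero loading carries little of the variance the model will see, and is a candidate to drop before implementation cost is sunk.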
Production evidence
The primary production case study is Salesforce’s Agentforce deployment, described in “1.8M Prompts, 30 Alerts” ([un]prompted March 2026):
- Scale: ~1.8M daily prompts, 12,000+ unique daily active agents, 55,000 tenant organizations
- Output: fewer than 30 actionable daily alerts
- Signal-to-noise ratio: approximately 60,000:1
- Detection latency (current): 12–24 hours (batch processing)
- Detection latency (target): real-time / in-flight scoring via hot-path architecture
Relationship to the observability plane
Behavioral anomaly detection occupies the detection layer of the Agentic AI Security Reference Architecture’s Observability plane. It depends on:
- Identity telemetry (§3 of Agent Observability) — structured logs linking invoking_user_id → agent_id → action. Without this linkage, behavioral signals cannot be attributed.
- Baseline infrastructure — incremental historical profiling at the agent, user, and org levels. The baseline is meaningless without the identity linkage.
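A minimal sketch of the structured log record that carries the attribution linkage; the field names are illustrative, not an actual Salesforce schema:

```python
import json
import time
import uuid

def telemetry_record(invoking_user_id: str, agent_id: str, action: str) -> str:
    """Emit one structured log line linking user -> agent -> action.

    This is the minimum linkage the detection layer needs to attribute a
    behavioral signal to an identity chain.
    """
    return json.dumps({
        "ts": time.time(),                  # event timestamp
        "event_id": str(uuid.uuid4()),      # unique per action
        "invoking_user_id": invoking_user_id,
        "agent_id": agent_id,
        "action": action,
    })
```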
It feeds:
- Alert triage pipeline — the scored alerts feed a secondary system (potentially an LLM explainer agent) for SOC consumption.
- Auto-containment tier (roadmap) — when deviation crosses a “statistically impossible” threshold, automated response (session kill, token revocation, bot lockdown) executes without SOC triage.
Auto-containment roadmap
The Salesforce architecture illustrates a three-stage progression toward autonomous response:
- Batch detection → alert queue for SOC triage (current state)
- Hot-path inference → real-time session scoring against cached baselines
- Inline auto-containment → automated kill / revoke / lockdown when threshold crossed; purple-team exercise required before rollout
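The staged progression reduces to a threshold-routing sketch; the score units and both cutoffs are placeholder values that would be calibrated (and purple-teamed) before any auto-containment rollout:

```python
from enum import Enum, auto

class Response(Enum):
    NONE = auto()          # within baseline, no action
    SOC_ALERT = auto()     # stage 1: queue for human triage
    AUTO_CONTAIN = auto()  # stage 3: kill session / revoke token / lock bot

# Illustrative thresholds in ensemble-score units; the upper cutoff
# corresponds to the "statistically impossible" deviation in the text.
ALERT_THRESHOLD = 0.7
CONTAIN_THRESHOLD = 0.99

def route(score: float) -> Response:
    """Route a scored session through the staged response model."""
    if score >= CONTAIN_THRESHOLD:
        return Response.AUTO_CONTAIN
    if score >= ALERT_THRESHOLD:
        return Response.SOC_ALERT
    return Response.NONE
```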
This progression is the operational instantiation of the “confident automated response” goal: moving from knowing what happened yesterday to stopping what is happening now.
Warm-up period formalization
The 14-day warm-up period for new agents is mentioned by Salesforce practitioners as a SOC-playbook requirement, but no published study has characterized the optimal warm-up duration across different agent types and workloads. This is an open calibration question.
See also
- Agent Observability — parent practice; this concept is its §7 (Agent Behavioral Monitoring) instantiation
- Multi-Agent Runtime Security — multi-agent extension of the same principles
- Prompt-Volume-to-Alert Ratio — the metric this practice produces
- Non-Human Identity (NHI) — identity architecture that feeds the attribution layer
- Oversight Layer (PDP + PEP for Agentic AI) — policy enforcement layer that acts on detection outputs