Behavioral Anomaly Detection for Agents
Behavioral anomaly detection for agents is the practice of profiling what normal looks like for autonomous AI agents — at the agent level, the user level, and (in multi-tenant contexts) the organization level — and alerting on statistically significant deviations from that baseline. It is the operational form of the Agent Observability practice’s insider-threat framing, applied specifically to agentic AI workloads.
Why this differs from traditional UEBA
Classical User and Entity Behavior Analytics (UEBA) profiles humans and machines at the user level. Agentic AI introduces a structurally new identity axis: the agent itself. Agents have more constrained, predictable behavior than humans (their logic is typically custom-built per deployment), which means agent-level baselines yield lower false-positive rates than user-level baselines alone. Combining three levels — user / agent / org — into an ensemble model produces higher-fidelity signals than any single axis.
The three-level context model
First described in production by Matt Rittinghouse at [[unprompted-conference-march-2026|[un]prompted March 2026]], the model defines three identity axes for anomaly detection:
| Level | Entity | Key insight |
|---|---|---|
| 1 | User | Established UEBA baseline — what does this human normally do? |
| 2 | Agent | New axis — what does this specific agent normally do, given its deployment context? |
| 3 | Organization | Multi-tenant context — what is normal for agents in this tenant organization? |
Anomaly signals from each level are combined in an ensemble model, where correlated deviations across multiple axes substantially reduce noise. A single-axis anomaly may reflect legitimate edge-case behavior; a deviation that registers on all three axes at once is vanishingly unlikely to occur by chance.
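The noise-suppression property can be sketched concretely. In the toy combiner below, each axis's z-score is squashed into a probability and the three are merged with a geometric mean, so a single-axis spike is damped while correlated deviations compound. The logistic squash, the 3-sigma centre, and the geometric-mean combiner are illustrative assumptions, not the production ensemble:

```python
import math

def axis_prob(z: float, center: float = 3.0) -> float:
    """Map a per-axis z-score to an anomaly probability.

    Logistic squash centred at 3 sigma -- an illustrative calibration,
    not a production choice.
    """
    return 1.0 / (1.0 + math.exp(-(z - center)))

def ensemble_score(user_z: float, agent_z: float, org_z: float) -> float:
    """Geometric mean of per-axis probabilities.

    A deviation on one axis alone is damped toward a low score, while
    correlated deviations across user, agent, and org compound toward 1.0.
    """
    probs = [axis_prob(z) for z in (user_z, agent_z, org_z)]
    return math.prod(probs) ** (1.0 / 3.0)
```

In practice a learned, weighted combiner would replace the geometric mean; the sketch only shows why a correlated multi-axis deviation is a far stronger signal than any single-axis one.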
Feature design principles
Effective behavioral features for agentic AI detection focus on data access patterns along two dimensions — depth and breadth — plus a temporal axis for baselining:
Depth features (sensitive data targeting):
- Frequency of database calls per session
- Proportion of total data accessed vs. normal baseline
- Column/field-level access to PII-flagged or rarely-accessed attributes
Breadth features (exfiltration indicators):
- Number of distinct tables, columns, or records touched
- Ratio of accessed data to normal session scope
- Sensitivity scoring of accessed fields based on access rarity
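A minimal sketch of extracting the depth and breadth features above from one session's access log. The `Access` record shape and the PII catalogue are hypothetical stand-ins for whatever the data-classification layer actually provides:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Access:
    table: str
    column: str
    record_id: str

# Hypothetical PII catalogue; in production this would come from a
# data-classification service, not a hard-coded set.
PII_COLUMNS = {"customers.ssn", "customers.email"}

def session_features(accesses: list[Access]) -> dict[str, float]:
    """Depth and breadth features for one agent session."""
    tables = {a.table for a in accesses}
    columns = {f"{a.table}.{a.column}" for a in accesses}
    records = {(a.table, a.record_id) for a in accesses}
    pii_hits = sum(1 for a in accesses if f"{a.table}.{a.column}" in PII_COLUMNS)
    return {
        "query_count": float(len(accesses)),      # depth: call frequency
        "distinct_tables": float(len(tables)),    # breadth
        "distinct_columns": float(len(columns)),  # breadth
        "distinct_records": float(len(records)),  # breadth
        "pii_access_ratio": pii_hits / len(accesses) if accesses else 0.0,  # depth
    }
```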
Temporal features:
- Incremental historical profiling (rolling daily baselines)
- Standard-deviation-based scoring against the rolling baseline
- New-agent warm-up period (typically ~14 days) during which baselines are established
What does not work
Two documented failures from the Salesforce production deployment inform current best practice:
- Query complexity scoring — Intuitive but misleading in agentic contexts, because agents construct their own queries. Complexity reflects agent design choices, not malicious intent; measuring it confuses the agent-identity layer with the malicious-actor layer.
- Multi-table joins for feature context — Joining metadata from multiple tables to build features introduces expensive computational overhead. Refactoring to single-table operation reduced model training time by ~67% at Salesforce.
The meta-principle: features must measure what you think they are measuring. Validate feature predictive contribution (e.g., PCA-style feature importance analysis) before investing in implementation.
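As a dependency-free illustration of that validation step, the sketch below recovers each feature's loading on the first principal component via power iteration (standing in for a PCA library call); the data layout is a hypothetical rows-by-features matrix:

```python
def first_pc_loadings(X: list[list[float]], iters: int = 200) -> list[float]:
    """Absolute loadings of each feature on the first principal component,
    via power iteration on the covariance matrix."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    Xc = [[row[j] - means[j] for j in range(d)] for row in X]  # centre features
    # Covariance matrix C = Xc^T Xc / n
    C = [[sum(Xc[i][a] * Xc[i][b] for i in range(n)) / n for b in range(d)]
         for a in range(d)]
    v = [1.0] * d  # deterministic start vector
    for _ in range(iters):
        w = [sum(C[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in w) ** 0.5 or 1.0
        v = [x / norm for x in w]
    return [abs(x) for x in v]
```

A feature with a near-zero loading carries little of the variance the model will see, and is a candidate to drop before implementation cost is sunk.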
Production evidence
The primary production case study is Salesforce’s Agentforce deployment, described in “1.8M Prompts, 30 Alerts” ([un]prompted March 2026):
- Scale: ~1.8M daily prompts, 12,000+ unique daily active agents, 55,000 tenant organizations
- Output: fewer than 30 actionable daily alerts
- Signal-to-noise ratio: approximately 60,000:1
- Detection latency (current): 12–24 hours (batch processing)
- Detection latency (target): real-time / in-flight scoring via hot-path architecture
Relationship to the observability plane
Behavioral anomaly detection occupies the detection layer of the Agentic AI Security Reference Architecture’s Observability plane. It depends on:
- Identity telemetry (§3 of Agent Observability) — structured logs linking invoking_user_id → agent_id → action. Without this linkage, behavioral signals cannot be attributed.
- Baseline infrastructure — incremental historical profiling at the agent, user, and org levels. The baseline is meaningless without the identity linkage.
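A minimal sketch of the structured log record that carries the attribution linkage; the field names are illustrative, not an actual Salesforce schema:

```python
import json
import time
import uuid

def telemetry_record(invoking_user_id: str, agent_id: str, action: str) -> str:
    """Emit one structured log line linking user -> agent -> action.

    This is the minimum linkage the detection layer needs to attribute a
    behavioral signal to an identity chain.
    """
    return json.dumps({
        "ts": time.time(),                  # event timestamp
        "event_id": str(uuid.uuid4()),      # unique per action
        "invoking_user_id": invoking_user_id,
        "agent_id": agent_id,
        "action": action,
    })
```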
It feeds:
- Alert triage pipeline — the scored alerts feed a secondary system (potentially an LLM explainer agent) for SOC consumption.
- Auto-containment tier (roadmap) — when deviation crosses a “statistically impossible” threshold, automated response (session kill, token revocation, bot lockdown) executes without SOC triage.
Auto-containment roadmap
The Salesforce architecture illustrates a three-stage progression toward autonomous response:
- Batch detection → alert queue for SOC triage (current state)
- Hot-path inference → real-time session scoring against cached baselines
- Inline auto-containment → automated kill / revoke / lockdown when threshold crossed; purple-team exercise required before rollout
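The staged progression reduces to a threshold-routing sketch; the score units and both cutoffs are placeholder values that would be calibrated (and purple-teamed) before any auto-containment rollout:

```python
from enum import Enum, auto

class Response(Enum):
    NONE = auto()          # within baseline, no action
    SOC_ALERT = auto()     # stage 1: queue for human triage
    AUTO_CONTAIN = auto()  # stage 3: kill session / revoke token / lock bot

# Illustrative thresholds in ensemble-score units; the upper cutoff
# corresponds to the "statistically impossible" deviation in the text.
ALERT_THRESHOLD = 0.7
CONTAIN_THRESHOLD = 0.99

def route(score: float) -> Response:
    """Route a scored session through the staged response model."""
    if score >= CONTAIN_THRESHOLD:
        return Response.AUTO_CONTAIN
    if score >= ALERT_THRESHOLD:
        return Response.SOC_ALERT
    return Response.NONE
```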
This progression is the operational instantiation of the “confident automated response” goal: moving from knowing what happened yesterday to stopping what is happening now.
Warm-up period formalization
The 14-day warm-up period for new agents is mentioned by Salesforce practitioners as a SOC-playbook requirement, but no published study has characterized the optimal warm-up duration across different agent types and workloads. This is an open calibration question.
See also
- Agent Observability — parent practice; this concept is its §7 (Agent Behavioral Monitoring) instantiation
- Multi-Agent Runtime Security — multi-agent extension of the same principles
- Prompt-Volume-to-Alert Ratio — the metric this practice produces
- Non-Human Identity (NHI) — identity architecture that feeds the attribution layer
- Oversight Layer (PDP + PEP for Agentic AI) — policy enforcement layer that acts on detection outputs