Enterprise Security in the Agentic AI Era

A research wiki on the bidirectional intersection of agentic AI and enterprise cybersecurity — frameworks, reference architectures, emerging practices, and maturity modeling. Maintained by Anton Goncharov; ingested from primary sources (papers, talks, vendor research, incident reports) and synthesized into cross-linked pages.

Scope (three axes, six scope-axis values)

The wiki covers three bidirectional axes between agentic AI and cybersecurity. Every page declares its axis in scope_axis: frontmatter; the closed vocabulary is documented in conventions §Scope Axes.

  1. Security OF AI — frameworks, reference architectures, and maturity models for safely deploying AI agents in production. Anchored by the Agentic AI Security CMM (5 × 9, cumulative, ID-tagged evidence) and the six-plane Agentic AI Security Reference Architecture. Scope axis: sec-of-ai.
  2. AI IN security — agentic systems used by defenders (SOC automation, autonomous triage) and offensive operators (AI-assisted exploitation, autonomous pentest), plus frontier-model-driven vulnerability discovery. Scope axes: ai-in-sec-defense, ai-in-sec-offense, redteam-for-ai, ai-vuln-discovery.
  3. Security AGAINST AI-driven attacks — how SDLC, supply chain, identity, and operational security must evolve when adversaries have frontier AI capability. Scope axis: sec-against-ai.

Start Here

  1. State of the Field — the prose tour of where the wiki stands across all three axes.
  2. Agentic AI Security Reference Architecture (2026) — the six-plane RA. Read this for the structural model on the sec-of-ai axis.
  3. Agentic AI Security CMM 2026 — the 5 × 9 maturity model. Read this to assess or plan a security-of-AI program.
  4. Per-axis synthesis pages:
  5. CMM Standards Crosswalk · CMM Measurement Protocol · CMM Dependency Rules — companion pages: how the CMM maps to NIST/ISO/MITRE/OWASP, how to actually score an org, and the cross-domain dependency caps.
  6. Hot Cache · Log — what’s been added recently and why.

What’s Load-Bearing

Two anchor artifacts and six scope axes. Everything else — concepts, practices, papers, entities, incidents, comparisons — is cross-linked back to those:

  • Reference Architecture — the six-plane RA (Identity / Control / Runtime / Egress / Data / Observability).
  • Capability Maturity Model — the 5 × 9 cumulative CMM (D1 Governance / D2 IAM / D3 Supply Chain / D4 Guardrails / D5 Secure Architecture / D6 Data, Memory & RAG / D7 Observability / D8 Operational Resilience / D9 Continuous Compliance).
  • Adjacent maturity frameworks — PwC Stage-Coverage Tiers for GenAI-in-SDLC adoption breadth (orthogonal axis); Cybersecurity CMM Exemplars for the BSIMM / SAMM / CMMI / CMMC / NIST CSF 2.0 design lessons that shaped the wiki’s CMM.

Overview

  • Overview — state of the field synthesis (seed)

Frameworks

See Frameworks Index.

  • CLASP — capability-centric evaluation rubric for autonomous security agents
  • Red Teaming Capability Framework — five-tier layered model for first-party agentic AI red teaming
  • NIST AI RMF — risk-management framework, four-function model (Govern, Map, Measure, Manage)
  • NIST AI 600-1 — Generative AI Profile addendum to NIST AI RMF
  • NIST SP 800-218A — SSDF AI Profile (stub)
  • MITRE ATLAS — adversarial-ML threat taxonomy; jumped 66→84 techniques in Q1 2026
  • OWASP LLM Top 10 — model-layer risks; prompt injection #1
  • OWASP Agentic AI Top 10 (ASI) — agent-orchestration-layer risk taxonomy
  • OWASP AIVSS — vulnerability scoring system extending CVSS 4.0 with agentic amplification
  • ISO/IEC 42001 — first certifiable AI management system standard
  • Google SAIF — secure-by-design lifecycle framework, donated to CoSAI
  • CoSAI Principles — Coalition for Secure AI principles and outputs
  • Microsoft RAI — responsible-AI standard with operational tooling
  • CSA MAESTRO — 7-layer threat model for cross-layer agentic AI analysis
  • EU AI Act — risk-tiered EU regulation, enforcement Aug 2026
  • A2A Protocol — Agent-to-Agent v1.0.0 (Mar 2026); Linux Foundation-governed since June 2025; transport security + Agent Card signing in spec; message integrity + replay + cross-agent delegation are vendor / proposal-side
  • Cyber Defense Matrix — Sounil Yu’s 5×5 NIST-CSF-vs-asset-class matrix (stub)
  • Gartner AI TRiSM — analyst-defined market category; expanded with Guardian Agents in Feb 2026
  • AIUC-1 — first independent security/safety/reliability certification standard for AI agents; six pillars; quarterly updates; Schellman first accredited auditor (Feb 2026); UiPath first enterprise-automation cert
  • OpenTelemetry gen_ai.* Semantic Conventions — part of the CNCF-graduated OpenTelemetry project (v1.37+; the gen_ai conventions themselves are still experimental); vendor-neutral agent observability foundation; gen_ai.* spans cover model calls, tool calls, RAG retrievals, agent steps; no license cost; mandatory baseline for CMM D7 L3; SIG contributors include Amazon, Elastic, Google, IBM, Microsoft
  • XACML — eXtensible Access Control Markup Language — OASIS standard (3.0, 2013); historical lineage of the four-role architecture (PEP / PDP / PIP / PAP); language layer is dormant (superseded by Cedar / OPA); architectural-role layer is alive (anchored in NIST SP 800-162 + NIST SP 800-207)
  • NIST SP 800-162 — Guide to Attribute Based Access Control (ABAC) — NIST publication (2014, reaffirmed 2019); the wiki’s preferred living-standard citation for the four-role vocabulary (PEP / PDP / PIP / PAP). Generalizes XACML’s role split into a policy-language-agnostic NIST reference. Use for any role-vocabulary or role-architecture absence claim per the Standards Validation Methodology
  • AWS Agentic AI Security Scoping Matrix — AWS Security blog (Nov 2025); four-scope agency/autonomy ladder (No / Prescribed / Supervised / Full agency) plus six security dimensions; the wiki’s anchor citation for the agency-vs-autonomy terminology distinction. Crosswalks to wiki CMM L1–L5+, CSA ATF Intern → Principal, and OWASP four-tier least-agency
  • MAAIS — Multilayer Agentic AI Security Framework — Arora & Hastings, arXiv preprint (Dec 2025); seven-layer defense-in-depth framework + the CIAA augmentation (CIA + Accountability) for agentic AI; tactic-level MITRE ATLAS validation; the wiki’s anchor citation for CIAA as the augmented security triad for autonomous systems
  • Microsoft ZT4AI — Zero Trust for AI — Microsoft’s adaptation of Zero Trust principles to AI systems and agentic deployments; ~700-control control set; March 2026 reference-architecture / workshop / assessment-tool refresh; pillars include Identity / Data / Devices / Apps & Workloads / Network / Visibility-Automation-Orchestration / Governance; vendor-locked to the Microsoft Security stack but principles transfer
  • Microsoft Secure Development Lifecycle (SDL) — Microsoft’s foundational secure-by-design SDLC framework (2004); influences NIST SSDF lineage; 2026-02-03 AI extension announced six SDL-for-AI focus areas (threat modeling for AI / AI observability / AI memory protections / agent identity & RBAC / AI model publishing / AI shutdown mechanisms) and six operating pillars (research / policy / standards / enablement / cross-functional collaboration / continuous improvement); first major-vendor secure-SDLC framework with explicit AI extension scope; framed as “a way of working, not a checklist”
  • NIST SSDF — Secure Software Development Framework (SP 800-218 v1.1) — NIST’s outcomes-oriented secure-SDLC framework (Feb 2022; Souppaya/Scarfone/Dodson); four practice groups (PO/PS/PW/RV), 19 practices, ~50 tasks; explicitly cites Microsoft SDL (MSSDL) as a named source for 8+ tasks; federal regulatory anchor under EO 14028 and OMB M-22-18; mandatory self-attestation for vendors selling to US federal government; vendor-neutral common-vocabulary instrument with cross-walks into BSAFSS / BSIMM / IEC 62443 / ISO 27034 / OWASP ASVS/MASVS/SAMM / NIST CSF / SP 800-53/160/161/181
  • NIST SP 800-218A — SSDF Community Profile for Generative AI and Dual-Use Foundation Models — NIST’s AI-specific extension of SSDF (July 2024; Booth/Souppaya/Vassilev/Ogata/Stanley(CISA)/Scarfone); authorized by EO 14110 § 4.1.a; adds 1 new practice (PW.3 Confirm Integrity of Training Data) with 3 sub-tasks, 3 net-new tasks (PO.5.3 continuous monitoring, PS.1.2 protect training data, PS.1.3 protect model weights), and AI-specific R/C/N additions with High/Medium/Low priority across ~30 existing tasks; informative references into NIST AI RMF, OWASP LLM Top 10, and NIST AI 100-2e2023 Adversarial ML Taxonomy; federal-anchor citation for CMM D4/D5/D6/D7/D9
  • OSFI Guideline B-13 — Technology and Cyber Risk Management — Canada’s federal regulatory expectations document for technology and cyber risk at federally-regulated financial institutions (effective 2022-07-31); three domains (Governance and Risk Management; Technology Operations and Resilience; Cyber Security); 17 high-level expectations; §2.4 System Development Life Cycle is the direct regulatory hook for Canadian-bank secure-SDLC; §3.1.2 explicitly names penetration testing and red teaming; §3.4 covers respond/recover/learn including forensic investigation expectations
  • OSFI Guideline E-23 (2027) — Model Risk Management — Canada’s federal regulatory framework for enterprise-wide model risk management (published 2025-09-11; effective 2027-05-01); explicit AI/ML scope including “black box” approaches, autonomous decision-making, and re-parametrization / drift detection; three principle sections (B Enterprise-wide MRM / C Risk-Based Approach / D Lifecycle Management) plus Appendix A inventory schema; six-stage model lifecycle; explicit cross-reference to OSFI B-13 at the Deployment stage; the Canadian peer to US SR 11-7
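
The gen_ai.* conventions listed above are easiest to grasp from the span shape they standardize. A minimal sketch, using plain dicts in place of OTel SDK spans so the attribute keys are visible on their own — the keys follow the experimental gen_ai semantic conventions, while the model and tool names are invented for illustration:

```python
# Illustrative only: plain dicts standing in for OTel spans, so the
# gen_ai.* attribute shape is visible without the SDK. Keys follow the
# experimental OpenTelemetry GenAI semantic conventions.

def model_call_span(model: str, input_tokens: int, output_tokens: int) -> dict:
    """Span attributes for one LLM inference call."""
    return {
        "gen_ai.operation.name": "chat",
        "gen_ai.request.model": model,
        "gen_ai.usage.input_tokens": input_tokens,
        "gen_ai.usage.output_tokens": output_tokens,
    }

def tool_call_span(tool_name: str) -> dict:
    """Span attributes for one agent tool invocation."""
    return {
        "gen_ai.operation.name": "execute_tool",
        "gen_ai.tool.name": tool_name,
    }

# One agent step emits a model-call span plus one span per tool call;
# that nesting is what yields a vendor-neutral trace of agent behavior.
agent_step = [
    model_call_span("example-model", 812, 64),
    tool_call_span("search_tickets"),
]
```

The point of the shared vocabulary is that any backend can aggregate these spans without vendor adapters — which is why the wiki treats it as the baseline for CMM D7.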

Reference Architectures

See Architectures Index.

Practices

See Practices Index.

  • Agent Observability — glass-box paradigm: hooks, OTel, identity multiplexing, Cedar policies, agent behavioral monitoring (insider-threat framing)
  • Agent Sandboxing — OS-level isolation as last-line-of-defense against goal manipulation and command injection
  • NHI Governance for AI Agents — credential lifecycle management for autonomous-agent identity sprawl
  • Credential Proxy Pattern — proxy-token / vault-injection to keep credentials out of agent context
  • Prompt Injection Containment — two-layer detection + platform-level tool-call interception
  • Supply Chain Security for Agents — skill marketplace controls, cognitive file integrity, Brain Git rollback
  • AI-BOM — AI Bill of Materials; runtime discovery, behavioral baselines
  • Securing AI Talking Points — four-point briefing outline
  • Guardian Agent Metagovernance — “Guards for the Guardians” — five controls governing guardian agents themselves
  • RAG Hardening — per-source boundary markers, injection scanning, source-trust attribution, action-source coupling
  • AI Security Posture Management (AI-SPM) — continuous AI-asset inventory + misconfiguration detection (models, prompts, indexes, connectors)
  • Data Security Posture Management (DSPM) for AI — sensitive-data mapping that feeds AI guardrails so risky sources are excluded at query time
  • Oversharing Controls for AI Search — knowledge-layer need-to-know boundaries for Microsoft Copilot, Glean, Gemini
  • Agent Token Chargeback — variable chargeback infrastructure for agentic-AI token spend; FinOps-for-agents primitive of the AI Agent Layered Council
  • Distributed Kill Switch — one-vote-veto pattern; halt-authority distributed to every team member in the loop; organizational counterpart to least-agency block tier
  • Multi-Agent Runtime Security — depth-companion to single-agent observability; cascade detection (ASI08), pairwise/aggregate behavioral baselines, inter-agent IR doctrine; honest about 2026 academic-prototype era
  • Plan-Validate-Execute Pattern — Google Workspace’s canonical HITL pattern for high-stakes irreversible actions; structured plan → deterministic gatekeeper → execute; addresses recursive-injection failure of LLM-based reviewers
  • Anti-Patterns and Failure Modes — 25 catalogued ways the RA + CMM go wrong in operation across 9 categories (architecture, CMM scoring, operations, threat-model, standards, identity, multi-agent, procurement, talent); the wiki’s BSIMM-activities-not-undertaken / CMMC-appeals / SAMM-scoring-caveats analog
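
The Credential Proxy Pattern above fits in a few lines: the agent context only ever holds an opaque placeholder, and a proxy at the egress boundary swaps in the real secret. All names here (vault mapping, endpoint, token values) are hypothetical — a sketch of the pattern, not any particular product:

```python
# Sketch of the credential-proxy pattern: a prompt injection that
# exfiltrates agent context gets only the placeholder, never a usable
# secret. The vault mapping and endpoint are hypothetical.

VAULT = {"proxy-token-123": "real-api-key-SECRET"}  # proxy-side mapping

def agent_builds_request() -> dict:
    # The agent's context contains only the opaque placeholder.
    return {
        "url": "https://api.example.com/v1/query",
        "headers": {"Authorization": "Bearer proxy-token-123"},
    }

def egress_proxy(request: dict) -> dict:
    # At the trust boundary, resolve the placeholder to the real credential.
    token = request["headers"]["Authorization"].removeprefix("Bearer ")
    real = VAULT.get(token)
    if real is None:
        raise PermissionError("unknown proxy token")
    out = dict(request)
    out["headers"] = {"Authorization": f"Bearer {real}"}
    return out

sent = egress_proxy(agent_builds_request())
```

Note the limit flagged under Identity-Credential Coupling in Concepts: when the credential string IS the identity, this substitution step has nothing separable to proxy.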

Maturity Models

See Maturity Models Index.

Canonical CMM (use this for new work):

Adjacent maturity frameworks:

  • PwC Stage-Coverage Tiers — 4-archetype maturity model (Observer/Experimenter/Integrator/Pioneer) for GenAI-in-SDLC adoption breadth (not security maturity); orthogonal-axis to the Agentic AI Security CMM

Papers

See Papers Index.

  • AI Security Standards in Q1 2026 — framework gap analysis: agentic threats outpace standards bodies; OSS controls (LlamaFirewall, AgentGateway) ahead of frameworks
  • Securing the Autonomous Future — Insight Partners (2025); five-category agentic-AI security market map; agent identity architecture; NHI, MCP security, “UEBA for Agents” coining (origin of the colloquial term)
  • Emerging Cybersecurity Practices for Agentic AI Applications — Goncharov (2026); OpenClaw-ecosystem analysis; seven security domains mapped to OWASP ASI Top 10
  • Gartner Market Guide for Guardian Agents (Feb 2026) — Litan, Plummer et al.; defines guardian agent category; Sentinels/Operatives split; metagovernance; vendor segmentation
  • Securing Your Agents — McIntyre (2026); 40-slide layered-defense playbook covering inputs, prompts, outputs, infra, and red-teaming
  • AI Coding Agent Governance — Knostic (2026); governance ≠ security; four components (identity / scoping / approval / audit) and three-phase rollout; introduces shadow automation framing
  • What Are Non-Human Identities? — Oasis Security (2026); NHI taxonomy (11 types), HR-vs-code-pace lifecycle mismatch, identity-credential coupling, six-pillar securing-NHI strategy
  • AI Data Security — Knostic (2026); inference vs retrieval exposure, AI-UC, AI-SPM/DSPM stack, eight enterprise-AI risks, knowledge-layer governance
  • Scaling Agentic AI: A Leadership Guide for CIOs — Gartner webinar (Gummer + Gulzar, May 2026); operating-model counterpart to the Guardian Agents Market Guide; introduces the AI Agent Layered Council and the human-parity-line crossing
  • [[unprompted-conference-march-2026|[un]prompted Conference — AI Security Practitioner Conference (March 3–4, 2026)]] — full talks catalog with presenters/orgs/notable data points across ~52 talks; companion ranking page in Comparisons. (Source page header says “[un]prompted II” but body announces September 2026 as II — slug uses date to dodge the ambiguity.)
  • Breaking the Lethal Trifecta (Without Ruining Your Agents) — Bullen, Stripe — first individual-talk ingest from [un]prompted: slides + transcript combined; introduces the lethal-bifecta, names smokescreen + toolshed; canonical practitioner worked example for lethal-trifecta containment.
  • Capability-Based Authorization for AI Agents — Niyikiza, Tenuo — [un]prompted March 2026; paired slides + transcript; warrant primitive with 6 properties; monotonic attenuation (W₂⊆W₁⊆W₀); 4 deployment modes; 90%→0% multi-agent ASR on custom harness; partial closure of the PDP/PEP gap for delegation-aware actions.
  • Securing Workspace GenAI at Google — Lidzborski — [un]prompted March 2026; three-year Google Workspace retrospective; introduces prompt-as-code structural framing, agency-gap / orchestration-hijacking / recursive-prompt-injection threat sub-classes, sentinel-tokens for prompt delimitation, “Architecting the Fortress” 4-layer blueprint, and Plan-Validate-Execute as canonical HITL pattern; architectural sibling to Bullen’s Stripe talk (Bullen: egress/tool-policy; Lidzborski: input/orchestration/output).
  • Glass-Box Security — Hurd, Starseer — [un]prompted March 2026; mechanistic interpretability hooks (cosine similarity to known-concept directions, scalar projection for strength) added as a second detection layer below plaintext eBPF/regex; YARA-style semantic tripwires at the residual-stream level; pairs with mechanistic-interpretability-for-defense and glass-box-security concept pages.
  • Guardrails Beyond Vibes — Zhang & Shah, Stripe — [un]prompted March 2026; second Stripe talk; offline-eval pipeline + LLM-as-Judge + golden-standard test cases; architecture-matched-to-task (sequential multi-agent for threat modeling, single-minimal-toolset for triage); AlphaEvolve evolutionary prompt search failed at Stripe’s cost frontier; pairs with Bullen (containment vs. quality — complete inner loop).
  • Hooking Coding Agents with Cedar — Maisel, Sondera — [un]prompted March 2026; deterministic hook-based reference monitor for coding-agent lifecycle events (action / observation / control / state) routed through Cedar; YARA + IFC labels + safety model; orthogonal to Niyikiza (Niyikiza: delegation-time auth via warrants; Maisel: per-action enforcement for standalone agents).
  • Building Secure Agentic Systems — McMillin, Dropbox — [un]prompted March 2026; 19-agent / 73-tool home-lab fleet uncovers two novel failure modes: cross-agent memory contamination via shared namespace (fix: class-name-keyed isolation), and the N-token attack-hiding window during context trimming (fix: tagged + pinned security events); first practitioner-derived concepts at the agent-fleet level.
  • “Your Agent Works for Me Now” — Rehberger — [un]prompted March 2026; reframes prompt injection as “promptware” (multi-stage NL malware with persistence, C2, exfiltration, lateral movement); discloses delayed-tool-invocation bypass technique that defeats platform-level tool-confirmation by deferring activation to a later turn.
  • 1.8M Prompts, 30 Alerts — Rittinghouse, Salesforce — [un]prompted March 2026; production-scale telemetry from Salesforce Agentforce: 12,000+ daily-active agents across 55,000 tenants reduced via three-level ensemble behavioral anomaly detection to ≤30 actionable daily alerts; introduces the prompt-volume-to-alert-ratio metric.
  • Beyond the Chatbot — Smith & Sharma, Salesforce — [un]prompted March 2026 (stub-summary, abstract-only); companion talk to Rittinghouse: the response half of the Agentforce SOC story; argues for a Polyphonic (Supervisor-Worker) architecture against monolithic black-box copilots; promote when slides / transcript captured.
  • AI Agents Are Here. So Are the Threats. — Chen & Lu, Unit 42 (May 2025) — first systematic empirical study of framework-agnostic agentic-AI vulnerabilities; 9 attack scenarios on functionally identical apps built on CrewAI + AutoGen (open-source reference impl on GitHub); 5 mitigation strategies (prompt hardening, content filtering, tool input sanitization, tool vulnerability scanning, code executor sandboxing); sister to unit-42-prompt-injection-observations production-telemetry piece.
  • AWS Agentic AI Security Scoping Matrix — Brown & Saner, AWS Security Blog (Nov 2025) — source summary for the AWS framework; four-scope agency/autonomy ladder + six security dimensions; load-bearing contribution is the explicit definitional split between agency (scope of permitted actions) and autonomy (degree of independent decision-making). See the framework page for the full structural analysis and crosswalk to wiki ladders.
  • Securing Agentic AI Systems — A Multilayer Security Framework — Arora & Hastings, arXiv (Dec 2025) — source summary for the MAAIS preprint; seven-layer architecture + CIAA augmentation + Design Science Research methodology + tactic-level MITRE ATLAS validation. Anchor citation for the wiki’s CIAA reference on CMM D1. Limitations: tactic-level (not technique-level) validation; no Lethal Trifecta / MCP / A2A / promptware treatment.
  • Secure Agentic AI End-to-End — Vasu Jakkal, Microsoft Security Blog (March 2026) — pre-RSAC 2026 announcement of Microsoft’s full agentic-AI security portfolio across Entra / Defender / Purview / Sentinel / Security Copilot. Three-pillar framing (secure agents / secure foundations / defend with agents). Load-bearing announcements: Agent 365 GA May 1; Microsoft 365 E7 Frontier Suite SKU; Entra Internet Access Prompt Injection Protection (first major-vendor network-layer PI containment); Defender Predictive Shielding; updated ZT4AI reference architecture.
  • Agentic SDLC in Practice — The Rise of Autonomous Software Delivery (PwC Middle East, 2026) — 82-page advisory report; 377-respondent survey across GCC+Jordan+Egypt; introduces the PwC Stage-Coverage Tiers maturity model (Observer/Experimenter/Integrator/Pioneer) and the forward Agentic SDLC operating model; cites METR 2025 RCT (16 experienced devs 19% slower with AI) as counter-evidence to vendor productivity claims; security is #1 barrier (37.7%); 84% report moderate-to-significant productivity and quality gains; 38% are Pioneers augmenting ≥6 of 7 SDLC stages; 75% plan to raise GenAI spend within 24 months; forecasts 2027 majority-Pioneer adoption.
  • 2026 Agentic Coding Trends Report (Anthropic, early 2026) — 17-page vendor strategic forecast; eight trends across Foundation/Capability/Impact buckets; Trend 8 (“agentic coding improves security defenses — but also offensive uses”); Trend 4 establishes the collaboration paradox (60% AI usage / 0-20% fully delegated); Priority 4 calls for “embedding security architecture as a part of agentic system design from the earliest stages”; customer examples (Augment Code, Fountain, Rakuten, CRED, Legora, Cowork, TELUS, Zapier) anchor concrete adoption metrics.
  • Introducing CodeMender — AI agent for code security (Google DeepMind, Oct 2025) — Google’s patching-side agent paired with Big Sleep; Gemini Deep Think reasoner + program-analysis toolbox (static + dynamic + diff testing + fuzzing + SMT solvers) + multi-agent critique/judge validation; 72 patches upstreamed to OSS in 6 months; libwebp -fbounds-safety annotation example renders entire vulnerability classes “unexploitable forever”; all patches human-reviewed.
  • From Naptime to Big Sleep (Google Project Zero, Oct 2024) — foundational paper for Google’s variant-analysis AI vulnerability-discovery agent; first AI-discovered real-world exploitable memory-safety bug (SQLite stack buffer underflow); Project Zero + DeepMind collaboration; predecessor to Big Sleep was Naptime which achieved SOTA on Meta’s CyberSecEval2.
  • Project Glasswing — Securing Critical Software for the AI Era (Anthropic, May 2026) — coalition-organizing announcement; 12 named partners (AWS, Anthropic, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, Linux Foundation, Microsoft, NVIDIA, Palo Alto Networks) plus 40+ extended organizations; $4M OSS donations; Mythos NOT planned for GA ($125 per M tokens for Glasswing participants only — contradicts XBOW’s “5× Opus at GA” claim); 27-year-old OpenBSD vuln, 16-year-old FFmpeg vuln (5M fuzzer hits without detection), autonomous Linux kernel privesc chain; Mythos raw CyberGym 83.1% — identifies the unnamed #2 on MDASH’s leaderboard.
  • Defense at AI Speed — Microsoft’s MDASH (Microsoft Security Blog, May 2026) — defender-side companion to the XBOW/Mythos ingest; multi-model agentic scanning harness orchestrating 100+ specialized agents (Prepare/Scan/Validate/Dedup/Prove pipeline); 16 new CVEs in May 2026 Patch Tuesday; 88.45% on CyberGym public leaderboard (top score, ~5 points above #2); convergent architectural argument with XBOW: “the harness does the work, the model is one input.”
  • Mythos for Offensive Security — XBOW’s Evaluation (XBOW Blog, May 2026) — first sourced anchor on the ai-vuln-discovery axis; third-party evaluation of Anthropic’s preview-stage Mythos frontier model; 42-55% false-negative reduction vs Opus 4.6 on XBOW’s web exploit benchmark; canonical source for XBOW and Mythos entity pages.
  • Secure AI Framework Approach — Implementation Guide (Google, 2024) — practitioner companion to SAIF; four-step adoption methodology (Understand the use → Assemble the team → AI primer → Apply six elements) and the six core elements (“expand foundations / extend detection / automate / harmonize / adapt + feedback / contextualize risks”). Surfaces the AI shared-responsibility model (developer / deployer / user splits), the six decision domains for data governance (quality / security / architecture / metadata / lifecycle / storage), and the cross-functional SAIF stakeholder list at the implementation-team level (vs the Layered Council at the governance level).
  • Microsoft SDL: Evolving Security Practices for an AI-Powered World — Zunger, Microsoft Security Blog (Feb 2026) — announces the explicit AI extension of Microsoft SDL; six SDL-for-AI focus areas mapping cleanly to six wiki CMM domains; six operating pillars including Cross-functional collaboration (Business Process + Application UX explicitly in scope); strategic preamble with substantive per-area technical guidance promised in follow-up posts; concrete vendor instance of the anchor + AI-overlay pattern recommended by the 2026 framework-stack thesis; “SDL is a way of working, not a checklist.”

Playbooks

See Playbooks Index.

  • Assessor’s Quick Scorecard — Secure-SDLC and AI Practices for a Large Canadian Bank — ~10-page 2nd-party advisor instrument with ~65 questions across six sections (Secure-SDLC foundation / AI governance and model risk / Frontier-AI in CI/CD optional / Continuous pentesting and AI red teaming / Identity, least-agency, and supply chain / Observability and AI IR); regulatory anchors to OSFI B-13 (Tech & Cyber Risk Mgmt), OSFI E-23 (Model Risk 2027), OSFI B-10 (Third-Party), PIPEDA §10.1, and Canada’s Voluntary AI Code; scoring rubric Yes/Partial/No/N/A with L1-L5 section maturity tiers; engagement-tier = minimum-of-sections; findings priority pinned to regulatory and AI-safety-critical exposure
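
The scorecard’s engagement-tier rule (minimum-of-sections) is small enough to state as code. Section names follow the scorecard outline above; the tier values are invented for illustration:

```python
# Sketch of the scorecard's tiering rule: each section gets an L1-L5
# maturity tier, and the overall engagement tier is the minimum across
# sections -- one weak domain caps the whole engagement. Values invented.

section_tiers = {
    "Secure-SDLC foundation": 4,
    "AI governance and model risk": 3,
    "Continuous pentesting and AI red teaming": 2,
    "Identity, least-agency, and supply chain": 3,
    "Observability and AI IR": 3,
}

# Minimum-of-sections: strong SDLC maturity cannot compensate for a weak
# red-teaming program, so this org scores L2 overall.
engagement_tier = min(section_tiers.values())
```

The same floor logic underlies the CMM Dependency Rules page: cross-domain caps prevent a high score in one domain from masking a gap in a prerequisite domain.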

Concepts

See Concepts Index.

  • Jason’s Mental Model — Security Architecture functional breakdown
  • LLM-as-a-Judge — stub
  • Evidence Centered Benchmark Design — stub
  • CyberGym Benchmark — public benchmark of 1,507 real-world vulnerability reproduction tasks across 188 OSS-Fuzz projects; load-bearing third-party leaderboard for AI-driven vuln discovery; MDASH leads at 88.45%
  • Collaboration Paradox (60% / 0-20%) — Anthropic Societal Impacts finding: developers use AI in ~60% of work but can “fully delegate” only 0-20% of tasks; the implied 40-60% active-collaboration band is the largest single category of agentic AI usage; load-bearing argument for HITL as default operating mode, not just for irreversible actions
  • Vibe Coding — Karpathy-coined (Feb 2025) term for generating/modifying code via natural-language intent rather than exact specifications; now a formal advisory category (PwC names “Vibe-coder” as an emerging role); operates in tension with the collaboration paradox — most useful for throw-away prototypes, bounded by Plan-Validate-Execute for production work
  • METR 2025 RCT — load-bearing counter-evidence anchor for AI productivity claims: randomized controlled trial with 16 experienced OSS maintainers on their own repositories; enabling early-2025 AI tools made them ~19% slower; bounds vendor productivity claims at the experimental level; cited as Indicator 14 in PwC’s 2026 Agentic SDLC report
  • Non-Human Identity (NHI) — machine credentials for AI agents; lifecycle, DSPM analogy, Credential Zero
  • MCP Security — securing Model Context Protocol servers, proxies, AATO attack class
  • Human-in-the-Loop (HITL) for Agentic AI — platform-enforced confirmation gate before high-impact agent actions; OWASP four-tier model (auto/notify/confirm/block); CSA ATF gates; Stripe’s residual ASR data
  • Tool Poisoning and Rug-Pull Attacks — manipulation of tool definitions or post-trust replacement of tools; MCP CVE-prone surface; defended via fingerprinting, supply-chain scanning, confirmation gates
  • Memory Poisoning (Agentic AI) — adversarial content injection into RAG, episodic memory, scratchpad, or state checkpoints; persistent attack surface (vs single-step prompt injection); defended via provenance, retrieval filtering, integrity monitoring, state rollback
  • SPIRE — workload-identity standard (stub)
  • Least Agency Principle — OWASP-sourced; autonomy as a security dimension alongside access
  • Lethal Trifecta — Willison’s structural test: private-data + untrusted-content + external-comms in one agent
  • Lethal Bifecta — Bullen-coined write-side analogue (untrusted-content + sensitive-action); the threat model behind tool-annotation review schemas
  • CaMeL Pattern (CApabilities for MachinE Learning) — Google DeepMind (March 2025); privileged + quarantined LLM split; structured output channel prevents injected content from reaching the high-trust model; research-stage (no shipped production implementation as of May 2026); CMM D3/D4 L5 credit
  • Indirect Prompt Injection — payload arrives via agent’s own retrieval; user never sees it
  • Tool-Abuse Chains — single injection cascades into multi-tool exploitation
  • Canary Tokens for LLMs — system-prompt leak detection trip-wire
  • Three Retrieval Paths — vector / full-text / metadata; paths 2 and 3 are the practical risk
  • Shadow Automation — agent-era equivalent of shadow IT; ungoverned agents accessing repos / prod / credentials at developer pace
  • Decision Rights for AI Agents — governance counterpart to least privilege; documented authority + approver + justification + time bound per action class
  • Identity-Credential Coupling — NHIs where the credential string IS the identity (SAS tokens, storage keys, PATs); rotation = identity rotation; credential proxy can’t separate what’s structurally inseparable
  • Inference Exposure (and Retrieval Exposure) — paired AI-specific failure modes that bypass file/network access controls
  • AI Usage Control (AI-UC) — UCON Authorizations + Obligations + Conditions evaluated at answer time; the layer beyond access control
  • Shadow AI — unauthorized AI tools at work; 75% knowledge-worker GenAI usage, 78% BYOAI; Samsung incident is canonical
  • Oversight Layer (PDP + PEP for Agentic AI) — architectural primary for the AI security supervision role; PDP / PEP / PIP / PAP zero-trust primitives; cross-walk to all alternative terms
  • Guardian Agent — procurement-language synonym for the oversight layer (Gartner, Feb 2026)
  • Sentinels and Operatives — runtime architectural split: Sentinels provide context (PIPs); Operatives intervene (PDP+PEP)
  • AI Agent Catalog — mandatory inventory primitive (PIP); subject inventory for the oversight layer
  • AI Agent Management Platform (AMP) — Gartner-defined vendor category for unified agent management
  • AI Agent Layered Council — Gartner-coined cross-functional governance body for agentic AI: CIO co-leads with CFO/COO/GC/Procurement/CHRO, each with a named play
  • Human Parity Line — Gartner’s measurement: AI’s blind-preference parity with industry professionals across 1,320 tasks / 42 roles; crossed in Dec 2025
  • Agentic AI Threat Classes — 2026 Expansion — five threat classes the wiki’s existing taxonomy underdevelops (insider-with-AI-access, long-running APT, agent collusion, model-version regression, jurisdictional adversaries); closes peer-review-readiness §5
  • Capability-Based Authorization — 60-year lineage (Dennis & Van Horn 1966 → Macaroons → UCAN/Biscuits → CaMeL → Tenuo Warrant); artifact-carries-policy model; structural answer to the AI Confused Deputy problem
  • Tenuo Warrant — six-property signed capability artifact; W₁⊆W₀ monotonic attenuation; constraint language (basic logic / regex / glob / CEL); Map vs Territory lesson (4 CVEs); ≈55μs E2E auth; 4 deployment modes
  • Monotonic Attenuation — delegation invariant: child capability ⊆ parent capability; contrapositive is the key safety property; prior art in Macaroons, UCAN, Biscuits, Tenuo
  • Ambient vs Derived Authority — structural distinction: identity-based auth (full credentials at deploy-time, ambient) vs capability-based auth (task-scoped artifact, derived); AI Confused Deputy as the failure mode of ambient authority with agent LLMs
  • Prompt as Code — Lidzborski’s structural framing: every input token is a potential instruction; LLMs lack an NX-bit equivalent for the prompt window; explains why filtering loses
  • Agency Gap — non-deterministic disconnect between user intent and autonomous AI execution; “wrong John” failure mode; structural reason for HITL confirm tier
  • Orchestration Hijacking — compromised LLM-as-planner manipulated by indirect injection; supports time-delayed and dormant triggers
  • Recursive Prompt Injection — LLM-as-a-judge subject to same vulnerability as primary; semantic gaslighting attack
  • Sentinel Tokens — prompt delimitation primitive; partial structural mitigation paired with adversarial fine-tuning
  • Inline Gateway vs Runtime Instrumentation — architectural fork in the agentic-AI-security seed cohort (gateway camp vs runtime-instrumentation camp); mirrors the API-gateway-vs-APM split from the cloud-native era
  • Glass-Box Security — Hurd-coined; defending AI agents from inside the model’s forward pass via mechanistic interpretability instead of black-box input/output filtering
  • Mechanistic Interpretability for Defense — applied research direction: cosine similarity against known-concept directions + scalar projection for strength; YARA-style semantic tripwires at the residual-stream level
  • Agent Memory Isolation — McMillin-derived; per-agent memory namespace keyed at the MCP server layer (outside LLM influence) to prevent cross-agent contamination through shared retrieval pools
  • Context-Aware Trimming — McMillin-derived; tagging + pinning security-relevant events as exempt from context-window eviction so an attacker can’t burn through tokens to drop the audit trail
  • Promptware — Rehberger-coined; structured multi-stage NL malware with persistence, C2, exfiltration, lateral movement — prompt injection is no longer a single-turn attack class
  • Delayed Tool Invocation — Rehberger-disclosed bypass; defers tool activation to a later conversation turn to evade platform-level confirmation gates
  • Behavioral Anomaly Detection for Agents — three-level ensemble (per-agent / per-tenant / global) profiling pattern; the structural alternative to LLM-as-judge for production-scale alert reduction
  • Prompt-Volume-to-Alert Ratio — Rittinghouse-derived; production metric for AI-SOC tuning (Agentforce: 1.8M prompts → ≤30 alerts/day at 55K tenants); the AI-era counterpart to the SIEM signal-to-noise ratio
  • Agent Commander Prompt (C2) — Rittinghouse-named; attacker-controlled prompt that issues commands to a compromised agent across sessions; AI-era analog of botnet C2
  • Differential Privacy — ε-DP and DP-SGD as defensive primitives against model inversion and membership inference; canonical mathematical privacy guarantee for ML training and federated agent telemetry; CMM D6 L4/L5 control surface
  • Model-Layer Attacks (Extraction, Inversion, Membership Inference) — three named attack classes targeting the deployed model rather than orchestration; MITRE ATLAS techniques AML.T0024 / T0044 / T0048; defenses span DP, rate limiting, output randomization, query-pattern monitoring; the threat surface the wiki had under-treated relative to prompt-injection-class threats
  • Agent Availability Threats (Runaway, Recursion, Resource Exhaustion) — the Availability axis surfaced by the MAAIS CIAA augmentation; runtime budgets, recursion-depth limits, distributed cycle detection; the Lethal Trifecta is C+I, this is the parallel A treatment
  • Operational XAI for Action Gating — distinct from mechanistic interpretability; runtime-emitted justifications gating high-impact actions before they execute; LLM-as-judge / Plan-Validate-Execute / ToolAnnotations justifications as concrete patterns
  • Network-Layer Prompt Injection Containment — third architectural layer in the prompt-injection-containment stack (Layer 0: Network) below input-detection (Layer 1) and execution-containment (Layer 2); first shipped at major-vendor scale by Microsoft Entra Internet Access PI Protection (Mar 2026); operates outside the agent’s process boundary so applies even to compromised / shadow agents
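
The Monotonic Attenuation invariant above (child capability ⊆ parent) can be sketched in a few lines. This is an illustrative subset check over a toy capability shape (resource → allowed actions), not Tenuo's actual warrant format or API:

```python
# Sketch of the monotonic-attenuation invariant: a delegated (child)
# capability must be a subset of its parent, so delegation can only
# narrow authority, never widen it. Shapes and names are illustrative.

def attenuates(parent: dict[str, set[str]], child: dict[str, set[str]]) -> bool:
    """True iff every (resource, action) pair the child grants is also in the parent."""
    return all(
        resource in parent and actions <= parent[resource]
        for resource, actions in child.items()
    )

w0 = {"repo:payments": {"read", "write"}, "queue:refunds": {"enqueue"}}
w1 = {"repo:payments": {"read"}}              # narrowed: valid delegation
w2 = {"repo:payments": {"read", "delete"}}    # widened: must be rejected

assert attenuates(w0, w1)
assert not attenuates(w0, w2)
```

The contrapositive is the safety property: if an action is outside the root capability W₀, no chain of delegations can ever reach it.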
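
The Glass-Box / Mechanistic Interpretability entries above can be made concrete with a toy sketch: score a residual-stream activation against a known concept direction via cosine similarity (does the concept fire?) and scalar projection (how strongly?). The vectors and threshold below are invented toy values, not real model activations or a real tripwire ruleset:

```python
import math

# Toy "semantic tripwire": cosine similarity against a known-concept
# direction decides whether the concept fires; scalar projection
# measures its strength. All numbers here are illustrative.

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def projection(a: list[float], direction: list[float]) -> float:
    """Scalar projection of activation a onto the (unnormalized) direction."""
    dot = sum(x * y for x, y in zip(a, direction))
    return dot / math.hypot(*direction)

concept = [1.0, 0.0, 1.0, 0.0]        # learned concept direction (toy)
activation = [0.9, 0.1, 1.1, 0.0]     # current residual-stream vector (toy)

fires = cosine(activation, concept) > 0.9   # YARA-style threshold
strength = projection(activation, concept)
```

The appeal of the pattern is that it inspects the forward pass itself rather than filtering inputs and outputs from outside the model.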

Entities

See Entities Index.

Organizations

  • Adobe — stub
  • Anthropic — AI lab; Claude; CoSAI Premier Sponsor; Glasswing publisher (stub)
  • CoSAI (org) — Coalition for Secure AI consortium
  • CyberArk — Identity Security Platform vendor (PAM, Conjur secrets, Secure AI Agents); acquired by Palo Alto Networks in 2026
  • Cloud Security Alliance — publisher of MAESTRO and Agentic AI Red Teaming Guide
  • Gartner — analyst firm; defined AI TRiSM and Guardian Agents categories
  • Glasswing — Anthropic vulnerability research / disclosure program
  • Google — hyperscaler; SAIF, A2A, Google ADK; Anton Chuvakin
  • Insight Partners — VC firm; published three-part agentic-AI security market map series
  • ISO — standards body publishing ISO/IEC 42001
  • Meta — stub
  • Microsoft — hyperscaler; Entra Agent ID, Agent 365, Defender, Purview, Prompt Shields, RAI, ZT4AI, FIDES, PyRIT (stub)
  • OSFI — Office of the Superintendent of Financial Institutions (Canada) — Canadian federal regulator and supervisor of federally-regulated financial institutions; issuer of Guidelines B-13 (Technology and Cyber Risk Management) and E-23 (2027) (Model Risk Management); stub
  • AWS (Amazon Web Services) — hyperscaler; Cedar (PDP), Firecracker (sandbox), Nitro Enclaves (TEE), Bedrock, Q (stub)
  • NVIDIA — AI infrastructure vendor; Garak, NeMo Guardrails, NeMo Jailbreak NIM (stub)
  • CrowdStrike — endpoint detection / SIEM; Falcon AIDR cited at D7 L5 (stub)
  • Datadog — observability / APM; AI Monitoring + LLM Observability cited at D9 (stub)
  • Zenity — agent-governance for M365 / Copilot Studio / Power Platform (stub)
  • MITRE — research org publishing ATLAS adversarial-ML taxonomy
  • NIST — U.S. standards body publishing AI RMF and the 600-series profiles
  • OpenAI — AI lab; CoSAI member (stub)
  • OWASP — open-source security community; publisher of LLM Top 10, Agentic AI Top 10, AIVSS
  • Palo Alto Networks — stub
  • Stripe — Lethal Trifecta containment architecture; two [un]prompted talks ingested (Bullen containment, Zhang & Shah quality)
  • Wiz — stub
  • Dropbox — Brooks McMillin’s home-lab fleet talk source ([un]prompted March 2026); stub
  • Salesforce — Agentforce platform; 55K-tenant scale telemetry from the Rittinghouse [un]prompted talk
  • Sondera — coding-agent ABAC harness producer; Cedar policy reference monitor for coding agents; Maisel’s [un]prompted March 2026 talk
  • Starseer — mechanistic-interpretability-for-defense research org (Carl Hurd’s affiliation at [un]prompted March 2026)
  • Knostic — AI security vendor; knowledge-layer governance + coding-agent governance; producer of kirin
  • Oasis Security — identity security vendor specializing in NHI management at enterprise scale
  • Onyx Security — AI Control Plane vendor; producer of the Onyx Platform; Guardian Agent vendor category
  • Glean — enterprise AI search vendor; canonical example of oversharing risk in M365/SaaS-connector retrieval (stub)
  • Tenuo — Rust capability-warrant runtime (OSS); founded by Niki Aimable Niyikiza; 4 deployment modes (in-process / sidecar / gateway / MCP-proxy); 90%→0% multi-agent ASR on custom harness; 53/53 violations on 5,700 fuzz probes
  • Snap — stub; tracked because Niki Aimable Niyikiza is Security Engineer there (Tenuo’s warrant is Tenuo’s primitive, not Snap’s)
  • Apollo Research — UK AI-safety eval org; primary source for agent-agent steganographic collusion threat modeling
  • UK AI Safety Institute (AISI) — UK government pre-deployment evaluator; cyber-task autonomy benchmarks; Frontier AI Trends Report
  • CSET (Center for Security and Emerging Technology, Georgetown) — primary source for AI export controls, jurisdictional adversaries, regulatory leverage
  • Stanford HAI — academic-grade AI Index Report (annual, methodology-disclosed, cross-year comparable); primary citation for AI adoption-rate triangulation
  • METR — independent eval org; methodological foundation for long-task autonomy claims (the 7-month doubling underneath UK AISI’s 8-month cyber figure)
  • World Economic Forum — Global Cybersecurity Outlook annual; senior-leader survey-grade data on AI risk
  • ENISA — EU government cybersecurity body; Threat Landscape annual; non-US triangulation for breach-cost claims

Seed-stage agentic AI security startups (2025–2026 funding wave; see funding synthesis):

  • Project Glasswing — coalition initiative led by Anthropic (May 2026); 12 named launch partners (AWS, Anthropic, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, Linux Foundation, Microsoft, NVIDIA, Palo Alto Networks) plus 40+ extended organizations; 4M OSS donations; 90-day public reporting commitment; the organizing anchor for the ai-vuln-discovery and ai-in-sec-defense axes
  • Lumia Security — $18M seed Dec 2025 (Team8); Guardian-Agent class; ex-PerimeterX + Unit 8200; Adm. Mike Rogers advisory
  • Trent AI — $13M seed Apr 2026 (LocalGlobe + CIC); London-based; multi-coordinated-agents lifecycle platform; ex-AWS team
  • Runlayer — $11M seed Nov 2025 (Khosla + Felicis); MCP gateway; David Soria Parra (MCP creator) as advisor; 8 unicorn customers in 4 months
  • General Analysis — $10M seed Apr 2026 (Altos + Menlo + YC); adversarial-testing/CART for agentic AI; same Havaei/Liu/Li team as the Claude→Stripe coupons exploit research
  • Helmet Security — $9M seed Dec 2025 (SYN + WhiteRabbit); MCP discovery + monitoring + control; Fred Kneip (CyberGRX founder)
  • Keycard — $30M Series A Oct 2025 (a16z + boldstart, then Acrew); identity for AI agents; ex-Manifold/Snyk + Auth0/Passport.js
  • Capsule Security — $7M seed Apr 2026 (Lama + Forgepoint); runtime trust layer with explicit no proxy / no SDK positioning; ClawGuard OSS
  • SplxAI — $7M seed Mar 2025 (LAUNCHub); CART-style red-teaming; acquired by Zscaler — first exit in the cohort
  • XBOW — autonomous offensive-security platform; multi-model orchestration (Opus 4.7, Sonnet 4.6, Haiku 4.5, GPT 5.5, preview-stage Mythos); live-site web exploitation harness; canonical entry on the ai-in-sec-offense axis (May 2026)

Products

  • AgentCordon — self-hostable open-source Agentic IDP and credential broker (Rust, GPL-3.0); three-tier CLI/broker/server split; Cedar PDP + AES-256-GCM vault + MCP gateway + OAuth2 AS in one binary; OSS alternative to Conjur for self-hosted deployments
  • AgentGateway — open-source agent / MCP gateway (stub)
  • AutoGen — Microsoft OSS Python multi-agent framework (MIT); Swarm/AgentChat/Magentic patterns; one of two frameworks Unit 42 used to demonstrate framework-agnostic agentic vulnerabilities (May 2025)
  • CrewAI — OSS Python multi-agent framework (MIT); role/goal/task model with hierarchical delegation; one of two frameworks Unit 42 used to demonstrate framework-agnostic agentic vulnerabilities (May 2025)
  • Cedar — AWS OSS policy language (Apache 2.0, 2023); formal semantics + deny-by-default; Rust engine (sub-ms evaluation); primary reference implementation for the RA Control Plane PDP alongside OPA; AI governance tooling release March 2026
  • Firecracker — AWS OSS MicroVM (Apache 2.0); KVM-backed hardware-level isolation; 125ms boot; primary reference for per-task agent sandboxing in the RA Runtime Plane; production-proven in AWS Lambda + Fargate
  • Garak — NVIDIA’s open-source LLM vulnerability scanner; ~18+ probe categories; CMM D7 L4 probe library slot
  • gVisor — Google open-source container sandbox (Apache 2.0); user-space kernel interposer via Sentry; OCI-native runsc runtime; Runtime plane sandbox alternative to Firecracker for Kubernetes deployments
  • Kirin — Knostic’s coding-agent runtime security / governance enforcement product (Cursor, Copilot, IDE extensions)
  • Lakera Guard — commercial AI security API; real-time prompt injection + jailbreak + PII detection; continuously updated from Gandalf attack intelligence; Runtime plane content safety
  • LlamaFirewall — open-source AI guardrail framework (stub)
  • MDASH — Microsoft Multi-Model Agentic Scanning Harness — defender-side agentic vulnerability discovery and remediation system; 100+ specialized agents (auditors / debaters / dedup / provers) in a five-stage pipeline (Prepare → Scan → Validate → Dedup → Prove); ensemble of frontier and distilled models with second-SOTA-counterpoint design; top of CyberGym leaderboard at 88.45%; limited private preview (May 2026)
  • Microsoft Security Copilot — AI-augmented SOC product; M365 E5/E7; five Microsoft-built role-specialized agents (Security Analyst, Alert Triage, Conditional Access Optimization, Data Security Posture, Data Security Triage) + 15 Security Store partner agents; canonical defender-AI entry on the ai-in-sec-defense axis
  • Microsoft Entra Agent ID + Agent 365 Registry — Microsoft per-agent identity and lifecycle management; GA May 1, 2026; primary enterprise COTS for M365/Azure organizations; Identity plane agent-lifecycle + action-to-identity tracing
  • Miggo Security — ADR vendor extending to agentic AI; DeepTracing (runtime AI-BOM), behavioral drift detection, proof-of-guardrail attestation with AWS Nitro Enclaves
  • Mindgard CART — commercial 24/7 Continuous Automated Red Teaming SaaS; CMM D7 L4 continuous CART slot; UK-spinout backed by .406 Ventures
  • Big Sleep — Google Project Zero + DeepMind agent for variant-analysis vulnerability discovery; Project Naptime → Big Sleep lineage (June 2024 → Oct 2024); first AI to discover an exploitable memory-safety bug in real-world widely-used software (SQLite); July 2025 CVE-2025-6965 cited as first AI-foiled in-the-wild exploit; named in Glasswing as Google’s parallel AI-cyber tool
  • CodeMender — Google DeepMind AI agent for code-security patching (counterpart to Big Sleep’s discovery); Gemini Deep Think reasoner + program-analysis toolbox + multi-agent critique/judge validation; 72 OSS security patches upstreamed in 6 months; libwebp -fbounds-safety annotations as proactive-rewrite example; all patches human-reviewed
  • Mythos — Anthropic preview-stage frontier model (~5× Opus pricing at GA); strongest sourced advance on source-code-driven vulnerability discovery (42–55% FN reduction vs Opus 4.6 per XBOW’s eval); deployed via direct partnership (XBOW offensive, Project Glasswing defensive)
  • Okta for AI Agents — Okta’s purpose-built agent identity and lifecycle management platform; GA April 30, 2026; enterprise primary for Identity plane (agent-lifecycle + NHI governance)
  • Onyx Platform — five-surface unified AI control plane (Observability / Security / Governance / Orchestration / ROI); Onyx Guardian Agent; positioned as single-pane-of-glass alternative to Wiz + Prisma + AgentGateway
  • Rego (Open Policy Agent) — CNCF-graduated OSS policy engine (Apache 2.0); Datalog-inspired Rego language; Kubernetes-native; primary alternative to Cedar for orgs with existing OPA infrastructure (Conftest, Gatekeeper, Styra DAS ecosystem)
  • CyberArk Conjur (Agent Guard) — incumbent enterprise vault for credential proxying; Secure AI Agents initiative + Agent Guard for STDIO-based MCP flows; CyberArk acquired by Palo Alto Networks in 2026
  • Wiz AI-SPM — Wiz CNAPP’s AI Security Posture Management module; first-CNAPP-AI-SPM (2024); graph-based attack-path correlation; runtime AI agent monitoring (added 2025); part of Google Cloud Security
  • Palo Alto Prisma AIRS — Palo Alto’s end-to-end AI security platform (runtime + posture + model security + red teaming); GA April 2025; 2.0 GA October 2025 with Protect AI integration
  • Promptfoo — open-source LLM evaluation + red-teaming framework (now part of OpenAI); CI-gated regression suite; CMM D7 L4 regression slot; primary empirical source for model-version-degradation finding
  • PyRIT — Microsoft AI Red Team’s open-source orchestration framework; multi-turn attack strategies (Crescendo, TAP, Skeleton Key); CMM D7 L4 orchestration slot
  • Smokescreen — Stripe’s open-source SSRF / egress proxy; the network-side control point for Lethal Trifecta containment
  • Toolshed — Stripe’s internal central MCP proxy / tool registry; PEP for tool-call policy via ToolAnnotations
  • AgentDojo — peer-reviewed (NeurIPS) independent prompt-injection benchmark; Meta uses it to evaluate LlamaFirewall PromptGuard 2; the third-party comparator for vendor self-eval claims
  • Salesforce Agentforce — Salesforce’s enterprise agent platform; production source for the Rittinghouse 1.8M-prompts / 30-alerts telemetry case study
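
The Cedar and Rego entries above share one evaluation model: deny-by-default, explicit permits, and forbid overriding permit. A minimal sketch of that decision logic follows; the rule shape and names are illustrative, not Cedar's actual syntax or engine:

```python
# Deny-by-default PDP sketch in the spirit of Cedar / OPA: a request is
# allowed only if some permit rule matches and no forbid rule does.
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    effect: str          # "permit" or "forbid"
    principal: str       # agent identity, "*" for any
    action: str          # tool/action name, "*" for any
    resource: str        # resource name, "*" for any

    def matches(self, principal: str, action: str, resource: str) -> bool:
        pairs = [(self.principal, principal), (self.action, action), (self.resource, resource)]
        return all(pattern in ("*", value) for pattern, value in pairs)

def decide(rules: list[Rule], principal: str, action: str, resource: str) -> str:
    hits = [r for r in rules if r.matches(principal, action, resource)]
    if any(r.effect == "forbid" for r in hits):
        return "Deny"        # explicit forbid overrides any permit
    if any(r.effect == "permit" for r in hits):
        return "Allow"
    return "Deny"            # deny-by-default: no matching permit

policy = [
    Rule("permit", "agent:triage", "read", "*"),
    Rule("forbid", "*", "read", "vault:prod-secrets"),
]
```

Under this policy a hypothetical `agent:triage` can read ordinary resources but is denied on `vault:prod-secrets`, and any unlisted principal is denied everything, which is the property that makes the pattern attractive as a reference-monitor core for agent tool calls.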

People

  • Aaron Brown — co-author (with Matt Saner) of the AWS Agentic AI Security Scoping Matrix (Nov 2025); stub
  • Andrew Bullen — Head of AI Security at Stripe; coined the “Lethal Bifecta”; presented Stripe’s containment architecture at [un]prompted March 2026
  • Avivah Litan — Distinguished VP Analyst at Gartner; primary author of the Guardian Agents Market Guide (Feb 2026)
  • Ben Nassi — security researcher; lead author on “Invitation Is All You Need” Google Calendar / Gemini indirect-injection demonstration (cited in Lidzborski’s Workspace talk)
  • Bill McIntyre — author of Securing Your Agents deck (AIE / RMAIIG, 2026)
  • Bob Rudis — VP Data Science, GreyNoise Labs
  • Brandon Gummer — Gartner VP Analyst; co-presenter of Scaling Agentic AI: A Leadership Guide for CIOs (May 2026); name spelling unverified
  • Brooks McMillin — Dropbox engineer; presented home-lab 19-agent / 73-tool fleet study at [un]prompted March 2026; introduced agent-memory-isolation and context-aware-trimming patterns
  • Carl Hurd — Starseer researcher; presented Glass-Box Security (mechanistic interpretability for defense) at [un]prompted March 2026
  • Daniel Miessler — stub
  • Daryl Plummer — Distinguished VP Analyst at Gartner; co-author of the Guardian Agents Market Guide
  • Remy Gulzar — Gartner VP Analyst; co-presenter of Scaling Agentic AI: A Leadership Guide for CIOs (May 2026); name spelling unverified
  • Dongdong Sun — Senior Staff ML Engineer, Palo Alto Networks
  • Jeffrey Zhang — Stripe AI security engineer; co-presenter of Guardrails Beyond Vibes at [un]prompted March 2026 (with Sid Shah)
  • Johann Rehberger — independent red-team researcher; “Month of AI Bugs” (Aug 2025); Embrace The Red; presented “Your Agent Works for Me Now” at [un]prompted March 2026 (introduced promptware and delayed-tool-invocation)
  • John Hastings — co-author (with Sunil Arora) of the MAAIS arXiv preprint (Dec 2025); affiliations unverified; stub
  • Matt Maisel — Sondera engineer; presented “Hooking Coding Agents with Cedar” at [un]prompted March 2026 (deterministic reference monitor for coding agents)
  • Matt Rittinghouse — Salesforce engineer; co-presenter of 1.8M Prompts, 30 Alerts at [un]prompted March 2026
  • Matt Saner — co-author (with Aaron Brown) of the AWS Agentic AI Security Scoping Matrix (Nov 2025); stub
  • Millie Rittinghouse — Salesforce engineer; co-presenter of 1.8M Prompts, 30 Alerts at [un]prompted March 2026
  • Mohamed Nabeel — Sr Principal Researcher, Palo Alto Networks
  • Nicolas Lidzborski — Principal Software Engineer, Google Workspace security; ~3 years on GenAI security; speaker at [un]prompted March 2026
  • Niki Aimable Niyikiza — Founder @ Tenuo; Security Engineer @ Snap; ~10 yrs infra security (Google, Datadog, Snap); presenter at [un]prompted March 2026; coined the valet-key / Map vs Territory framings for capability-based authorization
  • Peter Smith — Director, Agentic SOC Product Management at Salesforce; co-presenter of Beyond the Chatbot at [un]prompted March 2026 (with Ravi Kiran Sharma)
  • Ravi Kiran Sharma (RK) — Lead Security Engineer at Salesforce; co-presenter of Beyond the Chatbot at [un]prompted March 2026 (with Peter Smith)
  • Sid Shah — Stripe AI security engineer; co-presenter of Guardrails Beyond Vibes at [un]prompted March 2026 (with Jeffrey Zhang)
  • Simon Willison — independent researcher; coined the Lethal Trifecta (Jun 2025); simonwillison.net
  • Sounil Yu — author of The Cyber Defense Matrix (Wiley, 2022); collaborator with Knostic on AI-era extensions of CDM (stub)
  • Sunil Arora — co-author (with John Hastings) of the MAAIS arXiv preprint (Dec 2025); affiliations unverified; stub
  • Taesoo Kim — lead author of Microsoft’s MDASH announcement (May 2026); represents/leads Microsoft’s Autonomous Code Security (ACS) team; prior systems-security and DARPA AI Cyber Challenge background (unverified)
  • Vasu Jakkal — Corporate Vice President, Microsoft Security; lead byline on Microsoft Security Blog announcement posts (incl. Secure Agentic AI End-to-End, March 2026)
  • Yonatan Zunger — Microsoft Security leader; bylined author of the 2026-02-03 Microsoft SDL AI-extension announcement; stub (bio pending primary-source confirmation)
  • Apostol Vassilev — NIST Computer Security Division researcher; co-author of SP 800-218A and lead author of NIST AI 100-2e2023 Adversarial Machine Learning: A Taxonomy and Terminology (the federal-anchor adversarial-ML reference cited throughout the wiki); stub

Thesis Pages

See Thesis Index.

  • Security Controls for AI Stacks — six-layer inventory mapping existing controls to layers; flags gaps (egress, AI-BOM, model-layer)
  • Secure-SDLC Framework Stack for 2026 — Is NIST SSDF + OWASP SAMM Enough? — evaluates the “anchor policy to SSDF, assess maturity via SAMM” claim; structurally correct foundation, materially incomplete for AI-augmented threat model and AI-component governance; recommended 6-layer stack adds AI overlay (NIST AI RMF / Agentic AI Security CMM / ISO 42001), supply-chain layer (SLSA + CycloneDX), benchmark (BSIMM), and operational alignment (NIST CSF 2.0)

Incidents

See Incidents Index.

Gaps and Open Questions

See Gaps Index.

Comparisons

See Comparisons Index.

  • Cybersecurity CMM Exemplars — design lessons from CMMI / BSIMM / OWASP SAMM / CMMC 2.0 / NIST CSF 2.0 for any new AI-security CMM
  • [un]prompted March 2026 Talks vs RA + CMM — Tier 1/2/3 relevance ranking of [un]prompted (March 3–4, 2026) conference talks against the RA planes and CMM domains
  • Wiki Novelty and Counter-Arguments — 2026 — what the wiki actually contributes vs OWASP/NIST/Gartner/CSA + per-thesis competing-view callouts (platform-vs-prompt, Gartner 50% elimination, Lethal Trifecta unconditional, floor rule, eval-harness absorber, UEBA-for-agents); 10 unresolved contests logged; closes peer-review-readiness §7
  • Agentic AI Security Seed Funding — May 2025 to May 2026 — top 8 seed rounds (~$85M total) ranked, mapped to RA planes + CMM domains; gateway-vs-runtime-instrumentation architectural fault line; D6 absence and CART-consolidation pattern; commercial reference-implementation deltas filed
  • PwC, Microsoft, ASL, and OWASP Don’t Share Axes — scoping analysis for the parked PwC/MS/Anthropic/OWASP comparison candidate; documents the three-categorical-groups finding (org-program maturity vs model-capability tier vs threat-coverage catalog) and why a single-axis comparison would be misleading

Folds

  • Fold k4 — 2026-05-04 to 2026-05-07 — n16 — extractive rollup of 16 log entries; dominant themes: standards-anchoring discipline, CMM consolidation toward a single canonical home, lint-driven backfill paired with tooling tightening.

Recent Activity