Offensive AI: State of the Field
Question
How has agentic AI changed offensive operations in 2026, and what does the kill chain look like when AI is the abstraction layer attackers operate on? Specifically: which kill-chain phases (recon, initial access, persistence, C2, lateral movement, exfiltration) now have demonstrated AI-assisted or AI-driven techniques? Which tools (commercial and OSS) are in operator use? Where are the trust and capability boundaries between human-operator-with-AI-assistant and fully-autonomous offensive agents?
Current Position
The 2026 offensive-AI surface is dominated by promptware as the unit of capability — multi-stage operations written in natural language, executed across heterogeneous agentic substrates (Promptware). The kill chain has migrated up a layer: instead of crafting binaries that execute on a target operating system, operators craft prompts that execute across an agent’s tool surface. [[your-agent-works-for-me-now-rehberger-talk|Rehberger’s [un]prompted 2026 talk]] demonstrated the kill chain across production systems — Gemini Workspace, Microsoft Enterprise Copilot, ChatGPT, OpenClaw, KimiCloud — establishing that this is not a research-curiosity but an operational reality.
Three load-bearing patterns characterize the field:
- Prompt-level C2 — Agent Commander demonstrates command-and-control infrastructure built on prompt abstractions: enrollment, exfiltration, arbitrary dispatch. The C2 surface is now the agent itself.
- Tool-set manipulation — tool poisoning and delayed tool invocation are the AI-era equivalents of supply-chain and timing attacks; both are vendor-disclosed and demonstrated.
- Continuous adversarial testing as offense — General Analysis ($10M seed, April 2026) is productizing CART (Continuous Adversarial Red Team) for agentic AI; the same orchestration applies symmetrically to attacker tradecraft.
Supporting Evidence
- Claude+Stripe coupons exploit (July 2025) is the canonical multi-MCP context-pollution case study — demonstrates that production AI app stacks have exploit surfaces that classical SAST/DAST will not find.
- Month of AI Bugs coordinated-disclosure series documents the breadth of model-and-app vulnerability across frontier vendors.
- Red Teaming Capability Framework maps OWASP LLM Top 10, OWASP Agentic AI Top 10, MITRE ATLAS, and CSA MAESTRO to a layered red-team practice — Tier 1–5 from “standards aware” to “vendor evaluation lead.”
Counter-Evidence
Update 2026-05-13: XBOW now has a dedicated wiki page, sourced from XBOW’s Mythos Evaluation (May 2026). XBOW operates as a multi-model orchestration layer (Opus 4.7, Sonnet 4.6, Haiku 4.5, GPT 5.5, plus preview-stage Mythos) that converts frontier-model vulnerability candidates into validated exploits via live-site interaction harnesses. Quantitatively material: 42–55% false-negative reduction vs Opus 4.6 on XBOW’s web exploit benchmark when paired with Mythos. XBOW fits cleanly at Tier 5 (Vendor Evaluation) of the Red Teaming Capability Framework and anchors the ai-in-sec-offense axis.
Prophet AI / Dropzone (offensive-side)
Both have offensive-adjacent capabilities under SOC framing. Need clarification of where they sit on the WITH-AI vs. FOR-AI axis.
Autonomous offensive agents — production reality?
Rehberger’s work demonstrates human-operated AI-augmented offense at scale. The line between “operator with copilot” and “autonomous offensive agent” is currently rhetorical, not technical. Need sourced examples or explicit acknowledgment that the autonomous-agent threat is forecast, not present.
How This Has Evolved
Seeded 2026-05-13. The offensive-side material is well-covered for practitioner research (Rehberger, General Analysis, the [un]prompted conference cohort) but thin on commercial tooling. The first ingest sprint will target the gap-flagged vendors.
Open Sub-Questions
- Does offensive AI deserve its own anchor artifact (a maturity model or reference architecture) or is the right address a thesis page that periodically annexes the relevant CMM domains?
- How do defender CMM levels translate to offensive-capability tiers? Is there a useful symmetry — i.e., does an enterprise at CMM L4 face L4-equivalent offensive AI? Or is the relationship asymmetric?
- See Gaps Index for related open questions.