Johann Rehberger
Independent red-team researcher widely credited with the most prolific public corpus of responsibly-disclosed prompt-injection and agent-compromise vulnerabilities. Publishes at the Embrace The Red blog. Originated and led the “Month of AI Bugs” initiative in August 2025, in which he cataloged dozens of successful attacks against every frontier model and every major agentic development kit during a single calendar month — a deliberate echo of the Month of Browser Bugs / Month of Bugs tradition from earlier-era offensive security.
At [un]prompted (March 2026) he was the headline offensive track speaker on Stage 2 and received an extended standing introduction from Gadi Evron. The talk — “Your Agent Works for Me Now” — disclosed two previously unreported attack techniques (Delayed Tool Invocation and Agent Commander) and introduced the Promptware framing.
Research Contributions
Framing Contributions
- Promptware (coined/popularized, 2026) — the reframing of agentic AI attacks from single-turn injection events to multi-stage malware with kill chain structure. Echoes Ben Nassi’s concurrent Promptware Kill Chain paper.
- “Offensive context engineers” — Rehberger’s characterization of red-teamers working in the LLM era, reflecting the shift from technical exploit craft to natural-language prompt engineering.
- Normalization of deviance — Rehberger’s application of the systems-safety concept to AI adoption: the pattern where operational familiarity with AI in test contexts breeds unwarranted trust in production contexts.
Technique Contributions
- Delayed Tool Invocation (disclosed March 2026) — bypasses platform-level tool deactivation controls by embedding a conditional trigger that fires the tool in a subsequent conversation turn; demonstrated against Gemini Workspace, Microsoft Copilot, ChatGPT, and Google Home.
- Agent Commander — Prompt-Level C2 (disclosed March 2026) — a prompt-native command-and-control tool that enrolls agents into a C2 server via natural-language promptware; demonstrated zero-click enrollment via OpenClaw’s Gmail PubSub feature; cross-platform (OpenClaw + KimiCloud).
- ASCII / Unicode tag character steganography — exploitation of invisible Unicode tag characters to embed injection payloads in files and issue tickets; first public Xcode injection demonstration.
- Spyber (prior, Black Hat) — continuous exfiltration via memory-poisoned ChatGPT agent; every user keystroke forwarded to attacker after initial enrollment.
Disclosure History (Selected)
- Jules AI compromise (Aug 2025) — full five-stage kill chain on Google’s Jules coding agent, from a hidden GitHub-issue prompt injection through persistence in project files to remote command-and-control.
- Month of AI Bugs (Aug 2025) — 30+ days of daily coordinated disclosures; all coordinated to release simultaneously; attacks documented against every frontier model and major agentic dev kit.
- Multiple GeminiJack-class attacks against Google Gemini’s tool-use surfaces.
- ChatGPT memory persistence attacks — zero-click long-term memory poisoning via shared content.
- Markdown-image and URL-fetch exfiltration patterns — documented across multiple frontier-model deployments.
- Microsoft Enterprise Copilot memory implant (disclosed Dec patch; [un]prompted 2026 case study) — injection via file summarization writes persistent attacker-controlled memories to enterprise Copilot.
- Apple Xcode prompt injection (March 2026) — first public demonstration of injection via Xcode’s AI code review + RunSnippet tool.
- Gemini + Google Home physical actuator control (March 2026) — delayed tool invocation causes Google Home speaker to play attacker-specified audio via document title as intent signal.
Relevance to the Wiki
Rehberger’s work is the primary public empirical evidence that the threat model of indirect injection → tool abuse → exfiltration → persistence → C2 is not theoretical. The Securing Your Agents deck cites his Month of AI Bugs as the proof-by-existence basis for treating agent-tool integrations as adversarial-by-default. His [un]prompted 2026 talk extends that evidence base with two new structural techniques and the promptware malware framing — moving the conversation from “injection as a class” to “injection as initial access in a multi-stage campaign.”
See Also
- [[your-agent-works-for-me-now-rehberger-talk|“Your Agent Works for Me Now” — [un]prompted March 2026]] — the primary talk page
- Month of AI Bugs (August 2025) — the August 2025 disclosure series
- Jules AI Kill Chain — Indirect Injection to Full Remote Control — the canonical case study from that series
- Promptware — the malware framing he introduced
- Delayed Tool Invocation — the tool-reactivation bypass technique
- Agent Commander — Prompt-Level Command and Control — the C2 infrastructure tool
- Simon Willison — companion public-research voice on prompt injection
- Unit 42 In-the-Wild Prompt Injection Observations — Palo Alto’s production-telemetry counterpart