RAPTOR

Sources: GitHub — gadievron/raptor · Mythos-ready paper §Appendix A Historical Precedent + PA 1 (Point Agents at Your Code and Pipelines) · Smithery skill listing.

What

RAPTOR (Recursive Autonomous Penetration Testing and Observation Robot) is an open-source autonomous security research framework built on top of Claude Code (v3.0.0; the analysis layer is pluggable, so other coding-agent platforms can be substituted). It chains static analysis (Semgrep and CodeQL), binary analysis and crash analysis (AFL++ fuzzing and the rr deterministic debugger), LLM-powered vulnerability validation, exploit generation, and patch writing into a single workflow that runs against a codebase or a binary. RAPTOR itself is MIT-licensed; CodeQL is covered by its own license, which does not permit commercial use.

The project describes itself as “not polished software. It was built in free time, held together with enthusiasm and duct tape, and it works well enough that we can’t stop using it.”

Authors

Capability Surface

RAPTOR exposes a slash-command interface inside Claude Code. Stable commands as of repository fetch (2026-05-15):

Command           What it does
/agentic          Full autonomous workflow: scan, validate, exploit, patch
/scan             Static analysis with Semgrep and CodeQL
/understand       Map attack surface, trace data flows, hunt vulnerability variants
/validate         Multi-stage exploitability validation pipeline (Stages 0–F)
/codeql           CodeQL-only deep analysis with SMT dataflow pre-screening
/fuzz             Binary fuzzing with AFL++ and crash analysis
/crash-analysis   Autonomous root-cause analysis for C/C++ crashes
/oss-forensics    Evidence-backed forensic investigation for GitHub repositories
/project          Named workspaces to organize runs and track findings over time

Beta commands: /exploit (PoC exploit generation), /patch (secure-patch generation). Alpha: /web (web app scanning).

The rr deterministic debugger requires the --privileged flag when it runs inside the dev container. The image is about 6 GB and is built on Microsoft’s Python 3.12 devcontainer.
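
As a rough illustration of the container requirement above, the following Python sketch assembles a `docker run` invocation that includes `--privileged`. The image tag `raptor-devcontainer:local` and the `/workspace` mount point are hypothetical placeholders, not names taken from the RAPTOR repository. Only `--rm`, `-it`, `-v`, and `--privileged` are standard Docker CLI flags.

```python
import shlex
from pathlib import Path

# Hypothetical image tag: the source does not name the built image.
IMAGE = "raptor-devcontainer:local"


def build_run_command(workdir: Path, privileged: bool = True) -> list[str]:
    """Return an argv list for starting the dev container.

    rr relies on hardware performance counters and ptrace, which Docker's
    default confinement can block; --privileged lifts those restrictions.
    """
    argv = ["docker", "run", "--rm", "-it"]
    if privileged:
        argv.append("--privileged")
    # Mount the target codebase at /workspace (an assumed mount point).
    argv += ["-v", f"{workdir}:/workspace", IMAGE]
    return argv


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(shlex.join(build_run_command(Path("/tmp/target"))))
```

Building the command as a list rather than a single string keeps paths with spaces intact if the list is later passed to `subprocess.run`. Because `--privileged` removes most container isolation, it is worth reserving for runs that actually need rr.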

Relevance to This Wiki

RAPTOR closes the author-tracing loop in the wiki’s coverage of the Mythos-era vuln-discovery cluster. Its lead author, Gadi Evron, also:

  • Co-introduced VulnOps (with Heather Adkins and Bruce Schneier, October 2025),
  • Lead-authored the [[mythos-ready-csa-sans-unprompted-2026-04-12|CSA / SANS / [un]prompted / OWASP Mythos-ready briefing]] (April 2026),
  • Runs Knostic (which ships OpenAnt and Kirin).

The Mythos-ready playbook recommends RAPTOR, alongside OpenAnt, in PA 1 (Point Agents at Your Code and Pipelines) as an open-source autonomous-security-research framework that teams can put to work on Monday morning.

Structurally, RAPTOR is the offensive/defensive twin to OpenAnt. OpenAnt operates a six-stage vuln-discovery pipeline that uses a constrained attacker persona for false-positive (FP) control. RAPTOR is a Claude-Code-skill-driven multi-stage framework that goes further down the pipeline, adding exploit generation and patch writing to discovery and validation. Both are auditable open-source instruments, and both are recommended by the same briefing.

RAPTOR’s multi-stage exploitability validation pipeline (Stages 0–F) is structurally adjacent to OpenAnt’s six-stage pipeline (Parse → Reachability → Classification → Discovery → Verification → Dynamic) and to MDASH’s five-stage pipeline (Prepare → Scan → Validate → Dedup → Prove). The architectural convergence on multi-stage validation as the FP-control discipline (see Adversarial Reflexion) therefore extends from frontier-vendor harnesses through OSS instruments down to this Claude-Code-skill-level tool.
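
The general idea of multi-stage validation as false-positive control can be sketched in a few lines of Python: each candidate finding must pass every stage in order, and any stage may reject it. This is a minimal illustration under stated assumptions, not RAPTOR's implementation. The `Finding` fields and the two stage names are hypothetical and do not correspond to RAPTOR's Stages 0–F, whose contents are not described here.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass
class Finding:
    rule: str
    reachable: bool
    attacker_controlled: bool
    trail: list[str] = field(default_factory=list)  # names of stages passed


# A stage is a name plus a predicate; a finding that fails is dropped there.
Stage = tuple[str, Callable[[Finding], bool]]

# Hypothetical stages for illustration only; not RAPTOR's Stages 0-F.
STAGES: list[Stage] = [
    ("reachability", lambda f: f.reachable),
    ("input-control", lambda f: f.attacker_controlled),
]


def validate(findings: Iterable[Finding], stages: list[Stage]) -> list[Finding]:
    """Keep only findings that pass every stage, recording an audit trail."""
    survivors = []
    for finding in findings:
        for name, check in stages:
            if not check(finding):
                break  # rejected at this stage; later stages never run
            finding.trail.append(name)
        else:
            # The inner loop finished without a break: every stage passed.
            survivors.append(finding)
    return survivors


if __name__ == "__main__":
    candidates = [
        Finding("sql-injection", reachable=True, attacker_controlled=True),
        Finding("buffer-overflow", reachable=False, attacker_controlled=True),
        Finding("path-traversal", reachable=True, attacker_controlled=False),
    ]
    for f in validate(candidates, STAGES):
        print(f.rule, f.trail)
```

Of the three candidates, only the first passes both checks. Ordering cheap checks first means an expensive later stage (for example, dynamic proof) only runs on findings that have already survived the earlier filters, and the recorded trail shows where each rejected finding stopped.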

Adjacent / Open

  • No published benchmarks (recall, FP rate) comparable to MDASH 88.45% / Aardvark 92% / OpenAnt filter-ratio metrics. The framework is positioned as practitioner-grade rather than benchmark-validated.
  • Pluggable analysis layer — the README is explicit that RAPTOR is “not tied to” Claude Code; the abstraction surface for plugging in alternative LLM-coding-agent platforms is not documented in the README excerpt and is a follow-up question.
  • Smithery distribution — the skills are also distributed via Smithery; the relationship between the Smithery skill package and the GitHub repository (versioning, parity) is not captured here.