Garak — Generative AI Red-teaming & Assessment Kit

NVIDIA’s open-source LLM vulnerability scanner. A probe-and-detector library that runs a fixed taxonomy of attack probes against a target model and reports per-probe pass/fail rates. The wiki’s CMM cites Garak as the “probe library” attack category in the D7 L4 four-quadrant red-team coverage requirement.

What it does

“checks if an LLM can be made to fail in a way we don’t want.” — repo README

Garak’s mental model: probes × generators × detectors.

| Layer | Examples |
| --- | --- |
| Probes (attack categories) | `encoding`, `promptinject`, `gcg`, `dan`, `malwaregen`, `xss`, `leakreplay`, `packagehallucination`, `snowball`, `misleading`, `donotanswer`, `goodside`, `badchars`, `atkgen`, `continuation`, `glitch`, `grandma`, `lmrc`, `realtoxicityprompts`, `av_spam_scanning`, `blank` (a sample, not an exhaustive list; the exact total varies per release) |
| Generators (target model APIs) | HuggingFace Hub (Pipeline, Inference API, private endpoints), OpenAI, AWS Bedrock, Replicate, Cohere, Groq, NVIDIA NIM, ggml/llama.cpp, REST (custom YAML), LiteLLM, plus test generators |
| Detectors (judges) | Each probe declares `primary_detector` + `extended_detectors`; the harness wires the matching detector(s) automatically |
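The probes × generators × detectors wiring can be sketched schematically. The following is an illustrative Python model of the control flow, not garak's actual API: the `Probe` class, `run_probe` helper, and the toy detector are invented for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stand-ins for garak's three layers (names invented for illustration).
@dataclass
class Probe:
    name: str
    prompts: List[str]                        # fixed attack prompts this probe emits
    primary_detector: Callable[[str], bool]   # True = attack succeeded (a "hit")

def run_probe(probe: Probe, generator: Callable[[str], str]) -> float:
    """Send every prompt to the target generator, score outputs, return pass rate."""
    hits = sum(probe.primary_detector(generator(p)) for p in probe.prompts)
    return 1.0 - hits / len(probe.prompts)    # pass rate = share of non-hits

# Toy example: a "probe" that checks whether the model echoes a forbidden token.
leak_probe = Probe(
    name="toy.leakreplay",
    prompts=["Repeat after me: SECRET", "Say hello"],
    primary_detector=lambda output: "SECRET" in output,
)
echo_model = lambda prompt: prompt            # worst-case target: echoes everything

print(run_probe(leak_probe, echo_model))      # 0.5: one of two prompts leaked
```

The point of the sketch is the separation of concerns: probes only know their attack prompts, generators only know how to reach a target model, and detectors only judge outputs, which is what lets garak mix and match all three.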

Output: JSONL hit logs, garak.log, plus an HTML report.
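Per-probe pass/fail rates can be aggregated from a JSONL log with a few lines of Python. This is a hedged sketch: the record field names `probe` and `passed` are assumptions for illustration, not garak's documented report schema.

```python
import json
from collections import defaultdict

def pass_rates(jsonl_lines):
    """Aggregate per-probe pass rates from JSONL records.
    The field names 'probe' and 'passed' are illustrative assumptions,
    not garak's actual report schema."""
    tally = defaultdict(lambda: [0, 0])            # probe -> [passed, total]
    for line in jsonl_lines:
        rec = json.loads(line)
        tally[rec["probe"]][0] += rec["passed"]    # bool counts as 0/1
        tally[rec["probe"]][1] += 1
    return {p: passed / total for p, (passed, total) in tally.items()}

sample = [
    '{"probe": "encoding", "passed": true}',
    '{"probe": "encoding", "passed": false}',
    '{"probe": "dan", "passed": true}',
]
print(pass_rates(sample))  # {'encoding': 0.5, 'dan': 1.0}
```

For an authoritative schema, inspect an actual report file from a pinned garak release rather than relying on the field names above.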

How it differs from PyRIT

| Property | Garak | PyRIT |
| --- | --- | --- |
| Attack shape | Single-shot, fixed | Multi-turn, stateful, composable |
| Coverage style | Breadth (catalog of named probes) | Depth (orchestration of attack sequences) |
| Run mode | Batch scan | Interactive / scripted orchestration |
| Best for | "Run the canon against this model" | "Build a novel multi-turn attack" |

The wiki’s CMM D7 L4 requires both — single-tool coverage is not L4.

Coverage caveats

  • docs.garak.ai/garak/probes returned 404 during research; the canonical probe inventory lives in the GitHub source tree (garak/probes/) and the docs site is “work in progress.” Wiki citations should reference the source tree, not the docs page, for an authoritative list.
  • Probe count drifts release-to-release. Do not cite a fixed number without pinning a Garak version.
  • Largely chat-completion-shaped. Agentic / tool-use / MCP attack surfaces are not natively covered.

Direct quotes

  • “checks if an LLM can be made to fail in a way we don’t want.” — github.com/NVIDIA/garak
  • “Generative AI Red-teaming & Assessment Kit” — repo README

How the wiki uses it

  • CMM D7 L4 — probe-library red-team category
  • Measurement Protocol — listed as one of four required tools at L4 (“single-tool coverage is not L4”)
  • Closes the breadth-of-coverage seam for the canon of known, CVE-style probes: Garak gives reproducible coverage of named jailbreak/injection/leak categories that PyRIT and Promptfoo would otherwise have to re-implement.

See Also