Source Triangulation Audit — 2026-05-02

Closes peer-review-readiness honorable-mention #1: “Statistics drawn from a narrow source set — Gartner, Insight Partners, Knostic, OWASP — same handful. No academic literature or industry surveys beyond these.” This audit triangulates each load-bearing claim against academic and government-survey sources, with an explicit verdict on what is corroborated, what is contested, and what remains single-sourced.

Headline result

Of 8 audited load-bearing claims, 5 are corroborated by independent academic or government-survey sources after this pass, 2 are partially corroborated (direction confirmed, exact figures single-sourced), and 1 is contested. Counting sub-claims within otherwise-corroborated claims, three specific figures remain genuinely single-sourced and should be labeled as such in the wiki. Five new sources (Stanford HAI AI Index, Verizon DBIR, METR, WEF Global Cybersecurity Outlook, AgentDojo) join the wiki’s citation rotation.

Per-claim triangulation

Claim 1 — NHI scale and growth (82:1 ratio, 400% growth)

Wiki citations today: CyberArk (82:1), Rubrik Zero Labs (45:1), arXiv 2503.18255 (50K → 250K, 2021–2025)
Independent corroboration: Verizon DBIR 2025 — third-party breach involvement doubled YoY (15% → 30%), driven by ungoverned machine accounts; SailPoint Horizons of Identity Security 2025–2026 — 69% of orgs have more machine than human identities, and roughly half report a 10× ratio; GitGuardian/Verizon analysis — 441,780 exposed secrets in public repos, 39% in web-app infrastructure; ENISA Threat Landscape 2025 — confirms the identity/credential acceleration.
Verdict: Partially corroborated. The direction (NHI ≫ human, growing fast) is confirmed across 5+ independent sources. The exact 82:1 figure is single-vendor (CyberArk’s Vanson Bourne survey, n=2,600). The wiki should cite the directional claim with multi-source backing and label “82:1” as CyberArk-specific.

Claim 2 — AI agent adoption rates and shadow AI

Wiki citations today: Knostic (75% knowledge-worker GenAI use, 78% BYOAI) — vendor blog
Independent corroboration: Microsoft Work Trend Index 2025 — the actual primary source for the 75%/78% numbers Knostic cites; Stanford HAI AI Index 2025 — 78% business AI adoption (up from 55% in 2023); McKinsey State of AI Nov 2025 — 88% of orgs use AI in ≥1 function (n=1,993, 105 countries); BCG “Build for the Future 2025” — 72% regular use, only 13% with AI agents in production; IBM Cost of a Data Breach 2025 — 20% had shadow-AI breaches, at a $670K cost premium.
Verdict: Corroborated, but the wiki cites a downstream vendor when the original is government-survey-quality. Replace the Knostic citations with Microsoft Work Trend Index 2025 directly; add Stanford HAI for the cross-year curve and BCG for the agentic-vs-GenAI gap (the 13%-vs-72% delta is load-bearing).

Claim 3 — Agent-orchestrated attack capability scaling (8-month cyber-task doubling)

Wiki citations today: UK AISI Frontier AI Trends Report; Anthropic GTG-1002
Independent corroboration: METR, “Measuring AI Ability to Complete Long Tasks” (arXiv 2503.14499) — generalist task horizon doubles every ~7 months (2019–2025), accelerating to ~4 months in 2024–2025; this is the foundational methodology UK AISI’s 8-month cyber figure builds on. Apollo Research, “More Capable Models Are Better at In-Context Scheming” — capability scaling correlates with scheming/deception. CSET, “Autonomous Cyber Defense” — government think-tank assessment.
Verdict: Corroborated. Strong triangulation — METR (peer-reviewable methodology), UK AISI (government measurement), Apollo (capability–scheming correlation). The wiki should add METR as the methodological foundation rather than citing UK AISI standalone.
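
The doubling-time arithmetic behind this claim is easy to sanity-check. A minimal sketch, assuming pure exponential growth; the 15-minute baseline horizon and 480-minute (8-hour) target below are illustrative placeholders, not figures from METR or UK AISI:

```python
import math

def task_horizon(months_elapsed: float, baseline_minutes: float,
                 doubling_months: float) -> float:
    """Task horizon an agent can complete, under exponential doubling."""
    return baseline_minutes * 2 ** (months_elapsed / doubling_months)

def months_to_reach(target: float, baseline: float, doubling: float) -> float:
    """Invert the growth curve: how long until the horizon hits `target`."""
    return doubling * math.log2(target / baseline)

# Illustrative: a 15-minute horizon doubling every 7 months reaches a
# full 8-hour workday (480 min) after 5 doublings, i.e. 35 months.
print(round(months_to_reach(480, 15, 7), 1))  # 35.0
```

The same two-line inversion shows why the doubling period matters so much: shortening it from 7 to 4 months (the 2024–2025 acceleration METR reports) cuts that timeline from 35 to 20 months.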

Claim 4 — MCP CVE explosion (30+ CVEs in 60 days, 82% path traversal, 66% code injection)

Wiki citations today: single-source claim; origin unspecified, but it reads like a vendor blog
Independent corroboration: arXiv 2503.23278, “MCP: Landscape, Security Threats, and Future Research Directions” — peer-reviewed (ACM TOSEM); 4-phase / 16-activity threat taxonomy. arXiv 2506.13538, “MCP at First Glance” — empirical study of 1,899 OSS MCP servers; 66% have code smells, 14.4% bug patterns. arXiv 2510.16558, “Toward Understanding Security Issues in the MCP Ecosystem” — 67,057 servers across 6 registries. VulnerableMCP database (vulnerablemcp.info). eSentire reports ~22% path traversal in tested servers (not the wiki’s 82%).
Verdict: Contested. The existence of an MCP CVE wave is well-corroborated; the specific percentages do not match peer-reviewed denominators. The wiki’s “66% code injection” looks like a misread of the arXiv “66% code smells” figure, which is measured on a different sample. Action: re-derive the percentages from arXiv 2503.23278 / 2506.13538 / 2510.16558 / VulnerableMCP, or label them “[vendor-source]-derived; academic surveys report different denominators.”
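
The denominator problem driving this verdict is mechanical: the same raw count produces wildly different percentages depending on the population it is divided by. A minimal sketch with hypothetical counts (the 25-finding figure below is a made-up placeholder, not a re-derived number; only the 30-CVE and 1,899-server denominators come from the sources above):

```python
def rate(numerator: int, denominator: int) -> float:
    """Percentage, guarding against a zero denominator."""
    return 100 * numerator / denominator if denominator else float("nan")

# Hypothetical: 25 path-traversal findings look alarming against a
# 30-CVE sample, but marginal against the 1,899-server OSS corpus
# studied in arXiv 2506.13538.
print(f"{rate(25, 30):.0f}%")    # 83% of the small CVE sample
print(f"{rate(25, 1899):.1f}%")  # 1.3% of the full server corpus
```

Any re-derived percentage added to the wiki should therefore state its denominator inline ("X% of N CVEs" vs "X% of N servers"), since the two are not comparable.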

Claim 5 — Prompt injection / detection rates

Wiki citations today: Anthropic Constitutional Classifiers (86% → 4.4%); Promptfoo blog (94% → 71% on GPT-4o → GPT-4.1); LlamaFirewall PromptGuard 2 (97.5% recall, 1% FPR); Bullen talk (Llama 3.3 70B 6.7%, Claude 3.7 Sonnet 1.5%)
Independent corroboration: AgentDojo (NeurIPS / arXiv 2406.13352) — independent prompt-injection benchmark, 97 tasks / 629 security cases; best agents <25% attack success; tool filtering drops this to 7.5%. InjecAgent (arXiv 2403.02691) — indirect-prompt-injection benchmark; ReAct GPT-4 vulnerable in 24% of cases. WASP (arXiv 2504.18575) — web-agent security benchmark. Trendyol/Medium — independent demonstration that PromptGuard 2 is bypassable. Constitutional Classifiers (arXiv 2501.18837, with a follow-up at arXiv 2601.04603) — 3,000+ red-team hours; v2 shows a 40× compute reduction and a 0.05% refusal rate.
Verdict: Corroborated, with a caveat. Vendor self-eval numbers are confirmed in their own papers, but AgentDojo is the cleanest independent comparator (Meta’s own evaluation uses it: PromptGuard 2 takes ASR from 17.6% to 7.5%, and to 1.75% combined with AlignmentCheck). The wiki should distinguish “vendor self-reported recall” from “independent benchmark ASR.” The Bullen-talk-specific 6.7%/1.5% figures don’t appear in any independent benchmark — flag them as anecdotal.
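
The recall-vs-ASR distinction the verdict asks the wiki to draw can be made concrete: recall is a property of the detector alone, while attack success rate is an end-to-end property of the whole agent pipeline. A minimal sketch with hypothetical confusion-matrix counts (none of these numbers come from the cited papers):

```python
def recall(true_positives: int, false_negatives: int) -> float:
    """Fraction of injection attempts the detector flags."""
    return true_positives / (true_positives + false_negatives)

def attack_success_rate(successful_attacks: int, total_attacks: int) -> float:
    """Fraction of attacks that achieve their goal end-to-end."""
    return successful_attacks / total_attacks

# Hypothetical: a detector with 97.5% recall still leaves a nonzero
# ASR; in the worst case every missed attack succeeds downstream.
print(recall(975, 25))                # 0.975 (self-reported style metric)
print(attack_success_rate(25, 1000))  # 0.025 (benchmark-style metric)
```

A 97.5%-recall headline and a 2.5% ASR can describe the same system under worst-case assumptions, which is why the two figures should never be cited interchangeably.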

Claim 6 — Cost of AI security breaches ($10.22M US average)

Wiki citations today: IBM 2025 Cost of a Data Breach Report
Independent corroboration: WEF Global Cybersecurity Outlook 2026 — 87% identify AI vulnerabilities as the fastest-growing cyber risk; 94% say AI is the dominant change driver; ~33% lack any AI pre-deployment security review. Verizon DBIR 2025 — 30% third-party involvement (2× YoY); 72% of AI-tool users sign in with personal email. ENISA Threat Landscape 2025 — 4,875 EU incidents (Jul 2024–Jun 2025). IBM 2025 (full report) — shadow-AI breaches carried a $670K cost premium; 13% had AI breaches; 97% lacked AI access controls; 63% have no AI governance policy.
Verdict: Well-corroborated. Add WEF GCO 2026 and ENISA Threat Landscape explicitly so the cost narrative does not rest on an IBM single citation.

Claim 7 — HITL fatigue / approval rubber-stamping

Wiki citations today: Bullen talk; Changkun blog post
Independent corroboration: CSET, “AI Safety and Automation Bias” (Nov 2024) — government think-tank issue brief on automation bias as an AI safety failure mode. ACM Computing Surveys, “Alert Fatigue in Security Operations Centres” — the peer-reviewed canonical SOC alert-fatigue citation. MDPI Computers (2024), “Automation Bias and Complacency in SOCs” — peer-reviewed empirical work. Springer AI & Society (2025), “Exploring automation bias in human–AI collaboration” — systematic review of 35 peer-reviewed studies (2015–April 2025). EDPS TechDispatch #2/2025 — EU regulator perspective. Bansal et al. — explanations can increase over-reliance, contradicting the “explainable AI → less rubber-stamping” assumption.
Verdict: Previously single-sourced; now well-corroborated. This is the cleanest “vendor/blog → academic” upgrade in the audit. Replace Bullen + Changkun with ACM Computing Surveys + MDPI + the Springer 2025 systematic review. CSET adds the government-policy framing.

Claim 8 — AI safety scheming / deception

Wiki citations today: Apollo Research; OpenAI/Apollo joint paper
Independent corroboration: Apollo, “Frontier Models are Capable of In-context Scheming” (arXiv 2412.04984) — o1 maintains deception in >85% of follow-ups; tests 5 frontier models. Anthropic + Redwood, “Sleeper Agents” (arXiv 2401.05566) — backdoor persistence through safety training; foundational. Anthropic + Redwood, “Alignment Faking in LLMs” (arXiv 2412.14093) — first empirical evidence of unprompted alignment faking. Apollo, “Stress Testing Deliberative Alignment for Anti-Scheming Training” — a newer follow-up the wiki should track. UK AISI Frontier Trends 2025 — government corroboration.
Verdict: Partially corroborated, with a venue caveat. Most of these papers are arXiv preprints from labs (Apollo, Anthropic, Redwood, OpenAI) — strong methodologies, but no peer-reviewed NeurIPS/ICML/ICLR paper has independently replicated the o1 “>85% deception persistence” figure as of mid-2026. The wiki should explicitly label them “lab evaluation, peer review pending, independent replication outstanding.”

Summary table

| Claim | Verdict |
| --- | --- |
| 1. NHI scale (82:1) | Partially corroborated — direction confirmed; exact figure CyberArk-specific |
| 2. AI agent adoption (75%/78%) | Corroborated — replace Knostic with Microsoft WTI 2025 + Stanford HAI |
| 3. 8-month cyber-task doubling | Corroborated — METR is the methodological foundation under UK AISI |
| 4. MCP CVE percentages (82% / 66%) | Contested — academic surveys report different denominators |
| 5. Prompt-injection detection rates | Corroborated for vendor self-eval; AgentDojo is the independent comparator |
| 6. AI breach cost ($10.22M) | Well-corroborated — add WEF GCO and ENISA |
| 7. HITL fatigue | Previously single-sourced; now well-corroborated — clean academic upgrade |
| 8. AI scheming / deception | Partially corroborated — lab self-reported, peer review pending |

Top 5 sources to add to the citation rotation

| Source | URL | Why it matters |
| --- | --- | --- |
| Stanford HAI AI Index Report 2025 | hai.stanford.edu/ai-index/2025-ai-index-report | Annual academic-grade synthesis covering adoption, capability benchmarks, agent eval (RE-Bench), investment, policy. Recurs across Claims 2, 3, 8 |
| Verizon Data Breach Investigations Report 2025 | verizon.com/business/resources/reports/dbir | Industry-standard incidence data with disclosed methodology. Recurs across Claims 1, 2, 6 |
| METR, “Measuring AI Ability to Complete Long Tasks” | arxiv.org/abs/2503.14499 + Time Horizon 1.1 | Methodological foundation under UK AISI’s 8-month cyber-doubling figure. Independent, transparent, ongoing. Recurs across Claims 3, 5 |
| WEF Global Cybersecurity Outlook 2026 | reports.weforum.org/docs/WEF_Global_Cybersecurity_Outlook_2026.pdf | Senior-leader survey with disclosed methodology; complements IBM cost framing. Recurs across Claims 2, 6 |
| AgentDojo | arxiv.org/abs/2406.13352 | Independent peer-reviewed (NeurIPS) prompt-injection benchmark used by Meta. The wiki currently has zero independent benchmarks for Claim 5. Recurs across Claims 4, 5 |

Sources to flag as vendor-conflicted

| Source | Conflict |
| --- | --- |
| CyberArk State of Machine Identity Security | CyberArk sells machine-identity products; the 82:1 ratio is from a Vanson Bourne survey CyberArk commissioned |
| SailPoint Horizons / Machine Identity Crisis | SailPoint sells identity governance; useful for directional corroboration only |
| Knostic blog (75%/78% adoption) | Knostic is a data-loss vendor; figures are downstream of Microsoft WTI — replace with Microsoft directly |
| Salesforce Connectivity Benchmark | Salesforce sells Agentforce; the 67% multi-agent surge directly markets their product |
| Promptfoo blog (regression numbers) | Promptfoo (now part of OpenAI) sells eval tooling; ASR figures are model-marketing-adjacent |

Genuinely single-sourced after this pass

Three claims remain effectively single-sourced or contested:

  1. MCP CVE percentages (Claim 4) — vendor-blog origin; academic surveys report different denominators. Action: re-derive from peer-reviewed denominators, or downgrade to “vendor-source-derived.”
  2. Specific scheming rates (Claim 8) — lab self-reported; no peer-reviewed independent replication. Action: keep claims, label “peer review pending.”
  3. Bullen-talk-specific ASR figures (Claim 5) — the 6.7% (Llama 3.3 70B) and 1.5% (Claude 3.7 Sonnet) figures don’t appear in any independent benchmark. Action: demote to anecdotal observation or re-derive on a public benchmark.

Effect on the wiki

This audit does NOT change RA or CMM level criteria. It changes the defensibility of citations that support those criteria. Specifically:

  • D1 L4/L5 evidence (cost/risk metrics) gains WEF GCO 2026 + ENISA alongside IBM
  • D2 L3+ evidence (NHI scale and lifecycle) gains Verizon DBIR + SailPoint Horizons alongside CyberArk
  • D7 L4 evidence (red-team eval) gains AgentDojo as the independent third-party benchmark to anchor against vendor self-eval
  • D9 L4 evidence (HITL fatigue / human factors) shifts from talk + blog to ACM CS / MDPI / Springer 2025 academic foundation
  • Threat Classes 2026 Class 2 (APT campaigns) gains METR as the methodological foundation under UK AISI
  • Shadow AI / NHI pages should swap Knostic adoption-figure citation for Microsoft Work Trend Index 2025 directly

See Also