Source Triangulation Audit — 2026-05-02
Closes peer-review-readiness honorable-mention #1: “Statistics drawn from a narrow source set — Gartner, Insight Partners, Knostic, OWASP — same handful. No academic literature or industry surveys beyond these.” This pass triangulates each load-bearing claim against academic and government-survey sources, with an explicit verdict on what is corroborated, what is contested, and what remains single-sourced.
Headline result
Of the 8 audited load-bearing claims, 5 are corroborated by independent academic or government-survey sources after this pass, 2 are partially corroborated (direction confirmed, exact figures single-sourced), and 1 is contested. Three specific figures remain genuinely single-sourced and should be labeled as such in the wiki. Five new sources (Stanford HAI AI Index, Verizon DBIR, METR, WEF Global Cybersecurity Outlook, AgentDojo) join the wiki’s citation rotation.
Per-claim triangulation
Claim 1 — NHI scale and growth (82:1 ratio, 400% growth)
| Wiki citations today | CyberArk (82:1), Rubrik Zero Labs (45:1), arXiv 2503.18255 (50K → 250K, 2021–2025) |
|---|---|
| Independent corroboration | Verizon DBIR 2025 — third-party breach involvement doubled YoY (15% → 30%) driven by ungoverned machine accounts; SailPoint Horizons of Identity Security 2025–2026 — 69% of orgs have more machine than human identities, and roughly half run 10× more; GitGuardian/Verizon analysis — 441,780 exposed secrets in public repos, 39% from web-app infrastructure; ENISA Threat Landscape 2025 — confirms identity/credential acceleration. |
| Verdict | Partially corroborated. Direction (NHI ≫ human, growing fast) is confirmed across 5+ independent sources. The exact 82:1 figure is single-vendor (CyberArk Vanson Bourne, n=2,600). The wiki should cite the directional claim with multi-source backing and label “82:1” as CyberArk-specific; a quick arithmetic check follows the table. |
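The “400% growth” figure is the arXiv 50K → 250K trajectory restated; a minimal arithmetic check, using only values from the sources cited above (variable names are illustrative):

```python
# Quick check on the Claim 1 figures. All input values come from the sources
# cited above; the variable names themselves are illustrative.

# arXiv 2503.18255 trajectory: ~50K NHIs in 2021 -> ~250K in 2025.
nhi_2021, nhi_2025 = 50_000, 250_000
growth_pct = (nhi_2025 - nhi_2021) / nhi_2021 * 100        # 400.0 -> the "400%" figure
cagr = (nhi_2025 / nhi_2021) ** (1 / (2025 - 2021)) - 1    # ~0.495, i.e. ~50%/yr

# Vendor-reported NHI-to-human ratios: directionally consistent, numerically
# far apart -- which is why the wiki should cite the direction, not 82:1.
ratios = {"CyberArk (Vanson Bourne, n=2,600)": 82, "Rubrik Zero Labs": 45}

print(f"2021->2025 NHI growth: {growth_pct:.0f}% (CAGR ~{cagr:.0%})")
for source, ratio in ratios.items():
    print(f"{source}: {ratio}:1")
```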
Claim 2 — AI agent adoption rates and shadow AI
| Wiki citations today | Knostic (75% knowledge-worker GenAI use, 78% BYOAI) — vendor blog |
|---|---|
| Independent corroboration | Microsoft Work Trend Index 2025 — the actual primary source for the 75%/78% numbers Knostic cites; Stanford HAI AI Index 2025 — 78% business AI adoption (up from 55% in 2023); McKinsey State of AI Nov 2025 — 88% of orgs use AI in ≥1 function (n=1,993, 105 countries); BCG “Build for the Future 2025” — 72% regular use, only 13% have AI agents in production; IBM Cost of Data Breach 2025 — 20% had shadow-AI breaches, $670K cost premium. |
| Verdict | Corroborated, but the wiki is citing a downstream vendor when the original is a large-scale primary survey. Replace Knostic citations with Microsoft Work Trend Index 2025 directly; add Stanford HAI for the cross-year curve and BCG for the agentic-vs-GenAI gap (the 13%-vs-72% delta is load-bearing). |
Claim 3 — Agent-orchestrated attack capability scaling (8-month cyber-task doubling)
| Wiki citations today | UK AISI Frontier AI Trends Report; Anthropic GTG-1002 |
|---|---|
| Independent corroboration | METR “Measuring AI Ability to Complete Long Tasks” (arXiv 2503.14499) — generalist task horizon doubles every ~7 months (2019–2025), accelerated to ~4 months in 2024–2025; this is the foundational methodology UK AISI’s 8-month cyber figure builds on; Apollo Research “More Capable Models Are Better at In-Context Scheming” — capability scaling correlates with scheming/deception; CSET “Autonomous Cyber Defense” — government think-tank assessment. |
| Verdict | Corroborated. Strong triangulation — METR (peer-reviewable methodology), UK AISI (government measurement), Apollo (capability-scheming correlation). Wiki should add METR as the methodological foundation rather than citing UK AISI standalone; the horizon model behind the doubling figures is sketched after the table. |
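For reference, METR’s headline comes from an exponential fit to task horizon. A minimal sketch of that model, assuming an illustrative 60-minute baseline horizon at the start of 2025; only the doubling periods come from METR / UK AISI:

```python
from datetime import date

def task_horizon(t: date, t0: date, h0_minutes: float, doubling_months: float) -> float:
    """Exponential horizon model: h(t) = h0 * 2^(months_elapsed / doubling_months)."""
    months = (t.year - t0.year) * 12 + (t.month - t0.month)
    return h0_minutes * 2 ** (months / doubling_months)

t0 = date(2025, 1, 1)  # illustrative baseline date and 60-minute horizon
for label, doubling in [("generalist (~7 mo, METR)", 7.0), ("cyber (~8 mo, UK AISI)", 8.0)]:
    h = task_horizon(date(2027, 1, 1), t0, 60.0, doubling)
    print(f"{label}: ~{h / 60:.1f}h horizon after 2 years")
```

The point is the compounding: a one-month difference in doubling period already separates the two curves by roughly a third within two years (~10.8h vs ~8.0h from the assumed 60-minute start).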
Claim 4 — MCP CVE explosion (30+ CVEs in 60 days, 82% path traversal, 66% code injection)
| Wiki citations today | Single-source claim; origin unspecified, but the phrasing suggests a vendor blog |
|---|---|
| Independent corroboration | arXiv 2503.23278 — “MCP: Landscape, Security Threats, and Future Research Directions” — peer-reviewed (ACM TOSEM); 4-phase / 16-activity threat taxonomy. arXiv 2506.13538 — “MCP at First Glance” — empirical study of 1,899 OSS MCP servers, 66% have code smells, 14.4% bug patterns. arXiv 2510.16558 — “Toward Understanding Security Issues in the MCP Ecosystem” — 67,057 servers across 6 registries. VulnerableMCP database (vulnerablemcp.info). eSentire reports ~22% path traversal in tested servers (different from wiki’s 82%). |
| Verdict | Contested. The existence of the MCP CVE wave is well-corroborated; the specific percentages do not match peer-reviewed denominators. The wiki’s “66% code injection” looks like it may be a misread of the arXiv “66% code smells” figure on a different sample. Action: re-derive percentages from arXiv 2503.23278 / 2506.13538 / 2510.16558 / VulnerableMCP, or label as “[vendor-source]-derived; academic surveys report different denominators.” The denominator mismatch is made concrete below the table. |
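The two “66%” figures sit over incompatible denominators; a sketch, where the code-smell count is back-derived from arXiv 2506.13538 and the CVE split is hypothetical:

```python
def pct(count: int, denominator: int) -> float:
    return 100 * count / denominator

# arXiv 2506.13538: 66% of 1,899 OSS MCP servers show code smells.
servers_total = 1_899
servers_with_smells = 1_253          # back-derived: ~66% of 1,899
print(f"code smells: {pct(servers_with_smells, servers_total):.0f}% of servers")

# The wiki's "66% code injection" would be a share of the 30+ CVE set, a
# different and much smaller population. A hypothetical 20-of-30 split:
print(f"code injection: {pct(20, 30):.0f}% of CVEs (hypothetical)")
# Near-identical headline numbers, incompatible populations -- hence the
# "re-derive or relabel" action above.
```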
Claim 5 — Prompt injection / detection rates
| Wiki citations today | Anthropic Constitutional Classifiers (86%→4.4%); Promptfoo blog (94%→71% on GPT-4o → GPT-4.1); LlamaFirewall PromptGuard 2 (97.5% recall, 1% FPR); Bullen talk (Llama 3.3 70B 6.7%, Claude 3.7 Sonnet 1.5%) |
|---|---|
| Independent corroboration | AgentDojo (NeurIPS / arXiv 2406.13352) — independent prompt-injection benchmark, 97 tasks / 629 security cases; best agents <25% attack success; tool-filtering drops to 7.5%. InjecAgent (arXiv 2403.02691) — indirect-prompt-injection benchmark; ReAct GPT-4 vulnerable in 24% of cases. WASP (arXiv 2504.18575) — web-agent security benchmark. Trendyol/Medium — independent demonstration that PromptGuard 2 is bypassable. Constitutional Classifiers (arXiv 2501.18837, with a follow-up at arXiv 2601.04603) — 3,000+ red-team hours; v2 has 40× compute reduction, 0.05% refusal rate. |
| Verdict | Corroborated, with caveat. Vendor self-eval numbers are confirmed in their own papers, but AgentDojo is the cleanest independent comparator (Meta’s own evaluation uses it: PromptGuard 2 cuts ASR from 17.6% to 7.5%, and to 1.75% when combined with AlignmentCheck). Wiki should distinguish “vendor self-reported recall” from “independent benchmark ASR”; the distinction is sketched below the table. Bullen-talk-specific 6.7%/1.5% figures don’t appear in independent benchmarks — flag as anecdotal. |
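The recall-vs-ASR split is just two different denominators over two different event sets; a minimal sketch with illustrative counts, anchored only to the 97.5% and 7.5% reference points cited above:

```python
def recall(detected: int, total_attacks: int) -> float:
    """Detector-centric: share of known injection attempts the classifier flags."""
    return detected / total_attacks

def attack_success_rate(successful: int, attempted: int) -> float:
    """System-centric: share of end-to-end attempts that actually hijack the agent."""
    return successful / attempted

# A guard with 97.5% self-reported recall (PromptGuard 2's figure) can still sit
# inside a pipeline with nonzero residual ASR -- Meta's AgentDojo run reports
# 7.5% ASR with the guard enabled. The counts below are illustrative.
print(f"vendor self-eval recall:   {recall(975, 1_000):.1%}")
print(f"independent benchmark ASR: {attack_success_rate(75, 1_000):.1%}")
```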
Claim 6 — Cost of AI security breaches ($10.22M US average)
| Wiki citations today | IBM 2025 Cost of a Data Breach Report |
|---|---|
| Independent corroboration | WEF Global Cybersecurity Outlook 2026 — 87% identify AI vulnerabilities as fastest-growing cyber risk; 94% say AI is dominant change driver; ~33% lack any AI pre-deployment security review. Verizon DBIR 2025 — 30% third-party involvement (2× YoY); 72% of AI-tool users sign in with personal email. ENISA Threat Landscape 2025 — 4,875 EU incidents (Jul 2024–Jun 2025). IBM 2025 (full report) — shadow-AI breaches carry a $670K cost premium; 13% had AI breaches; 97% lacked AI access controls; 63% have no AI governance policy. |
| Verdict | Well-corroborated. Add WEF GCO 2026 and ENISA Threat Landscape explicitly so the cost narrative does not rest on IBM alone; a back-of-envelope on the IBM figures follows the table. |
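A back-of-envelope on how the IBM figures relate (input values as cited above; the expected-premium framing is illustrative):

```python
us_avg_breach_cost = 10_220_000   # IBM 2025: $10.22M US average breach cost
shadow_ai_premium  = 670_000      # IBM 2025: added cost when shadow AI is involved
shadow_ai_rate     = 0.20         # IBM 2025: 20% of orgs report shadow-AI breaches

# Premium as a share of the US average, and the expected premium across a
# portfolio of breached orgs at the reported shadow-AI rate:
print(f"premium vs US average: {shadow_ai_premium / us_avg_breach_cost:.1%}")  # ~6.6%
print(f"expected premium/org:  ${shadow_ai_rate * shadow_ai_premium:,.0f}")    # $134,000
```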
Claim 7 — HITL fatigue / approval rubber-stamping
| Wiki citations today | Bullen talk; Changkun blog post |
|---|---|
| Independent corroboration | CSET “AI Safety and Automation Bias” (Nov 2024) — government think-tank issue brief on automation bias as AI safety failure mode. ACM Computing Surveys, “Alert Fatigue in Security Operations Centres” — peer-reviewed canonical SOC alert-fatigue citation. MDPI Computers (2024), “Automation Bias and Complacency in SOCs” — peer-reviewed empirical work. Springer AI & Society (2025), “Exploring automation bias in human–AI collaboration” — systematic review of 35 peer-reviewed studies (2015–April 2025). EDPS TechDispatch #2/2025 — EU regulator perspective. Bansal et al. — finding that explanations can increase over-reliance, contradicting the “explainable AI → less rubber-stamping” assumption. |
| Verdict | Previously single-sourced; now well-corroborated. Cleanest “vendor/blog → academic” upgrade in this audit. Replace Bullen + Changkun with ACM Computing Surveys + MDPI + Springer 2025 systematic review. CSET adds the government-policy framing. |
Claim 8 — AI safety scheming / deception
| Wiki citations today | Apollo Research; OpenAI/Apollo joint paper |
|---|---|
| Independent corroboration | Apollo “Frontier Models are Capable of In-context Scheming” (arXiv 2412.04984) — o1 maintains deception >85% across follow-ups; tests 5 frontier models. Anthropic + Redwood “Sleeper Agents” (arXiv 2401.05566) — backdoor persistence through safety training; foundational. Anthropic + Redwood “Alignment Faking in LLMs” (arXiv 2412.14093) — first empirical evidence of unprompted alignment faking. Apollo “Stress Testing Deliberative Alignment for Anti-Scheming Training” — newer follow-up the wiki should track. UK AISI Frontier Trends 2025 — government corroboration. |
| Verdict | Partially corroborated; venue caveat. Most papers are arXiv preprints from labs (Apollo, Anthropic, Redwood, OpenAI) — strong methodologies, but no peer-reviewed NeurIPS/ICML/ICLR paper has independently replicated the o1 “>85% deception persistence” figure as of mid-2026. Wiki should explicitly label “lab evaluation, peer-review pending, independent replication outstanding.” |
Summary table
| Claim | Verdict |
|---|---|
| 1. NHI scale (82:1) | Partially corroborated — direction confirmed; exact figure CyberArk-specific |
| 2. AI agent adoption (75%/78%) | Corroborated — replace Knostic with Microsoft WTI 2025 + Stanford HAI |
| 3. 8-month cyber-task doubling | Corroborated — METR is the methodological foundation under UK AISI |
| 4. MCP CVE percentages (82% / 66%) | Contested — academic surveys report different denominators |
| 5. Prompt-injection detection rates | Corroborated for vendor self-eval; AgentDojo is the independent comparator |
| 6. AI breach cost ($10.22M) | Well-corroborated — add WEF GCO and ENISA |
| 7. HITL fatigue | Single-sourced → well-corroborated — clean academic upgrade |
| 8. AI scheming / deception | Partially corroborated — lab self-reported, peer review pending |
Top 5 sources to add to the citation rotation
| Source | URL | Why it matters |
|---|---|---|
| Stanford HAI AI Index Report 2025 | hai.stanford.edu/ai-index/2025-ai-index-report | Annual academic-grade synthesis covering adoption, capability benchmarks, agent eval (RE-Bench), investment, policy. Recurs across Claims 2, 3, 8 |
| Verizon Data Breach Investigations Report 2025 | verizon.com/business/resources/reports/dbir | Industry-standard incidence data with disclosed methodology. Recurs across Claims 1, 2, 6 |
| METR “Measuring AI Ability to Complete Long Tasks” | arxiv.org/abs/2503.14499 + Time Horizon 1.1 | Methodological foundation under UK AISI’s 8-month cyber-doubling figure. Independent, transparent, ongoing. Recurs across Claims 3, 5 |
| WEF Global Cybersecurity Outlook 2026 | reports.weforum.org/docs/WEF_Global_Cybersecurity_Outlook_2026.pdf | Senior-leader survey with disclosed methodology; complements IBM cost framing. Recurs across Claims 2, 6 |
| AgentDojo | arxiv.org/abs/2406.13352 | Independent peer-reviewed (NeurIPS) prompt-injection benchmark used by Meta. The wiki currently has zero independent benchmarks for Claim 5. Recurs across Claims 4, 5 |
Sources to flag as vendor-conflicted
| Source | Conflict |
|---|---|
| CyberArk State of Machine Identity Security | CyberArk sells machine-identity products; the 82:1 ratio is from a Vanson Bourne survey CyberArk commissioned |
| SailPoint Horizons / Machine Identity Crisis | SailPoint sells identity governance; useful for directional corroboration only |
| Knostic blog (75%/78% adoption) | Knostic is a data-loss vendor; figures are downstream of Microsoft WTI — replace with Microsoft directly |
| Salesforce Connectivity Benchmark | Salesforce sells Agentforce; 67% multi-agent surge directly markets their product |
| Promptfoo blog (regression numbers) | Promptfoo (now part of OpenAI) sells eval tooling; ASR figures are model-marketing-adjacent |
Genuinely single-sourced after this pass
Three specific figures remain effectively single-sourced or contested:
- MCP CVE percentages (Claim 4) — vendor blog origin; academic surveys report different denominators. Action: re-derive from peer-reviewed denominators, or downgrade to “vendor-source-derived.”
- Specific scheming rates (Claim 8) — lab self-reported; no peer-reviewed independent replication. Action: keep claims, label “peer review pending.”
- Bullen-talk-specific ASR figures (Claim 5) — 6.7% Llama 3.3 70B / 1.5% Claude 3.7 Sonnet don’t appear in any independent benchmark. Action: demote to anecdotal observation or re-derive on a public benchmark.
Effect on the wiki
This audit does NOT change RA or CMM level criteria. It changes the defensibility of citations that support those criteria. Specifically:
- D1 L4/L5 evidence (cost/risk metrics) gains WEF GCO 2026 + ENISA alongside IBM
- D2 L3+ evidence (NHI scale and lifecycle) gains Verizon DBIR + SailPoint Horizons alongside CyberArk
- D7 L4 evidence (red-team eval) gains AgentDojo as the independent third-party benchmark to anchor against vendor self-eval
- D9 L4 evidence (HITL fatigue / human factors) shifts from talk + blog to ACM CS / MDPI / Springer 2025 academic foundation
- Threat Classes 2026 Class 2 (APT campaigns) gains METR as the methodological foundation under UK AISI
- Shadow AI / NHI pages should swap Knostic adoption-figure citation for Microsoft Work Trend Index 2025 directly
See Also
- Peer-Review Readiness — Gaps in the RA + CMM — origin of this audit (honorable-mention #1)
- Validation: Agentic AI CMM vs Widely Adopted Standards — sister audit, focused on standards rather than evidence sources
- Stanford HAI · METR · World Economic Forum · ENISA · AgentDojo — new sources added by this audit