Inference Exposure (and Retrieval Exposure)
Two paired AI-specific failure modes that bypass traditional file/network access controls. Together they describe a class of risk that does not exist in non-AI systems.
Inference Exposure
A user gains unauthorized insights from AI outputs without ever accessing the original documents.
The user cannot open the source files; their RBAC role does not permit it. But the AI was trained on those files, or retrieved or summarized them for some other purpose, and the user can ask questions whose answers depend on that material. The AI infers and exposes what the user could not access directly.
This mode does not exist in non-AI systems: there is no file-server analog where reading an authorized file leaks knowledge of an unauthorized one.
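A minimal sketch of the gap, with entirely invented data and helper names: the naive path retrieves from a shared index without consulting the asking user's ACL, so the generated answer can draw on documents the user could never open; the filtered variant applies the same ACL the file server would enforce, before the model sees anything.

```python
# Illustrative only: toy ACL, toy index, toy "LLM" that just joins retrieved text.
ACL = {"alice": {"handbook.pdf"}}          # alice may open the handbook, not the board memo
INDEX = [
    {"doc": "handbook.pdf",  "text": "PTO accrues at 1.5 days per month."},
    {"doc": "board_memo.pdf", "text": "Q3 layoffs will cut the Austin office."},
]

def search(question: str, top_k: int = 5):
    """Toy retrieval: return every chunk (a real system would rank by similarity)."""
    return INDEX[:top_k]

def naive_answer(user: str, question: str) -> str:
    chunks = search(question)                        # no per-user filter on the shared index
    return " / ".join(c["text"] for c in chunks)     # stand-in for the LLM call

def filtered_answer(user: str, question: str) -> str:
    chunks = [c for c in search(question) if c["doc"] in ACL.get(user, set())]
    return " / ".join(c["text"] for c in chunks)

print(naive_answer("alice", "What is happening in Q3?"))     # leaks the memo's contents
print(filtered_answer("alice", "What is happening in Q3?"))  # only material alice may open
```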
Retrieval Exposure
Content retrieved by the AI to craft an output is combined and disclosed beyond what the user is authorized to see.
A RAG pipeline pulls fragments from a corpus the user has some access to. The composition of those fragments — their joint meaning, the inferences enabled by their combination — exceeds the per-document access policy. The output is a derivative work that no individual permission grant authorized.
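A hedged illustration of the aggregation problem, using invented labels and a toy policy: each fragment passes the per-document check on its own, but a combination check over the whole retrieved set catches a pairing that the per-document policy never sees.

```python
# Illustrative only: fragments, labels, and the forbidden-combination rule are invented.
FRAGMENTS = [
    {"doc": "org_chart.xlsx",  "labels": {"personnel"},    "text": "Dana Reyes leads Platform."},
    {"doc": "comp_bands.xlsx", "labels": {"compensation"}, "text": "Platform leads: band L7, $210-240k."},
]

# Label pairs that must not appear together in a single answer for this role.
FORBIDDEN_COMBINATIONS = [{"personnel", "compensation"}]

def per_fragment_allowed(user_role: str, fragment: dict) -> bool:
    """Toy per-document check: analysts may read each source on its own."""
    return user_role == "analyst"

def combination_allowed(fragments: list[dict]) -> bool:
    combined = set().union(*(f["labels"] for f in fragments))
    return not any(bad <= combined for bad in FORBIDDEN_COMBINATIONS)

retrieved = [f for f in FRAGMENTS if per_fragment_allowed("analyst", f)]
print(len(retrieved))                  # 2 -- every fragment passes on its own
print(combination_allowed(retrieved))  # False -- the joint answer would attach a salary to a name
```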
Why traditional controls miss this
| Control | What it checks | Why it misses inference / retrieval exposure |
|---|---|---|
| File ACLs | Can the user open this specific file? | The user never opens the file. |
| Network firewall | Can this IP reach this service? | The retrieval is allowed; the exfiltration is via answer text. |
| DLP on file egress | Is sensitive content being uploaded/emailed? | The AI output is the user’s own answer, not a document copy. |
| RBAC on documents | Does the role permit this document type? | Each retrieved fragment may be permitted; the combination is not. |
What works
The mitigation is semantic boundary enforcement: evaluate the meaning of retrieved or generated content at answer time, not just the per-document permission at retrieval time. This is the design space that AI Usage Control occupies and that oversharing controls operationalize for AI-search products such as Microsoft Copilot, Glean, and Gemini.
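A minimal sketch of what an answer-time semantic check might look like. The keyword rules below are a toy stand-in for a real sensitivity labeler, and the clearance model is invented; the point is the placement of the check, on the draft answer itself, after generation and before delivery.

```python
# Illustrative only: toy clearance model and keyword "classifier".
USER_CLEARANCE = {"alice": {"general", "internal"}}

def classify_sensitivity(text: str) -> set[str]:
    """Toy labeler: keyword rules standing in for a trained sensitivity classifier."""
    labels = {"general"}
    if "layoff" in text.lower() or "acquisition" in text.lower():
        labels.add("restricted")
    return labels

def deliver(user: str, draft_answer: str) -> str:
    needed = classify_sensitivity(draft_answer)            # classify the answer, not the documents
    if not needed <= USER_CLEARANCE.get(user, set()):
        return "I can't share that information."           # or redact / route for approval
    return draft_answer

print(deliver("alice", "PTO accrues at 1.5 days per month."))
print(deliver("alice", "The acquisition of Acme closes next quarter."))
```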
Three concrete control patterns:
- Need-to-know enforcement at the knowledge layer. Combine RBAC + ABAC + sensitivity labels + real-time output filters. The check happens at answer time, after retrieval but before delivery to the user.
- Continuous authorization within a session. A one-time check at session start is insufficient — the conversation can drift across permission boundaries as it proceeds.
- Provenance and audit trail. Capture prompt + retrieved context + applied policies + model output on every turn, so post-incident investigation can reconstruct what the AI knew and what it disclosed (see the sketch after this list, which pairs this with continuous authorization).
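A sketch combining the last two patterns, with an illustrative (not standard) record shape: authorization is re-evaluated on every turn rather than once per session, and each turn appends a provenance record of prompt, retrieved context, applied policies, and output. The retrieval and output-policy helpers are stubs standing in for the mechanisms sketched above.

```python
import json
import time

AUDIT_LOG = []

def retrieve_permitted(user: str, prompt: str) -> list[dict]:
    """Stub: retrieval already filtered by the user's document ACL (see earlier sketch)."""
    return [{"doc": "handbook.pdf", "text": "PTO accrues at 1.5 days per month."}]

def apply_output_policy(user: str, draft: str) -> tuple[str, list[str]]:
    """Stub for the answer-time sensitivity check; returns the final text and applied policy names."""
    return draft, ["acl_filter", "answer_time_sensitivity_check"]

def answer_turn(user: str, prompt: str) -> str:
    chunks = retrieve_permitted(user, prompt)         # authorization re-evaluated on *this* turn
    draft = " / ".join(c["text"] for c in chunks)     # stand-in for the LLM call
    final, policies = apply_output_policy(user, draft)
    AUDIT_LOG.append({                                # provenance: prompt + context + policies + output
        "ts": time.time(),
        "user": user,
        "prompt": prompt,
        "retrieved_docs": [c["doc"] for c in chunks],
        "policies": policies,
        "output": final,
    })
    return final

answer_turn("alice", "Summarize our PTO policy.")
print(json.dumps(AUDIT_LOG, indent=2))
```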
Empirical evidence
- Multi-turn prompt-leakage rates rose from 17.7% to 86.2% under specific attack patterns (ACL 2024 systematic investigation).
- Defense techniques reduced prompt-extraction attack success by 83.8% on Llama2-7B and 71.0% on GPT-3.5 (arXiv 2408.02416); a non-trivial improvement, but not a solved problem.
- 48% of employees admit to uploading sensitive corporate data into public AI tools (cited in AI Data Security, Knostic blog, 2026), a primary contributor to retrieval exposure in shadow-AI deployments.
See Also
- Oversharing Controls for AI Search — the practice that operationalizes inference- and retrieval-exposure mitigation
- UCON for AI — UCON-based controls evaluated at answer time
- Three Retrieval Paths for Injection Payloads — vector/full-text/metadata; paths 2 and 3 are the practical inference-exposure risk vectors
- Indirect Prompt Injection — adjacent failure mode (untrusted text steers the agent’s retrieval/output)
- AI Data Security (Knostic blog, 2026) — primary source
- Security Controls for AI Stacks — data layer (D6 Data, Memory & RAG)