Agent Knowledge Source Integrity

Validate that documents, databases, and external sources retrieved by AI agents during task execution have not been tampered with, poisoned, or substituted with adversarial content.

Objective

Prevent agents from acting on manipulated or unauthorized information retrieved from knowledge bases, vector stores, the web, or third-party document sources.

Maturity Levels

Initial

Agents retrieve from any available source without validation; no controls exist on retrieval inputs.

Developing

Retrieval sources are informally restricted but not validated for integrity; no poisoning detection is in place.

Defined

Approved source lists are defined per agent; retrieved documents must clear a provenance check before they reach the model.

Managed

Retrieval logs are audited periodically; anomalous retrieval patterns trigger investigation.

Optimizing

Automated integrity checks validate source hashes against known-good baselines; adversarial content patterns are detected and blocked in real time.

Evidence Requirements

What an auditor or assessor would expect to see for this control.

- Approved source registry per agent listing permitted retrieval sources with documented business justification and review date
- Provenance check records showing retrieved documents were validated against known-good baselines before being passed to the model
- Retrieval anomaly detection logs and investigation records for flagged retrieval events within a sample period
- Periodic integrity audit results confirming knowledge base contents against version-controlled reference copies
- Access control documentation showing who can modify knowledge base contents and under what approval process

Implementation Notes

Key steps

Maintain an approved source registry per agent: which URLs, document stores, vector databases, and APIs it may retrieve from; deny all unlisted sources.
Hash and version critical knowledge base documents; detect substitution by comparing retrieved content against known-good hashes before the content is passed to the model.
Treat retrieved content as untrusted input: apply the same scrutiny you would apply to user-submitted data, and never grant it the trust level of a system prompt.
Monitor retrieval volume and source patterns; a sudden shift in which documents are retrieved most often, or retrieval from new sources, can signal poisoning.
For web-retrieval agents, restrict permitted domains, validate TLS certificates, and never follow redirects (open redirects included) to endpoints that have not been validated.
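
The monitoring step above can be illustrated with a small comparison of two retrieval-log windows. This is a minimal sketch, not a prescribed detector: the function name `retrieval_drift`, the 20-percentage-point `share_threshold`, and the sample source IDs are all assumptions made for illustration.

```python
from collections import Counter


def retrieval_drift(baseline, current, share_threshold=0.2):
    """Compare two windows of retrieval logs (lists of source IDs).

    Returns (new_sources, shifted): sources absent from the baseline window,
    and sources whose share of all retrievals moved by more than
    share_threshold (an assumed, tunable value).
    """
    base_counts, cur_counts = Counter(baseline), Counter(current)
    base_total = sum(base_counts.values()) or 1
    cur_total = sum(cur_counts.values()) or 1

    new_sources = sorted(set(cur_counts) - set(base_counts))
    shifted = {}
    for source in set(base_counts) | set(cur_counts):
        delta = cur_counts[source] / cur_total - base_counts[source] / base_total
        if abs(delta) > share_threshold:
            shifted[source] = round(delta, 3)
    return new_sources, shifted


# Hypothetical log windows: "doc-x" appears only in the current window.
baseline = ["doc-a"] * 50 + ["doc-b"] * 50
current = ["doc-a"] * 20 + ["doc-b"] * 40 + ["doc-x"] * 40
new_sources, shifted = retrieval_drift(baseline, current)
```

Here `new_sources` lists the previously unseen source and `shifted` holds the sources whose retrieval share changed sharply; either result would be a reason to investigate, not proof of poisoning.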

Example Implementation

Legal team using a RAG agent to draft compliance summaries from regulatory documents

Knowledge Source Registry — Regulatory Compliance Agent

Permitted sources:

Source	Type	Integrity Check	Update Process
Internal regulatory library (SharePoint)	Document store	SHA-256 hash on ingest; re-verified on retrieval	Change-controlled; requires legal team approval
EUR-Lex official portal	Web retrieval	Domain allowlist only; certificate pinned	Automated nightly refresh with diff alert
NIST CSRC publications	Web retrieval	Domain allowlist only	Automated nightly refresh with diff alert

Denied by default: All other URLs, document stores, and APIs.

Anomaly alert: Any retrieval from a source not in the registry triggers an immediate alert to AI Engineering and is blocked.
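
The deny-by-default rule and the anomaly alert could be enforced at the retrieval layer roughly as follows. This is a sketch under stated assumptions: `SourceRegistry` and its methods are hypothetical names, the alert is a simple in-memory list standing in for whatever notifies AI Engineering, and the hostnames are the public hosts of the two web sources in the table, used here only as illustrative allowlist entries.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit


@dataclass
class SourceRegistry:
    """Per-agent allowlist; anything not listed is denied."""
    allowed_hosts: frozenset
    alerts: list = field(default_factory=list)

    def is_permitted(self, url: str) -> bool:
        parts = urlsplit(url)
        # Require HTTPS and an exact host match; no wildcard or suffix matching.
        return parts.scheme == "https" and parts.hostname in self.allowed_hosts

    def check_chain(self, urls: list) -> bool:
        """Validate the requested URL and every redirect hop that follows it."""
        for url in urls:
            if not self.is_permitted(url):
                # Stand-in for the alert to AI Engineering; retrieval is blocked.
                self.alerts.append(f"blocked retrieval: {url}")
                return False
        return True


# Assumed allowlist entries for the two web sources in the table above.
registry = SourceRegistry(frozenset({"eur-lex.europa.eu", "csrc.nist.gov"}))
```

Checking each redirect hop, not only the first URL, is what keeps a permitted domain from being used as a springboard to an unlisted one.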

Hash validation failure: If a retrieved document's hash does not match the ingest-time hash, retrieval is rejected and the discrepancy is logged for investigation.
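
A minimal sketch of the hash check described above, assuming ingest-time SHA-256 digests are kept in a dictionary keyed by document ID. The names `verify_retrieved`, `IntegrityError`, and the `baseline` mapping are hypothetical; a real deployment would read the baseline from a change-controlled store and write the failure to its own log.

```python
import hashlib
import hmac


class IntegrityError(Exception):
    """Raised when retrieved content does not match its ingest-time hash."""


def sha256_hex(content: bytes) -> str:
    """Return the SHA-256 digest of content as a hex string."""
    return hashlib.sha256(content).hexdigest()


def verify_retrieved(doc_id: str, content: bytes, baseline: dict) -> bytes:
    """Return content only if its hash matches the recorded baseline."""
    expected = baseline.get(doc_id)
    if expected is None:
        # Unknown documents are rejected instead of being trusted by default.
        raise IntegrityError(f"{doc_id}: no ingest-time hash on record")
    if not hmac.compare_digest(sha256_hex(content), expected):
        raise IntegrityError(f"{doc_id}: hash mismatch")
    return content


# Hypothetical baseline recorded at ingest time.
baseline = {"reg-001": sha256_hex(b"original regulatory text")}
verified = verify_retrieved("reg-001", b"original regulatory text", baseline)
```

Raising an exception, in place of returning a flag, makes it harder for calling code to pass unverified content to the model by accident.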

Review cadence: Source registry reviewed quarterly or after any security incident involving the agent.

Control Details

Control ID: AGT-010
Domain: Agentic AI
Typical owner: AI Engineering / Security / Data Governance
Implementation effort: Medium
Agent-relevant: Yes
