Post-Deployment Adversarial Testing Cadence

Schedule and execute recurring adversarial testing of production AI systems on a risk-tiered cadence, separate from and in addition to pre-deployment red-teaming.

Objective

Detect new attack surfaces, capability changes, and safety regressions that emerge after deployment — including those introduced by model updates, prompt changes, new user populations, and novel attack techniques discovered after go-live.

Maturity Levels

Initial

Adversarial testing occurs only pre-deployment; production systems are not retested.

Developing

Ad hoc adversarial testing occurs when incidents prompt concern; no scheduled cadence exists.

Defined

A post-deployment adversarial testing schedule is defined, tiered by system risk level; test plans are documented and results are recorded.

Managed

Test results are compared against prior runs to track safety trajectory; regressions trigger remediation before the next release cycle.

Optimizing

Adversarial test suites update continuously with newly published attack techniques; automated regression tests run on every model update in production.
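
The run-over-run comparison described at the Managed level could be sketched roughly as below. This is a minimal illustration, not a prescribed method: the `CycleResult` schema, the category names, the rates, and the `tolerance` parameter are all hypothetical, and it assumes each cycle records an attack success rate per category.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class CycleResult:
    """Results of one test cycle (hypothetical schema)."""
    cycle: str
    # category -> fraction of attack attempts that succeeded (lower is better)
    success_rate: Dict[str, float]


def find_regressions(
    prior: CycleResult, current: CycleResult, tolerance: float = 0.0
) -> Dict[str, Tuple[float, float]]:
    """Return categories whose attack success rate rose by more than `tolerance`."""
    regressions = {}
    for category, rate in current.success_rate.items():
        baseline = prior.success_rate.get(category)
        if baseline is None:
            continue  # first time this category is tested: no baseline yet
        if rate - baseline > tolerance:
            regressions[category] = (baseline, rate)
    return regressions


# Illustrative numbers only, not real test data.
q1 = CycleResult("Q1", {"prompt_injection": 0.02, "jailbreak": 0.05})
q2 = CycleResult(
    "Q2",
    {"prompt_injection": 0.02, "jailbreak": 0.10, "output_manipulation": 0.01},
)
regressions = find_regressions(q1, q2, tolerance=0.01)
```

Categories tested for the first time are skipped rather than flagged, since a regression needs a prior result to compare against; they would be reported as new findings instead.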

Evidence Requirements

What an auditor or assessor would expect to see for this control.

- Post-deployment adversarial testing schedule with risk-tiered cadence documentation
- Test plans for each production system showing attack categories covered
- Executed test reports with findings, severity ratings, and remediation records
- Comparison reports showing safety trajectory across test cycles
- Evidence that test suites are updated with newly published attack techniques

Implementation Notes

Key steps

Establish a testing cadence tied to risk tier. At minimum, test Critical systems quarterly, Significant systems semi-annually, and Limited systems annually.
Separate post-deployment test suites from pre-deployment suites — post-deployment testing should include attack patterns that have emerged since initial deployment.
Include in each test: prompt injection, jailbreak attempts using techniques published since last test, data exfiltration attempts, and output manipulation targeting your specific use case.
For high-risk sectors (healthcare, financial services), include sector-specific adversarial scenarios relevant to your regulatory environment.
Treat adversarial test reports as board-reportable for Critical systems — if a new attack pattern succeeds against a production system, that is a governance event.
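
The tiered cadence in the first step could be tracked with something like the sketch below. It assumes each tier's cadence is expressed as a maximum number of days between tests; the day counts, system names, and dates are hypothetical, and an organization may prefer to anchor tests to calendar quarters instead.

```python
from datetime import date, timedelta

# Assumption: cadence is modeled as a maximum gap in days between tests.
MAX_DAYS_BETWEEN_TESTS = {
    "Critical": 92,      # quarterly
    "Significant": 183,  # semi-annually
    "Limited": 366,      # annually
}


def next_due(last_test: date, tier: str) -> date:
    """Latest date by which the next adversarial test should run."""
    return last_test + timedelta(days=MAX_DAYS_BETWEEN_TESTS[tier])


def is_overdue(last_test: date, tier: str, today: date) -> bool:
    """True when a system has gone longer without a test than its tier allows."""
    return today > next_due(last_test, tier)


# Hypothetical inventory: system name -> (risk tier, date of last test)
inventory = {
    "support-assistant": ("Critical", date(2026, 1, 1)),
    "internal-search": ("Limited", date(2025, 9, 15)),
}
today = date(2026, 5, 1)
overdue = [
    name
    for name, (tier, last_test) in inventory.items()
    if is_overdue(last_test, tier, today)
]
```

Keeping the tier-to-interval mapping in one table makes the schedule itself easy to present as audit evidence alongside the test reports.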

Example Implementation

Customer-facing AI assistant for a financial services firm

Post-Deployment Adversarial Test Report — Q2 2026

System: Customer Service AI Assistant (Critical tier)
Test date: 2026-05-28 | Prior test: 2026-02-19

New attack techniques tested since last cycle:

Many-shot jailbreaking (published Feb 2026): PASSED (attack blocked by guardrails)
Crescendo multi-turn escalation: PARTIAL PASS (escalation blocked at turn 4, not turn 2 as expected)
Financial advice boundary probing (sector-specific): PASSED

Regressions vs. prior test: None identified.
New finding: Crescendo resistance is weaker than expected (escalation blocked at turn 4 rather than turn 2); guardrail threshold requires tuning.
Remediation: Guardrail update deployed 2026-05-30. Retest scheduled 2026-06-05.
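
One way to keep result labels unambiguous in a report like this is to record outcomes from the system's perspective and derive them from measurements. The sketch below is illustrative only: the `Outcome` values, the `MultiTurnResult` fields, and the classification rule are assumptions, not part of the control.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Outcome(Enum):
    PASSED = "attack blocked as expected"
    PARTIAL_PASS = "attack blocked later than expected"
    FAILED = "attack not blocked"


@dataclass(frozen=True)
class MultiTurnResult:
    """Result of one multi-turn escalation test (hypothetical schema)."""
    technique: str
    expected_block_turn: int
    observed_block_turn: Optional[int]  # None means the attack was never blocked


def classify(result: MultiTurnResult) -> Outcome:
    """Derive the outcome from the turn at which the guardrails intervened."""
    if result.observed_block_turn is None:
        return Outcome.FAILED
    if result.observed_block_turn <= result.expected_block_turn:
        return Outcome.PASSED
    return Outcome.PARTIAL_PASS


# The Crescendo finding from the report above: expected turn 2, observed turn 4.
crescendo = MultiTurnResult("Crescendo multi-turn escalation", 2, 4)
crescendo_outcome = classify(crescendo)
```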

Control Details

Control ID: SAF-006
Domain: Safety & Reliability
Typical owner: Security / AI Red Team
Implementation effort: High
Agent-relevant: Yes
