Post-Deployment Adversarial Testing Cadence
Schedule and execute recurring adversarial testing of production AI systems on a risk-tiered cadence, separate from and in addition to pre-deployment red-teaming.
Objective
Detect new attack surfaces, capability changes, and safety regressions that emerge after deployment — including those introduced by model updates, prompt changes, new user populations, and novel attack techniques discovered after go-live.
Maturity Levels
Initial
Adversarial testing occurs only pre-deployment; production systems are not retested.
Developing
Ad hoc adversarial testing occurs when incidents prompt concern; no scheduled cadence exists.
Defined
A post-deployment adversarial testing schedule is defined, tiered by system risk level; test plans are documented and results are recorded.
Managed
Test results are compared against prior runs to track safety trajectory; regressions trigger remediation before the next release cycle.
Optimizing
Adversarial test suites update continuously with newly published attack techniques; automated regression tests run on every model update in production.
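The run-to-run comparison described at the Managed level can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the outcome scale (`fail`, `partial`, `pass`, scored from the defender's point of view), the function names, and the test-case identifiers are all assumptions introduced for the example.

```python
# Hypothetical outcome scale, ordered from worst to best for the defender.
OUTCOME_RANK = {"fail": 0, "partial": 1, "pass": 2}


def find_regressions(prior: dict[str, str], current: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Return test cases whose outcome is worse now than in the prior run.

    Cases that did not exist in the prior run are ignored here, because
    there is no baseline to regress from.
    """
    regressions = {}
    for case, now in current.items():
        before = prior.get(case)
        if before is not None and OUTCOME_RANK[now] < OUTCOME_RANK[before]:
            regressions[case] = (before, now)
    return regressions


def new_cases(prior: dict[str, str], current: dict[str, str]) -> list[str]:
    """Return test cases added since the prior run (for example, new attack techniques)."""
    return sorted(set(current) - set(prior))


# Made-up results for two consecutive test cycles.
prior_run = {
    "prompt-injection-basic": "pass",
    "data-exfiltration": "pass",
    "financial-advice-boundary": "partial",
}
current_run = {
    "prompt-injection-basic": "pass",
    "data-exfiltration": "partial",
    "financial-advice-boundary": "pass",
    "crescendo-multi-turn": "partial",
}

print(find_regressions(prior_run, current_run))  # {'data-exfiltration': ('pass', 'partial')}
print(new_cases(prior_run, current_run))         # ['crescendo-multi-turn']
```

Keeping new cases separate from regressions matters for reporting: a weak result on a newly added technique is a new finding, whereas a worse result on an existing case is a regression that should trigger remediation.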
Evidence Requirements
What an auditor or assessor would expect to see for this control.
- Post-deployment adversarial testing schedule with risk-tiered cadence documentation
- Test plans for each production system showing attack categories covered
- Executed test reports with findings, severity ratings, and remediation records
- Comparison reports showing safety trajectory across test cycles
- Evidence that test suites are updated with newly published attack techniques
Implementation Notes
Key steps
- Establish a testing cadence tied to risk tier: at minimum, test Critical systems quarterly, Significant systems semi-annually, and Limited systems annually.
- Separate post-deployment test suites from pre-deployment suites — post-deployment testing should include attack patterns that have emerged since initial deployment.
- Include in each test: prompt injection, jailbreak attempts using techniques published since the last test, data exfiltration attempts, and output manipulation targeting your specific use case.
- For high-risk sectors (healthcare, financial services), include sector-specific adversarial scenarios relevant to your regulatory environment.
- Treat adversarial test reports as board-reportable for Critical systems — if a new attack pattern succeeds against a production system, that is a governance event.
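The cadence and coverage steps above could be automated along these lines. This is a sketch under stated assumptions: the tier-to-interval mapping follows the minimum cadence given in the key steps, while the function names and the category identifiers are hypothetical labels chosen for the example.

```python
import calendar
from datetime import date

# Maximum interval between tests per risk tier, in months (from the key steps above).
CADENCE_MONTHS = {"Critical": 3, "Significant": 6, "Limited": 12}

# Hypothetical identifiers for the attack categories every test should include.
REQUIRED_CATEGORIES = {
    "prompt_injection",
    "jailbreak_new_techniques",
    "data_exfiltration",
    "output_manipulation",
}


def add_months(d: date, months: int) -> date:
    """Shift a date forward by whole months, clamping to the last day of the target month."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def next_test_due(tier: str, last_test: date) -> date:
    """Latest date by which the next adversarial test should run for this tier."""
    return add_months(last_test, CADENCE_MONTHS[tier])


def is_overdue(tier: str, last_test: date, today: date) -> bool:
    """True if the system has gone longer than its tier allows without a test."""
    return today > next_test_due(tier, last_test)


def missing_categories(plan_categories: set[str]) -> list[str]:
    """Required attack categories that a test plan does not cover."""
    return sorted(REQUIRED_CATEGORIES - plan_categories)


print(next_test_due("Critical", date(2026, 5, 28)))
print(is_overdue("Limited", date(2025, 1, 15), date(2026, 3, 1)))
print(missing_categories({"prompt_injection", "data_exfiltration"}))
```

Calendar months are used instead of a fixed day count so that "quarterly" lands on the same day of the month each cycle. An organization that defines its cadence in days would replace `add_months` with a `timedelta`.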
Example Implementation
Customer-facing AI assistant for a financial services firm
Post-Deployment Adversarial Test Report — Q2 2026
System: Customer Service AI Assistant (Critical tier)
Test date: 2026-05-28 | Prior test: 2026-02-19
New attack techniques tested since last cycle:
- Many-shot jailbreaking (published Feb 2026): PASSED (attack blocked by guardrails)
- Crescendo multi-turn escalation: PARTIAL PASS (escalation blocked at turn 4, not turn 2 as expected)
- Financial advice boundary probing (sector-specific): PASSED
Regressions vs. prior test: None identified.
New finding: Crescendo resistance weaker than expected (escalation blocked at turn 4, not turn 2). Guardrail threshold requires tuning.
Remediation: Guardrail update deployed 2026-05-30. Retest scheduled 2026-06-05.
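A report like the one above is easier to compare across cycles and to audit if it is also kept as a structured record. The sketch below encodes the same report. The class names, field names, and outcome labels (`pass`, `partial`, `fail`, scored from the system's perspective) are assumptions made for illustration, not a required schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class TechniqueResult:
    technique: str
    outcome: str  # "pass", "partial", or "fail", from the system's perspective
    note: str = ""


@dataclass
class AdversarialTestReport:
    system: str
    tier: str
    test_date: date
    prior_test_date: date
    results: list[TechniqueResult] = field(default_factory=list)
    remediation_date: Optional[date] = None
    retest_date: Optional[date] = None

    def open_findings(self) -> list[TechniqueResult]:
        """Results that were anything short of a full pass."""
        return [r for r in self.results if r.outcome != "pass"]

    def is_complete(self) -> bool:
        """A report with findings should record both a remediation and a retest date."""
        if not self.open_findings():
            return True
        return self.remediation_date is not None and self.retest_date is not None


report = AdversarialTestReport(
    system="Customer Service AI Assistant",
    tier="Critical",
    test_date=date(2026, 5, 28),
    prior_test_date=date(2026, 2, 19),
    results=[
        TechniqueResult("Many-shot jailbreaking", "pass", "attack blocked by guardrails"),
        TechniqueResult("Crescendo multi-turn escalation", "partial", "blocked at turn 4, expected turn 2"),
        TechniqueResult("Financial advice boundary probing", "pass"),
    ],
    remediation_date=date(2026, 5, 30),
    retest_date=date(2026, 6, 5),
)

print([r.technique for r in report.open_findings()])
print(report.is_complete())
```

A completeness check of this kind gives an assessor a quick way to confirm that every finding has a remediation record and a scheduled retest attached.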
