AI Incident Response Playbook
Document step-by-step procedures for identifying, containing, investigating, and resolving AI system incidents, including role assignments and escalation paths.
Objective
Enable fast, coordinated, and consistent responses to AI incidents by ensuring response procedures are documented and tested before incidents occur.
Maturity Levels
Initial
No AI incident response playbook exists; response is improvised.
Developing
General IT incident response procedures are used for AI incidents without AI-specific guidance.
Defined
An AI-specific incident response playbook covers detection, triage, containment, notification, and resolution for key incident types.
Managed
The playbook is tested annually through tabletop exercises; gaps identified during exercises are resolved.
Optimizing
The playbook is updated after every significant incident; response-time metrics are tracked and improved.
Evidence Requirements
What an auditor or assessor would expect to see for this control.
- AI incident response playbook document with role assignments, communication templates, and step-by-step response procedures by severity tier
- Tabletop exercise or drill records showing the playbook was tested with key responders
- Post-incident review records for actual incidents confirming the playbook was followed and documenting any deviations
- Playbook review and update records showing it is revised after incidents and at least annually
- Role assignment records confirming all named responders have acknowledged their responsibilities
Implementation Notes
Key steps
- Build the playbook around your incident taxonomy (IRC-001) — each incident type may require a different response sequence.
- Include model containment procedures: when should the model be taken offline vs. limited vs. monitored? The decision framework should be pre-defined.
- Define cross-functional response roles: AI incidents often require coordinated response from engineering, legal, privacy, communications, and business operations.
- Include regulatory notification procedures with clock-start definitions — ambiguity about when a notification obligation is triggered is a common compliance failure.
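The containment step above calls for a pre-defined decision framework (offline vs. limited vs. monitored). The sketch below is one minimal, illustrative way to encode such a framework; the function name, severity labels, and decision inputs are assumptions, not part of any standard.

```python
from enum import Enum

class Containment(Enum):
    OFFLINE = "take model offline"
    LIMITED = "limit model for affected segment; use fallback"
    MONITORED = "keep model live under enhanced monitoring"

def containment_action(severity: str, harm_confirmed: bool,
                       scope_isolated: bool) -> Containment:
    """Illustrative containment decision framework.

    severity: incident tier ("P1" highest .. "P4" lowest)
    harm_confirmed: individual harm has been observed
    scope_isolated: impact is limited to an identifiable segment
    """
    if severity == "P1" or (harm_confirmed and not scope_isolated):
        # Broad impact or confirmed harm: stop serving entirely
        return Containment.OFFLINE
    if severity == "P2" or scope_isolated:
        # Restrict the affected segment; keep the rest of the model live
        return Containment.LIMITED
    # Low severity, no confirmed harm: watch without disrupting service
    return Containment.MONITORED
```

Under this sketch, a P2 incident isolated to one demographic segment with no confirmed individual harm resolves to `LIMITED`, which matches the containment decision in the example below: restrict AI decision-making for the affected segment rather than rolling back the whole model.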
Example Implementation
Enterprise SaaS company responding to a confirmed P2 AI incident (discriminatory output pattern detected)
AI Incident Response — P2 Discrimination Pattern (IRC-INC-0031)
Detection: MON-003 bias monitoring flagged 8.3 pp approval rate disparity for Age 18–25 group (threshold: 5 pp) at 14:20 UTC
Response timeline:
| Time | Action | Owner |
|---|---|---|
| 14:20 | Automated alert fired; IRC-INC-0031 created | Monitoring system |
| 14:35 | AI Lead confirms alert is valid; classifies P2 | AI Lead |
| 14:40 | Model output rate-limited for affected segment pending investigation | Engineering on-call |
| 15:00 | Legal and Compliance notified | AI Lead |
| 15:30 | Root cause identified: training data underrepresentation of age group | ML Engineer |
| 16:00 | Mitigation deployed: fallback to rule-based decision for 18–25 segment | Engineering |
| T+2 days | Post-incident review scheduled | AI Governance Team |
| T+5 days | Model retrain plan approved | ML Lead |
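The Optimizing maturity level calls for tracking response-time metrics; these can be computed directly from timestamps like those in the timeline above. A minimal sketch (helper name and metric names are illustrative):

```python
from datetime import datetime

def minutes_between(t0: str, t1: str) -> int:
    """Elapsed minutes between two same-day HH:MM timestamps."""
    fmt = "%H:%M"
    delta = datetime.strptime(t1, fmt) - datetime.strptime(t0, fmt)
    return int(delta.total_seconds() // 60)

# Timestamps from the timeline above (all same UTC day)
detected, contained, mitigated = "14:20", "14:40", "16:00"

time_to_containment = minutes_between(detected, contained)  # 20 minutes
time_to_mitigation = minutes_between(detected, mitigated)   # 100 minutes
```

Tracking these two durations per incident gives the trend data needed to show response times are improving between exercises and real incidents.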
Containment decision criteria used: limit AI decision-making for the affected demographic until the disparity is resolved; do not roll back the entire model, as other segments are unaffected
Regulatory notification assessment: documented; the disparity was identified and mitigated within the same business day, no individual harm was confirmed, and no regulatory notification obligation was triggered
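The detection step in the example above compares an approval-rate disparity (in percentage points) against a pre-set alert threshold. The sketch below shows one way such a check might work; the function name and the segment approval rates are hypothetical, chosen only so the arithmetic reproduces the 8.3 pp figure from the example.

```python
def approval_disparity_pp(rates: dict[str, float], segment: str) -> float:
    """Disparity, in percentage points, between one segment's approval
    rate and the mean approval rate of all other segments."""
    others = [r for s, r in rates.items() if s != segment]
    baseline = sum(others) / len(others)
    return abs(baseline - rates[segment]) * 100

ALERT_THRESHOLD_PP = 5.0  # threshold from the example above

# Hypothetical per-segment approval rates (fractions, not percentages)
rates = {"18-25": 0.612, "26-40": 0.695, "41-60": 0.700, "60+": 0.690}

disparity = approval_disparity_pp(rates, "18-25")  # ~8.3 pp
if disparity > ALERT_THRESHOLD_PP:
    pass  # fire the monitoring alert and open an incident record
```

The key design point is that the threshold and the comparison basis (segment vs. rest-of-population mean) are fixed ahead of time in the monitoring configuration, so triage starts from an unambiguous trigger rather than a judgment call.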
Control Details
- Control ID
- IRC-002
- Domain
- Incident Response
- Typical owner
- AI Governance Team / CISO / Legal
- Implementation effort
- Medium effort
- Agent-relevant
- Yes
