Human Approval Gate for Irreversible Agent Actions
Require explicit human approval before an AI agent takes actions that are difficult or impossible to reverse, such as sending communications, modifying records, executing transactions, or deleting data.
Objective
Ensure humans retain control over consequential agent actions, preventing costly or harmful mistakes that cannot be undone automatically.
Maturity Levels
Initial
Agents execute all actions autonomously without approval gates.
Developing
Some high-risk actions require approval but the list is incomplete and not systematically maintained.
Defined
A documented list of action types requiring approval is enforced at the agent framework level, not just by convention.
Managed
Approval gate effectiveness is monitored; approval latency is tracked; queued actions that expire without approval are flagged.
Optimizing
The list of gated actions is continuously refined based on incident data; low-risk approved actions are progressively delegated back to the agent.
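The monitoring described at the Managed level can be sketched roughly as follows. This is a minimal illustration, not a prescribed implementation: `ApprovalRecord`, `summarize`, and the field names are hypothetical, and a real deployment would read these records from its own approval queue.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Any, Dict, List, Optional


@dataclass
class ApprovalRecord:
    """One approval request (hypothetical schema)."""
    action: str
    requested_at: datetime
    decided_at: Optional[datetime] = None  # None while no reviewer has decided


def summarize(records: List[ApprovalRecord], now: datetime,
              timeout: timedelta) -> Dict[str, Any]:
    """Compute median approval latency and flag requests that expired undecided."""
    latencies = [
        (r.decided_at - r.requested_at).total_seconds()
        for r in records
        if r.decided_at is not None
    ]
    expired = [
        r for r in records
        if r.decided_at is None and now - r.requested_at > timeout
    ]
    return {
        "median_latency_s": median(latencies) if latencies else None,
        "expired": expired,
    }
```

Flagged entries in `expired` are the ones a reviewer or owner would follow up on, either by re-queuing the action or closing it.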
Evidence Requirements
What an auditor or assessor would expect to see for this control.
- Versioned, approved list of gated action types with rationale for inclusion
- Framework-level gate implementation evidence (code review, configuration) confirming gates are enforced before tool execution, not solely by model instruction
- Approval queue records showing pending, approved, and expired approval requests with timestamps
- Records of actions cancelled after timeout expiry and how they were re-queued or closed
- Reviewer UI specification or screenshot evidence confirming full action context is presented before approval
Implementation Notes
Key steps
- Define irreversible action categories before deployment: sending messages, creating/modifying/deleting records, financial transactions, code deployments, and external API calls that trigger real-world effects.
- Implement gates at the framework level (before tool execution), not as instructions to the model. Model-level instructions can be overridden by prompt injection or by the model's own reasoning errors.
- Design approval UIs to show the full action context, not just the action itself, so reviewers can assess whether it is appropriate.
- Set a timeout for pending approvals. An approval that sits while the surrounding context changes is dangerous, so require re-confirmation after a defined window.
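The framework-level gate and the approval timeout from the steps above can be sketched as a thin wrapper around tool execution. This is a minimal sketch under stated assumptions: `GatedExecutor`, `PendingAction`, `ApprovalRequired`, `GATED_ACTIONS`, and `APPROVAL_WINDOW` are hypothetical names not tied to any particular agent framework, and the action names and window length are illustrative only.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Any, Callable, Dict, Optional

# Illustrative values; a real deployment would load these from versioned config.
GATED_ACTIONS = {"send_email", "delete_record"}
APPROVAL_WINDOW = timedelta(hours=2)


class ApprovalRequired(Exception):
    """Raised when a gated action lacks a current human approval."""


@dataclass
class PendingAction:
    action: str
    args: Dict[str, Any]
    requested_at: datetime
    approved_at: Optional[datetime] = None


class GatedExecutor:
    """Runs tools, checking the gate in code before any tool is called."""

    def __init__(self, tools: Dict[str, Callable[..., Any]]) -> None:
        self._tools = tools

    def request(self, action: str, args: Dict[str, Any],
                now: datetime) -> PendingAction:
        # Queue the exact action and arguments for a human reviewer.
        return PendingAction(action, args, requested_at=now)

    def approve(self, pending: PendingAction, now: datetime) -> None:
        # A reviewer cannot approve a request that has already gone stale.
        if now - pending.requested_at > APPROVAL_WINDOW:
            raise ApprovalRequired("approval window expired; re-confirm")
        pending.approved_at = now

    def execute(self, action: str, args: Dict[str, Any], now: datetime,
                approval: Optional[PendingAction] = None) -> Any:
        if action in GATED_ACTIONS:
            valid = (
                approval is not None
                and approval.approved_at is not None
                and approval.action == action
                and approval.args == args  # approval covers these exact args
                and now - approval.requested_at <= APPROVAL_WINDOW
            )
            if not valid:
                raise ApprovalRequired(f"{action} requires a current approval")
        return self._tools[action](**args)
```

Because the check lives in `execute` rather than in the prompt, nothing the model outputs can skip it; the model can only produce a request that a human then approves or ignores.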
Example Implementation
Marketing team using an AI agent to draft and send campaign emails and update CRM records
Irreversible Action Gate Configuration — Campaign Agent
Gated action types (human approval required before execution):
- Send email to external recipients
- Mass-update CRM contact records (> 10 records)
- Create or modify campaign workflows
- Publish content to external channels
- Delete any records
Non-gated actions (agent may execute autonomously):
- Read CRM data
- Draft email copy (stored as draft, not sent)
- Generate audience segment previews
- Create internal notes
Approval timeout: 2 hours. If approval is not confirmed within 2 hours, the action is cancelled and re-queued for the next business day.
Approval UI requirement: Reviewer must see the full action context (recipient list preview, email content, affected records) before confirming; summary-only views are not permitted
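One way to express the gated and non-gated lists above as a machine-checkable policy is sketched below. The action identifiers, the `record_ids` parameter, and `requires_approval` are hypothetical. Treating unlisted actions as gated (fail closed) is an added assumption, not something the configuration above specifies.

```python
from typing import Any, Dict

# Hypothetical identifiers mirroring the gated list above.
ALWAYS_GATED = {
    "send_external_email",
    "create_campaign_workflow",
    "modify_campaign_workflow",
    "publish_external_content",
    "delete_record",
}

# Hypothetical identifiers mirroring the non-gated list above.
NEVER_GATED = {
    "read_crm",
    "draft_email",
    "preview_audience_segment",
    "create_internal_note",
}

MASS_UPDATE_THRESHOLD = 10  # gate CRM updates touching more than 10 records


def requires_approval(action: str, params: Dict[str, Any]) -> bool:
    """Return True if the action must wait for human approval."""
    if action in ALWAYS_GATED:
        return True
    if action == "update_crm_contacts":
        return len(params.get("record_ids", [])) > MASS_UPDATE_THRESHOLD
    if action in NEVER_GATED:
        return False
    # Assumption: actions missing from both lists fail closed.
    return True
```

For example, `requires_approval("update_crm_contacts", {"record_ids": ids})` is false for ten records and true for eleven, matching the "> 10 records" rule.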
Control Details
- Control ID: AGT-005
- Domain: Agentic AI
- Typical owner: AI Governance Team / AI Engineering
- Implementation effort: Medium
- Agent-relevant: Yes
