Multi-Agent Trust Hierarchy
Define explicit rules for which agents can instruct, invoke, or delegate authority to other agents in multi-agent systems.
Objective
Prevent privilege escalation and unauthorized action chains in multi-agent architectures by enforcing a documented, auditable trust model.
Maturity Levels
Initial
No trust hierarchy exists; agents can invoke other agents without restriction.
Developing
Trust relationships are informally understood by the engineering team but not documented or enforced at runtime.
Defined
A documented trust model specifies which agents can instruct which others, with enforced runtime checks.
Managed
Inter-agent authorization decisions are logged and reviewed; anomalous delegation patterns trigger alerts.
Optimizing
Trust relationships are dynamically verified using cryptographic attestation; the trust model is reviewed after every architecture change.
Evidence Requirements
What an auditor or assessor would expect to see for this control.
- Documented trust model specifying which agents may invoke which others and what actions each may accept from each caller
- Inter-agent invocation logs with sender ID, recipient ID, instruction summary, timestamp, and resulting action for a sample period
- Alert or investigation records for anomalous delegation patterns detected in production
- Architecture review records confirming the trust model was reviewed and updated after each architecture change
- Configuration or test evidence confirming agents cannot grant their own permissions to subagents without human or system approval
Implementation Notes
Key steps
- Treat inter-agent instructions with the same skepticism as external user inputs: an orchestrator agent can be compromised or manipulated through prompt injection, so subagents should not blindly execute its instructions.
- Implement an allowlist: each agent explicitly lists which other agents may invoke it and what actions it will accept from each.
- Log all inter-agent invocations with the initiating agent identity, the instruction passed, and the resulting action — this is essential for post-incident reconstruction.
- Avoid agents that can grant their own permissions to subagents; privilege escalation must require human or system approval.
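The allowlist step above could be sketched roughly as follows. This is a minimal illustration, not a prescribed implementation: `ALLOWLIST`, `is_allowed`, and the agent and action names are all hypothetical, and a real system would load the policy from reviewed configuration rather than hard-code it.

```python
# Hypothetical per-agent allowlist: recipient -> caller -> accepted actions.
# Agent and action names are illustrative assumptions, not part of the control.
ALLOWLIST: dict[str, dict[str, set[str]]] = {
    "code_writer": {"orchestrator": {"implement_feature"}},
    "test_runner": {"orchestrator": {"run_tests"}},
}


def is_allowed(caller: str, recipient: str, action: str) -> bool:
    """Return True only if the recipient explicitly lists this caller and action."""
    accepted = ALLOWLIST.get(recipient, {}).get(caller, set())
    return action in accepted  # anything not listed is denied by default


print(is_allowed("orchestrator", "code_writer", "implement_feature"))  # True
print(is_allowed("code_writer", "test_runner", "run_tests"))           # False
```

Keying the table by recipient keeps the decision with the agent being invoked, so a compromised caller cannot widen its own access by editing what it claims to be allowed to do.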
Example Implementation
Software engineering pipeline using an orchestrator and specialist sub-agents
Multi-Agent Trust Policy — Development Pipeline
| Agent | Role | May Invoke | May Accept Instructions From |
|---|---|---|---|
| Orchestrator | Task decomposition and assignment | Code Writer, Code Reviewer, Test Runner | Human operator only |
| Code Writer | Feature implementation | None | Orchestrator only |
| Code Reviewer | Diff review and feedback | None | Orchestrator only |
| Test Runner | Test execution and reporting | None | Orchestrator only |
Enforcement rules:
- No agent may accept task instructions from an agent at equal or lower trust level
- The Orchestrator may not expand its own permissions or those of sub-agents at runtime; any capability expansion requires human approval
- Any instruction that would cause an agent to act outside its permitted scope must halt execution and be routed to human review
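One way the policy table and the halt-and-escalate rule might be enforced at runtime is sketched below. The identifiers (`AgentPolicy`, `POLICY`, `check_invocation`, `HumanReviewRequired`) are hypothetical, and the "accepts from" column stands in for explicit trust levels, which this sketch does not model separately.

```python
from dataclasses import dataclass


class HumanReviewRequired(Exception):
    """Raised when an instruction falls outside policy and must be escalated."""


@dataclass(frozen=True)
class AgentPolicy:
    may_invoke: frozenset[str]
    accepts_from: frozenset[str]


# Mirrors the policy table above; the identifiers are illustrative assumptions.
POLICY = {
    "orchestrator": AgentPolicy(
        may_invoke=frozenset({"code_writer", "code_reviewer", "test_runner"}),
        accepts_from=frozenset({"human_operator"}),
    ),
    "code_writer": AgentPolicy(frozenset(), frozenset({"orchestrator"})),
    "code_reviewer": AgentPolicy(frozenset(), frozenset({"orchestrator"})),
    "test_runner": AgentPolicy(frozenset(), frozenset({"orchestrator"})),
}


def check_invocation(sender: str, recipient: str) -> None:
    """Halt (by raising) unless both sides of the policy permit the call."""
    target = POLICY.get(recipient)
    if target is None or sender not in target.accepts_from:
        raise HumanReviewRequired(
            f"{recipient} does not accept instructions from {sender}"
        )
    source = POLICY.get(sender)
    # The human operator is outside the agent table, so only agents are checked.
    if source is not None and recipient not in source.may_invoke:
        raise HumanReviewRequired(f"{sender} may not invoke {recipient}")


check_invocation("orchestrator", "code_writer")  # permitted, returns None
```

Because the policy objects are frozen, an agent holding a reference to `POLICY` entries cannot mutate them in place; changing the table itself would still need to go through whatever human approval process the deployment defines.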
Audit requirement: All inter-agent invocations logged with sender_id, recipient_id, instruction summary, timestamp, and resulting action
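A log entry carrying the fields named in the audit requirement could look something like the sketch below. The function name `audit_record`, the JSON-lines format, and the sample values are assumptions made for illustration; the control itself only specifies which fields must be captured.

```python
import json
from datetime import datetime, timezone


def audit_record(sender_id: str, recipient_id: str,
                 instruction_summary: str, resulting_action: str) -> str:
    """Build one JSON line carrying the fields named in the audit requirement."""
    record = {
        "sender_id": sender_id,
        "recipient_id": recipient_id,
        "instruction_summary": instruction_summary,
        "timestamp": datetime.now(timezone.utc).isoformat(),  # UTC, ISO 8601
        "resulting_action": resulting_action,
    }
    return json.dumps(record, sort_keys=True)


# Hypothetical sample values for a single invocation.
line = audit_record("orchestrator", "test_runner", "run unit tests", "tests_executed")
print(line)
```

Emitting one self-describing line per invocation makes it straightforward to pull a sample period for an assessor or to replay a delegation chain after an incident.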
Control Details
- Control ID: AGT-004
- Domain: Agentic AI
- Typical owner: AI Engineering / CISO
- Implementation effort: High
- Agent-relevant: Yes
