Compliance & Escalation: Policy, Pricing, and the Human Handoff
This module establishes the guardrails of the system: how architects manage data-protection requirements, regional inference constraints, pricing implications, and the exact moment control must transition from an autonomous agent to a human operator.
1. Zero Data Retention (ZDR) & Compliance Matrix
Compliance is determined per feature. Architects must know which capabilities keep a workload ZDR-compliant and which break it.
1a. ZDR-Eligible Features
- Adaptive Thinking
- Citations
- Structured Outputs
- Standard Web Search/Fetch
- Compaction
1b. Non-Eligible Features
- Message Batches, because batch inputs and outputs are stored for 29 days.
- Files API
- Code Execution
- MCP Connector
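The two lists above lend themselves to a mechanical audit. The following is a minimal sketch, assuming each feature is tracked under a short label; the labels, the `audit_zdr` helper, and the `some_new_feature` entry are illustrative and are not official API names.

```python
# Feature labels mirror sections 1a and 1b; they are illustrative, not API identifiers.
ZDR_ELIGIBLE = {
    "adaptive_thinking", "citations", "structured_outputs",
    "web_search", "web_fetch", "compaction",
}
ZDR_NON_ELIGIBLE = {"message_batches", "files_api", "code_execution", "mcp_connector"}


def audit_zdr(features):
    """Return (violations, unknown) for the features a workload uses.

    violations: features listed as non-eligible in section 1b.
    unknown: features in neither list, which need a manual review.
    """
    used = set(features)
    violations = sorted(used & ZDR_NON_ELIGIBLE)
    unknown = sorted(used - ZDR_ELIGIBLE - ZDR_NON_ELIGIBLE)
    return violations, unknown


violations, unknown = audit_zdr(
    ["citations", "files_api", "message_batches", "some_new_feature"]
)
print("Breaks ZDR:", violations)
print("Needs review:", unknown)
```

Treating unlisted features as "needs review" rather than as compliant keeps the audit conservative when a new capability is added to the workload.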
1c. HIPAA & Schema Caching
PHI must never appear in JSON schema definitions, tool names, property names, or property descriptions because schemas are cached separately from message content. Field names should be generic: use record_id, not patient_ssn.
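One way to catch risky names before a tool definition ships is a simple scan. This is a sketch only: the `SENSITIVE_TERMS` deny-list and the `find_sensitive_names` helper are hypothetical, the check covers tool and property names written in snake_case or kebab-case, and descriptions would still need a separate review.

```python
import re

# Hypothetical deny-list; a real deployment would maintain its own terms.
SENSITIVE_TERMS = ("ssn", "patient", "dob", "diagnosis", "mrn")


def find_sensitive_names(tool):
    """Return the tool name and property names that contain a sensitive term."""
    flagged = []

    def check(name):
        # Split on non-alphanumerics so "patient_ssn" yields "patient" and "ssn".
        tokens = re.split(r"[^a-z0-9]+", name.lower())
        if any(tok in SENSITIVE_TERMS for tok in tokens):
            flagged.append(name)

    def walk(schema):
        for prop, sub in schema.get("properties", {}).items():
            check(prop)
            if isinstance(sub, dict):
                walk(sub)  # nested objects
                items = sub.get("items")
                if isinstance(items, dict):
                    walk(items)  # objects inside arrays

    check(tool.get("name", ""))
    walk(tool.get("input_schema", {}))
    return flagged


example_tool = {
    "name": "lookup_record",
    "input_schema": {
        "type": "object",
        "properties": {
            "patient_ssn": {"type": "string"},
            "record_id": {"type": "string"},
        },
    },
}
print(find_sensitive_names(example_tool))
```

In this example only patient_ssn is flagged; renaming it to a generic name such as record_id clears the scan.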
2. Regional Logic & The 1.1x Multiplier
Data residency requirements affect both compute location and billing.
2a. The Multiplier
Setting inference_geo: "us" restricts compute to US-based infrastructure but incurs a 1.1x pricing multiplier on all token categories. The multiplier applies to Claude Opus 4.6 and newer models.
2b. Capacity Impact
The multiplier also applies to Priority Tier burndown. Each token processed draws down 1.1 tokens from committed capacity.
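The arithmetic in 2a and 2b can be sketched as follows. The per-million-token prices are placeholders chosen for easy arithmetic, not actual rates, and the function names are hypothetical; only the 1.1x multiplier comes from the text above.

```python
US_ONLY_MULTIPLIER = 1.1  # from section 2a


def turn_cost(input_tokens, output_tokens,
              input_price_per_mtok, output_price_per_mtok, us_only):
    """Cost of one turn in currency units, with the US-only multiplier if set."""
    multiplier = US_ONLY_MULTIPLIER if us_only else 1.0
    base = (input_tokens * input_price_per_mtok
            + output_tokens * output_price_per_mtok) / 1_000_000
    return base * multiplier


def priority_tier_burndown(tokens, us_only):
    """Tokens drawn from committed capacity for the given tokens processed."""
    return tokens * (US_ONLY_MULTIPLIER if us_only else 1.0)


# Placeholder prices: 10.00 per million input tokens, 50.00 per million output tokens.
global_cost = turn_cost(1_000, 1_000, 10.0, 50.0, us_only=False)
us_cost = turn_cost(1_000, 1_000, 10.0, 50.0, us_only=True)
print(f"global: {global_cost:.4f}  us-only: {us_cost:.4f}")
print(f"burndown for 1,000 tokens: {priority_tier_burndown(1_000, True):.0f}")
```

With these placeholder prices a 1,000-token input and 1,000-token output turn costs 0.0600 without the restriction and 0.0660 with it, and 1,000 processed tokens consume 1,100 tokens of committed capacity.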
2c. Optimization Strategy
For cost-sensitive high-volume tasks, use Haiku 4.5, which is currently unaffected by this multiplier, and reserve Opus for strategic reasoning turns.
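A routing table is one simple way to apply this strategy. The sketch below assumes two task kinds, "bulk" and "strategic"; the task kinds, the model labels, and the `pick_model` helper are illustrative and are not API model IDs.

```python
# Labels are illustrative, not API model IDs. The multipliers reflect sections
# 2a and 2c when inference_geo is set to "us".
ROUTES = {
    "bulk": ("haiku-4.5", 1.0),       # high-volume, cost-sensitive work
    "strategic": ("opus-4.6", 1.1),   # reasoning-heavy turns
}


def pick_model(task_kind):
    """Return (model_label, us_only_multiplier) for a task kind."""
    if task_kind not in ROUTES:
        raise ValueError(f"no route configured for task kind: {task_kind!r}")
    return ROUTES[task_kind]


for kind in ("bulk", "strategic"):
    model, multiplier = pick_model(kind)
    print(f"{kind}: {model} (us-only multiplier {multiplier})")
```

Raising on an unknown task kind, instead of silently defaulting to the more expensive model, keeps cost decisions explicit.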
3. Explicit Escalation: Triggers as Policy
Escalation should be driven by written policy, not by the model's impression of the conversation or by sentiment alone.
3a. Hard Triggers
- Customer requests: any explicit request for a human must be honored immediately without re-routing or attempting further autonomous resolution.
- Policy gaps or silence: if a request touches an area where the provided policy is ambiguous or silent, such as a refund-window exception or competitor price match, the agent must escalate rather than improvising a new policy.
- High-stakes actions: use the PreToolUse hooks from Module 9 to catch and escalate irreversible actions before they execute.
Escalate immediately when:
- The customer asks for a human, manager, representative, or supervisor.
- The policy does not explicitly cover the requested exception.
- The requested action is irreversible, regulated, or above the configured approval threshold.
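The checklist above can be expressed as a deterministic check that runs before the agent acts. This is a sketch under stated assumptions: the `Turn` fields, the reason strings, and the keyword pattern are all hypothetical, and keyword matching is a crude stand-in for real intent detection.

```python
import re
from dataclasses import dataclass

# Crude keyword match for an explicit request; illustrative only.
HUMAN_REQUEST = re.compile(
    r"\b(human|manager|representative|supervisor)\b", re.IGNORECASE
)


@dataclass
class Turn:
    customer_message: str
    policy_covers_request: bool  # hypothetical result of a policy lookup
    action_irreversible: bool
    action_regulated: bool
    action_amount: float
    approval_threshold: float


def escalation_reason(turn):
    """Return the first matching hard trigger, or None if the agent may proceed."""
    if HUMAN_REQUEST.search(turn.customer_message):
        return "customer_requested_human"
    if not turn.policy_covers_request:
        return "policy_gap"
    if (turn.action_irreversible or turn.action_regulated
            or turn.action_amount > turn.approval_threshold):
        return "high_stakes_action"
    return None


print(escalation_reason(Turn("Can I speak to a manager?", True, False, False, 0, 100)))
print(escalation_reason(Turn("Will you match a competitor's price?", False, False, False, 0, 100)))
print(escalation_reason(Turn("Where is my order?", True, False, False, 0, 100)))
```

The customer request is checked first so that an explicit request for a human is never overridden by the other conditions.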
4. The Structured Handoff Protocol
A blind handoff frustrates users. A structured payload ensures the human operator can resolve the issue quickly.
The final act of the agent is to call an escalate_to_human tool with a strict schema.
{
  "name": "escalate_to_human",
  "description": "Transfer control to a human operator with enough context to resolve the issue without repeating failed steps.",
  "strict": true,
  "input_schema": {
    "type": "object",
    "properties": {
      "customer_id": { "type": "string" },
      "root_cause": { "type": "string" },
      "actions_attempted": {
        "type": "array",
        "items": { "type": "string" }
      },
      "recommended_next_step": { "type": "string" }
    },
    "required": [
      "customer_id",
      "root_cause",
      "actions_attempted",
      "recommended_next_step"
    ],
    "additionalProperties": false
  }
}
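On the receiving side, the operator tooling may still want to check a payload before queuing it. The following sketch mirrors the schema above using only the standard library; the `validate_handoff` helper, the error strings, and the sample values are hypothetical.

```python
# Field names and types mirror the escalate_to_human input_schema.
REQUIRED_FIELDS = {
    "customer_id": str,
    "root_cause": str,
    "actions_attempted": list,
    "recommended_next_step": str,
}


def validate_handoff(payload):
    """Return a list of problems; an empty list means the payload is well-formed."""
    errors = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in payload:
            errors.append(f"missing: {field}")
        elif not isinstance(payload[field], expected_type):
            errors.append(f"wrong type: {field}")
    # additionalProperties is false, so any extra key is an error.
    for field in payload:
        if field not in REQUIRED_FIELDS:
            errors.append(f"unexpected: {field}")
    actions = payload.get("actions_attempted")
    if isinstance(actions, list) and not all(isinstance(a, str) for a in actions):
        errors.append("wrong type: actions_attempted items")
    return errors


good = {
    "customer_id": "C-1042",
    "root_cause": "Refund request falls outside the written refund window.",
    "actions_attempted": ["Checked order status", "Reviewed refund policy"],
    "recommended_next_step": "Decide whether to grant a refund-window exception.",
}
bad = {"customer_id": "C-1042", "actions_attempted": [], "recommended_next_step": "", "mood": "angry"}

print(validate_handoff(good))
print(validate_handoff(bad))
```

The first call returns an empty list; the second reports the missing root_cause and the unexpected mood field.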
Lab Exercise: Designing the Compliance Guardrail
Self-driven lab: Module10_Self_Driven_Lab.ipynb
Objective: master ZDR auditing, US-only inference configuration, deterministic escalation policy, and structured handoff protocols.
- ZDR audit: evaluate the Module 12 capstone architecture. Identify every component that breaks ZDR, such as Message Batches, Files API, Code Execution, or MCP Connector, and propose a synchronous-only alternative for a regulated client.
- Regional configuration: set up a Messages API request with inference_geo: "us". Calculate the total token cost of a 1,000-token input/output turn using the 1.1x multiplier.
- Escalation logic: add a policy-silent trigger to the system prompt. Test it by asking for a competitor price match and verify the agent escalates rather than inventing a discount.
- Handoff payload: implement the escalate_to_human tool using strict: true. Ensure it captures customer_id, root_cause, actions_attempted, and recommended_next_step.
- Schema hardening: review tool definitions for PHI. Rename properties that contain sensitive identifiers to generic versions to prevent schema-caching risks.
Exam tip: ZDR, data residency, HIPAA schema safety, and escalation are separate decisions. A workflow can satisfy one and fail another.