Module 10

Compliance & Escalation: Policy, Pricing, and the Human Handoff

This module establishes the guardrails of the system: how architects manage data-protection requirements, regional inference constraints, pricing implications, and the exact moment control must transition from an autonomous agent to a human operator.

Answer key Module10_Complete.ipynb

1. Zero Data Retention (ZDR) & Compliance Matrix

Compliance is determined per feature. Architects must know which capabilities keep a workload ZDR-compliant and which break it.

1a. ZDR-Eligible Features

  • Adaptive Thinking
  • Citations
  • Structured Outputs
  • Standard Web Search/Fetch
  • Compaction

1b. Non-Eligible Features

  • Message Batches, because batch inputs and outputs are stored for 29 days.
  • Files API
  • Code Execution
  • MCP Connector

1c. HIPAA & Schema Caching

PHI must never appear in JSON schema definitions, tool names, property names, or property descriptions because schemas are cached separately from message content. Field names should be generic: use record_id, not patient_ssn.

2. Regional Logic & The 1.1x Multiplier

Data residency requirements affect both compute location and billing.

2a. The Multiplier

Setting inference_geo: "us" restricts compute to US-based infrastructure but incurs a 1.1x pricing multiplier on all token categories for models starting with Claude Opus 4.6.

2b. Capacity Impact

The multiplier also applies to Priority Tier burndown. Each token processed draws down 1.1 tokens from committed capacity.

2c. Optimization Strategy

For cost-sensitive high-volume tasks, use Haiku 4.5, which is currently unaffected by this multiplier, and reserve Opus for strategic reasoning turns.

3. Explicit Escalation: Triggers as Policy

Escalation should be driven by written policy, not model vibes or sentiment alone.

3a. Hard Triggers

  • Customer requests: any explicit request for a human must be honored immediately without re-routing or attempting further autonomous resolution.
  • Policy gaps or silence: if a request touches an area where the provided policy is ambiguous or silent, such as a refund-window exception or competitor price match, the agent must escalate rather than improvising a new policy.
  • High-stakes actions: use the PreToolUse hooks from Module 9 to catch and escalate irreversible actions before they execute.
System prompt policy fragment
Escalate immediately when:
- The customer asks for a human, manager, representative, or supervisor.
- The policy does not explicitly cover the requested exception.
- The requested action is irreversible, regulated, or above the configured approval threshold.

4. The Structured Handoff Protocol

A blind handoff frustrates users. A structured payload ensures the human operator can resolve the issue quickly.

The final act of the agent is to call an escalate_to_human tool with a strict schema.

JSON (strict handoff schema)
{
  "name": "escalate_to_human",
  "description": "Transfer control to a human operator with enough context to resolve the issue without repeating failed steps.",
  "strict": true,
  "input_schema": {
    "type": "object",
    "properties": {
      "customer_id": { "type": "string" },
      "root_cause": { "type": "string" },
      "actions_attempted": {
        "type": "array",
        "items": { "type": "string" }
      },
      "recommended_next_step": { "type": "string" }
    },
    "required": [
      "customer_id",
      "root_cause",
      "actions_attempted",
      "recommended_next_step"
    ],
    "additionalProperties": false
  }
}

Lab Exercise: Designing the Compliance Guardrail

Self-driven lab Module10_Self_Driven_Lab.ipynb

Objective: master ZDR auditing, US-only inference configuration, deterministic escalation policy, and structured handoff protocols.

  1. ZDR audit: evaluate the Module 12 capstone architecture. Identify every component that breaks ZDR, such as Message Batches, Files API, Code Execution, or MCP Connector, and propose a synchronous-only alternative for a regulated client.
  2. Regional configuration: set up a Messages API request with inference_geo: "us". Calculate the total token cost of a 1,000-token input/output turn using the 1.1x multiplier.
  3. Escalation logic: add a policy-silent trigger to the system prompt. Test it by asking for a competitor price match and verify the agent escalates rather than inventing a discount.
  4. Handoff payload: implement the escalate_to_human tool using strict: true. Ensure it captures customer_id, root_cause, actions_attempted, and recommended_next_step.
  5. Schema hardening: review tool definitions for PHI. Rename properties that contain sensitive identifiers to generic versions to prevent schema-caching risks.

Exam tip: ZDR, data residency, HIPAA schema safety, and escalation are separate decisions. A workflow can satisfy one and fail another.