{
    "cells":  [
                  {
                      "cell_type":  "markdown",
                      "metadata":  {

                                   },
                      "source":  [
                                     "# Reliability \u0026 Deterministic Enforcement\n",
                                     "\n",
                                     "This module focuses on Agent SDK hooks and structured error contracts that guarantee compliance and normalization even when the model\u0027s reasoning deviates from its instructions.\n",
                                     "\n",
                                     "Prompts are probabilistic. Hooks, gates, and validators are deterministic. Use prompts to guide model behavior, but use runtime controls to enforce policy, prerequisites, and normalized tool results.\n",
                                     "\n",
                                     "## 1. Deterministic vs. Probabilistic Enforcement\n",
                                     "\n",
                                     "Prompts can ask the model not to issue a high-risk refund. A **PreToolUse** hook can block the refund before it executes. This distinction matters for financial, legal, privacy, safety, and irreversible operations.\n",
                                     "\n",
                                     "### 1a. PreToolUse Interception\n",
                                     "\n",
                                     "Use **PreToolUse** to intercept outgoing tool calls before execution. This is the only way to guarantee that high-value actions, such as refunds over $500, are blocked and redirected to human escalation regardless of the model\u0027s intent."
                                 ]
                  },
                  {
                      "cell_type":  "code",
                      "execution_count":  null,
                      "metadata":  {

                                   },
                      "outputs":  [

                                  ],
                      "source":  [
                                     "REFUND_LIMIT = 500\n",
                                     "\n",
                                     "def pre_tool_use(tool_call):\n",
                                     "    if tool_call[\"name\"] != \"issue_refund\":\n",
                                     "        return {\"decision\": \"allow\"}\n",
                                     "\n",
                                     "    amount = tool_call[\"arguments\"].get(\"amount_usd\", 0)\n",
                                     "    if amount \u003e REFUND_LIMIT:\n",
                                     "        return {\n",
                                     "            \"decision\": \"deny\",\n",
                                     "            \"redirectTool\": \"escalate_to_human\",\n",
                                     "            \"reason\": \"Refunds over $500 require human approval.\",\n",
                                     "        }\n",
                                     "\n",
                                     "    return {\"decision\": \"allow\"}"
                                 ]
                  },
                  {
                      "cell_type":  "markdown",
                      "metadata":  {

                                   },
                      "source":  [
                                     "### 1b. Programmatic Prerequisite Gates\n",
                                     "\n",
                                     "`tool_choice` can force the model to call an identification tool first, but a programmatic gate is what blocks downstream tools until the prerequisite has actually succeeded.\n",
                                     "\n",
                                     "For example, block `process_refund` until `get_customer` or `identity_verification` has returned a verified success status. A model can ignore a prompt; it cannot bypass a runtime gate.\n",
                                     "\n",
                                     "## 2. Standardized Structured Error Contracts\n",
                                     "\n",
                                     "Uniform prose errors like \"Operation failed\" are anti-patterns because they prevent the agent from making informed recovery decisions. Every tool should return the same structured fields when it fails.\n",
                                     "\n",
                                     "### 2a. The `errorCategory` Enum\n",
                                     "\n",
                                     "Standardize tool errors into four categories:\n",
                                     "\n",
                                     "- `TRANSIENT`: timeouts or service unavailability; Claude should attempt a retry.\n",
                                     "- `VALIDATION`: invalid input, such as a bad email format; the agent should clarify with the user.\n",
                                     "- `PERMISSION`: authorization issues; the agent should inform the user of the access gap.\n",
                                     "- `BUSINESS`: policy violations, such as refund exceeds limit; the agent should communicate the rule.\n",
                                     "\n",
                                     "### 2b. Metadata Over Prose\n",
                                     "\n",
                                     "Always return an `isRetryable` boolean. This prevents the agent from wasting tokens on repeated attempts for non-retryable permission, validation, or business-rule failures.\n",
                                     "\n",
                                     "```json\n",
                                     "{\n",
                                     "  \"isError\": true,\n",
                                     "  \"errorCategory\": \"TRANSIENT\",\n",
                                     "  \"isRetryable\": true,\n",
                                     "  \"message\": \"Refund system is temporarily offline.\"\n",
                                     "}\n",
                                     "```\n",
                                     "\n",
                                     "```json\n",
                                     "{\n",
                                     "  \"isError\": true,\n",
                                     "  \"errorCategory\": \"BUSINESS\",\n",
                                     "  \"isRetryable\": false,\n",
                                     "  \"message\": \"Refunds over $500 require human approval.\"\n",
                                     "}\n",
                                     "```\n",
                                     "\n",
                                     "## 3. PostToolUse Normalization for MCP\n",
                                     "\n",
                                     "Different MCP servers often return inconsistent data formats: Unix timestamps, local date strings, ISO strings, numeric status codes, or provider-specific labels.\n",
                                     "\n",
                                     "Implement **PostToolUse** hooks to rewrite heterogeneous tool results into a homogeneous shape before they reach the model. This reduces thinking-budget consumption because the model does not have to reconcile formats manually."
                                 ]
                  },
                  {
                      "cell_type":  "code",
                      "execution_count":  null,
                      "metadata":  {

                                   },
                      "outputs":  [

                                  ],
                      "source":  [
                                     "from datetime import datetime, timezone\n",
                                     "\n",
                                     "def post_tool_use(tool_name, result):\n",
                                     "    if tool_name == \"crm_database_lookup\" and \"created_at_unix\" in result:\n",
                                     "        created_at = datetime.fromtimestamp(\n",
                                     "            result[\"created_at_unix\"],\n",
                                     "            tz=timezone.utc,\n",
                                     "        ).isoformat()\n",
                                     "        result[\"created_at\"] = created_at\n",
                                     "        del result[\"created_at_unix\"]\n",
                                     "\n",
                                     "    return result"
                                 ]
                  },
                  {
                      "cell_type":  "markdown",
                      "metadata":  {

                                   },
                      "source":  [
                                     "## 4. Forced Prerequisite Strategy\n",
                                     "\n",
                                     "Use forced tool selection to implement deterministic multi-step sequences.\n",
                                     "\n",
                                     "On the first turn, set:\n",
                                     "\n",
                                     "```json\n",
                                     "{\n",
                                     "  \"tool_choice\": {\n",
                                     "    \"type\": \"tool\",\n",
                                     "    \"name\": \"get_customer\"\n",
                                     "  }\n",
                                     "}\n",
                                     "```\n",
                                     "\n",
                                     "This makes the identification step mandatory. On subsequent turns, switch back to `\"auto\"` or `\"any\"` so the agent can reason flexibly after the prerequisite evidence exists.\n",
                                     "\n",
                                     "Forced selection and hooks work together:\n",
                                     "\n",
                                     "- `tool_choice` ensures the first step is attempted.\n",
                                     "- `PreToolUse` gates ensure unsafe downstream tools cannot execute without verified prerequisite state.\n",
                                     "- Structured errors tell the model whether to retry, clarify, inform, or escalate.\n",
                                     "- `PostToolUse` normalization ensures the model reasons over stable result shapes.\n",
                                     "\n",
                                     "## Lab Exercise: The Defense-in-Depth Refund Workflow\n",
                                     "\n",
                                     "**Objective:** combine `tool_choice`, PreToolUse hooks, structured errors, and PostToolUse normalization to build a safe financial agent.\n",
                                     "\n",
                                     "1. **Prerequisite forcing:** use `tool_choice` to force the agent to call an `identity_verification` tool on its first turn.\n",
                                     "2. **Safety hook:** implement a PreToolUse hook that blocks `issue_refund` when `amount_usd` is over `$500`, even if the model\u0027s reasoning says it is authorized. Redirect the agent to `escalate_to_human`.\n",
                                     "3. **Structured error handling:** create a tool handler for a \"System Offline\" error that returns `isError: true`, `errorCategory: TRANSIENT`, and `isRetryable: true`; observe the agent attempting a recovery retry.\n",
                                     "4. **Normalization:** implement a PostToolUse hook that converts a Unix timestamp from an MCP database tool into a human-readable ISO string.\n",
                                     "5. **Validation vs. access:** test a \"User Not Found\" response as `isError: false` because it is a valid empty result, then compare it with an actual \"Access Denied\" error marked as `PERMISSION`.\n",
                                     "\n",
                                     "\u003e **Exam tip:** errors talk to the model; hooks talk to the runtime. Use structured errors to guide recovery, but use hooks and gates to enforce policy."
                                 ]
                  }
              ],
    "metadata":  {
                     "kernelspec":  {
                                        "display_name":  "Python 3",
                                        "language":  "python",
                                        "name":  "python3"
                                    },
                     "language_info":  {
                                           "codemirror_mode":  {
                                                                   "name":  "ipython",
                                                                   "version":  3
                                                               },
                                           "file_extension":  ".py",
                                           "mimetype":  "text/x-python",
                                           "name":  "python",
                                           "nbconvert_exporter":  "python",
                                           "pygments_lexer":  "ipython3",
                                           "version":  "3.11.0"
                                       }
                 },
    "nbformat":  4,
    "nbformat_minor":  5
}
