{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Final Project — Enterprise Prospecting & Outreach Orchestrator\n",
    "\n",
    "Capstone that ties Modules 1–10 into one production system:\n",
    "\n",
    "- **Phase 1** — Coordinator Agent with adaptive thinking, US data residency, and `effort=\"xhigh\"`.\n",
    "- **Phase 2** — Knowledge mapping with the MCP Connector and `.mcp.json` env-var substitution.\n",
    "- **Phase 3** — Hub-and-spoke subagent research with structured error responses.\n",
    "- **Phase 4** — Synthesis with the Advisor tool, Message Batches at 50% off, 300k extended output, structured extraction.\n",
    "- **Phase 5** — Reliability via PreToolUse hooks, server-side compaction, and the case-facts pattern.\n",
    "- **Phase 6** — Repository governance with path-scoped rules and a forked-context audit skill.\n",
    "\n",
    "**How to use this notebook**\n",
    "\n",
    "1. Run **Setup**.\n",
    "2. Work through **Code Starters** — fill in the `TODO` bodies yourself.\n",
    "3. Compare to the **Answer Key** sections that follow.\n",
    "\n",
    "> Some calls in this notebook hit beta endpoints (`mcp-client-2025-11-20`, `compact-2026-01-12`, `output-300k-2026-03-24`, `advisor-tool-2026-03-01`). Make sure your account has access before running a cell against the live API."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setup"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import time\n",
    "import json\n",
    "\n",
    "import anthropic\n",
    "from dotenv import load_dotenv\n",
    "\n",
    "load_dotenv()\n",
    "client = anthropic.Anthropic()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "# Code Starters\n",
    "\n",
    "Skeletons for the four core building blocks of the orchestrator. Fill in each `TODO` before peeking at the answer key. Each starter function raises `NotImplementedError` when called, so it's obvious which pieces are still stubs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# PHASE 1: Coordinator Agent Definition\n",
    "def create_coordinator():\n",
    "    \"\"\"Create a Managed Agent with client.beta.agents.create.\n",
    "\n",
    "    Required configuration:\n",
    "      - model=\"claude-opus-4-7\"\n",
    "      - description and system prompt suited to orchestrating prospecting work\n",
    "      - agent_toolset_20260401 with web_search and web_fetch enabled\n",
    "    Per-call requirements (you'll set these when you invoke the agent, not here):\n",
    "      - thinking={\"type\": \"adaptive\"}\n",
    "      - effort=\"xhigh\"   # Opus 4.7 only\n",
    "      - inference_geo=\"us\"  # 1.1x pricing multiplier\n",
    "    \"\"\"\n",
    "    raise NotImplementedError(\"TODO: implement create_coordinator\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# PHASES 2 & 3: Hub-and-Spoke Orchestration with MCP\n",
    "def run_research_session(agent_id, prospect_query):\n",
    "    \"\"\"Open a stateful session and dispatch a coordinator turn that spawns\n",
    "    parallel research subagents in a single message.\n",
    "\n",
    "    Required:\n",
    "      - client.beta.sessions.create with mcp_servers describing the CRM endpoint\n",
    "      - authorization_token sourced from os.environ[\"CRM_TOKEN\"]\n",
    "      - one client.beta.sessions.events.send with a user.message that instructs\n",
    "        the coordinator to call the Task tool 3+ times in parallel\n",
    "    \"\"\"\n",
    "    raise NotImplementedError(\"TODO: implement run_research_session\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# PHASE 4: Synthesis & Scaled Outreach (Message Batches)\n",
    "def generate_batch_campaign(prospect_data_list):\n",
    "    \"\"\"Submit a Message Batches job for high-volume outreach.\n",
    "\n",
    "    Required:\n",
    "      - betas=[\"output-300k-2026-03-24\"] for extended output\n",
    "      - one Request per prospect with a unique custom_id\n",
    "      - tools list includes advisor_20260301 with model=\"claude-opus-4-7\"\n",
    "      - output_config with a JSON schema that uses nullable fields for\n",
    "        annual_revenue and budget_range to prevent hallucination\n",
    "    \"\"\"\n",
    "    raise NotImplementedError(\"TODO: implement generate_batch_campaign\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# PHASE 5: Reliability & Compaction\n",
    "def configure_reliability(messages):\n",
    "    \"\"\"Build a messages.create payload that includes:\n",
    "      - thinking={\"type\": \"adaptive\"}\n",
    "      - effort=\"xhigh\"\n",
    "      - inference_geo=\"us\"\n",
    "      - context_management with compact_20260112 and trigger.input_tokens=50000\n",
    "      - the compact-2026-01-12 beta header\n",
    "      - a system prompt with a cache_control breakpoint at the end\n",
    "    \"\"\"\n",
    "    raise NotImplementedError(\"TODO: implement configure_reliability\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "# Answer Key\n",
    "\n",
    "Reference implementation. Each section corresponds to one phase of the project."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Phase 1 — Agent Definition & Data Governance\n",
    "\n",
    "The Managed Agent bundles the static config (model, system, tools). The dynamic per-call parameters — adaptive thinking, US data residency, `effort=\"xhigh\"` — are set when you invoke the agent later, not at agent creation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "coordinator_agent = client.beta.agents.create(\n",
    "    name=\"Enterprise Research Coordinator\",\n",
    "    model=\"claude-opus-4-7\",\n",
    "    description=\"Orchestrates prospecting research and outreach strategy.\",\n",
    "    system=(\n",
    "        \"You are the lead orchestrator. Use the Task tool to delegate to \"\n",
    "        \"specialist subagents. Always maintain a 'case facts' block with \"\n",
    "        \"non-negotiable transactional data.\"\n",
    "    ),\n",
    "    tools=[\n",
    "        {\n",
    "            \"type\": \"agent_toolset_20260401\",\n",
    "            \"configs\": [\n",
    "                {\"name\": \"web_search\", \"enabled\": True, \"permission_policy\": {\"type\": \"always_allow\"}},\n",
    "                {\"name\": \"web_fetch\",  \"enabled\": True, \"permission_policy\": {\"type\": \"always_allow\"}},\n",
    "            ],\n",
    "        }\n",
    "    ],\n",
    ")\n",
    "\n",
    "print(\"agent id:\", coordinator_agent.id, \"version:\", coordinator_agent.version)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Phase 2 — Knowledge Mapping with MCP & Resources\n",
    "\n",
    "Scaffold a `.mcp.json.example` template for local development, with `${VAR}` placeholders that are filled from environment variables when the file is materialized. **Never commit the rendered file**: commit only the template, then materialize it at deploy time."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "\n",
    "mcp_template = '''{\n",
    "  \"mcpServers\": {\n",
    "    \"crm\": {\n",
    "      \"command\": \"node\",\n",
    "      \"args\": [\"./mcp-servers/crm-server.js\"],\n",
    "      \"env\": {\n",
    "        \"CRM_TOKEN\": \"${CRM_TOKEN}\",\n",
    "        \"CRM_BASE_URL\": \"${CRM_BASE_URL}\"\n",
    "      }\n",
    "    }\n",
    "  }\n",
    "}\n",
    "'''\n",
    "\n",
    "Path(\"module11_demo\").mkdir(exist_ok=True)\n",
    "Path(\"module11_demo/.mcp.json.example\").write_text(mcp_template, encoding=\"utf-8\")\n",
    "print(\"wrote module11_demo/.mcp.json.example — commit this; never the resolved file\")"
   ]
  },
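  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The template above leaves its `${VAR}` placeholders unresolved. A minimal sketch of a substitutor follows, assuming `${VAR}` is the only placeholder syntax in use and that a missing variable should fail the deploy rather than render an empty string. `render_mcp_config` is a hypothetical helper name, not part of any SDK."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "import os\n",
    "import re\n",
    "\n",
    "_PLACEHOLDER = re.compile(r\"\\$\\{([A-Za-z_][A-Za-z0-9_]*)\\}\")\n",
    "\n",
    "def render_mcp_config(template_text, env=None):\n",
    "    \"\"\"Replace ${VAR} placeholders with values from env (default: os.environ).\n",
    "\n",
    "    Hypothetical helper for illustration. Raises KeyError naming every\n",
    "    missing variable so a half-rendered config is never produced.\n",
    "    \"\"\"\n",
    "    env = os.environ if env is None else env\n",
    "    missing = sorted({name for name in _PLACEHOLDER.findall(template_text) if name not in env})\n",
    "    if missing:\n",
    "        raise KeyError(f\"missing environment variables: {', '.join(missing)}\")\n",
    "    # json.dumps(...)[1:-1] escapes quotes and backslashes so the result stays valid JSON.\n",
    "    rendered = _PLACEHOLDER.sub(lambda m: json.dumps(env[m.group(1)])[1:-1], template_text)\n",
    "    return json.loads(rendered)  # parsing confirms the rendered text is well-formed\n",
    "\n",
    "demo = render_mcp_config(\n",
    "    '{\"env\": {\"CRM_TOKEN\": \"${CRM_TOKEN}\"}}',\n",
    "    env={\"CRM_TOKEN\": \"demo-token\"},\n",
    ")\n",
    "print(demo)"
   ]
  },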
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Phase 3 — Multi-Agent Research Workflow\n",
    "\n",
    "Open a stateful session, attach the CRM via MCP, and instruct the Coordinator to spawn three Task subagents in a single turn. Subagents run with **isolated context** — pass them only what they need.\n",
    "\n",
    "Then define the structured-error contract every subagent tool must follow."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "session = client.beta.sessions.create(\n",
    "    agent_id=coordinator_agent.id,\n",
    "    mcp_servers=[\n",
    "        {\n",
    "            \"name\": \"crm-server\",\n",
    "            \"type\": \"url\",\n",
    "            \"url\": \"https://mcp.your-enterprise.com/crm\",\n",
    "            \"authorization_token\": os.environ.get(\"CRM_TOKEN\", \"DEMO_NOT_SET\"),\n",
    "        }\n",
    "    ],\n",
    ")\n",
    "\n",
    "client.beta.sessions.events.send(\n",
    "    session_id=session.id,\n",
    "    event={\n",
    "        \"type\": \"user.message\",\n",
    "        \"content\": [{\n",
    "            \"type\": \"text\",\n",
    "            \"text\": (\n",
    "                \"Research AI adoption in Fintech and Healthcare simultaneously. \"\n",
    "                \"Spawn three Task subagents in parallel: web_research, \"\n",
    "                \"financial_analysis, competitor_tracking. Pass each only the \"\n",
    "                \"prospect summary they need. Reconcile their outputs into a \"\n",
    "                \"single brief and update the case_facts block before responding.\"\n",
    "            ),\n",
    "        }],\n",
    "    },\n",
    ")\n",
    "\n",
    "print(\"session id:\", session.id)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def subagent_tool_handler(tool_input):\n",
    "    \"\"\"Every subagent tool returns either {data: ...} on success or a structured\n",
    "    error envelope. The Coordinator inspects errorCategory + isRetryable to\n",
    "    decide whether to retry, pivot, or escalate.\n",
    "    \"\"\"\n",
    "    try:\n",
    "        result = run_research(tool_input)  # noqa: F821 (your real implementation)\n",
    "        return {\"data\": result}\n",
    "    except TimeoutError:\n",
    "        return {\n",
    "            \"errorCategory\": \"API_TIMEOUT\",\n",
    "            \"isRetryable\": True,\n",
    "            \"message\": \"Research API timed out. Retry with a narrower query.\",\n",
    "        }\n",
    "    except PermissionError:\n",
    "        return {\n",
    "            \"errorCategory\": \"INVALID_PERMISSIONS\",\n",
    "            \"isRetryable\": False,\n",
    "            \"message\": \"Token lacks access. Coordinator should pivot, not retry.\",\n",
    "        }\n",
    "\n",
    "print(\"subagent_tool_handler defined.\")"
   ]
  },
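  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "On the Coordinator side, the envelope can drive a small decision function. The sketch below is one possible policy, not a prescribed one: the action names (`accept`, `retry`, `pivot`, `escalate`) and the `max_attempts` default are assumptions made for illustration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def next_action(envelope, attempt, max_attempts=3):\n",
    "    \"\"\"Map a subagent result envelope to 'accept', 'retry', 'pivot', or 'escalate'.\n",
    "\n",
    "    Hypothetical policy: attempt is the 1-based number of tries made so far.\n",
    "    \"\"\"\n",
    "    if \"data\" in envelope:\n",
    "        return \"accept\"\n",
    "    if envelope.get(\"isRetryable\"):\n",
    "        # Retry transient failures, but stop after a bounded number of attempts.\n",
    "        return \"retry\" if attempt < max_attempts else \"escalate\"\n",
    "    if envelope.get(\"errorCategory\") == \"INVALID_PERMISSIONS\":\n",
    "        return \"pivot\"      # the same token cannot succeed on a second try\n",
    "    return \"escalate\"       # unknown, non-retryable failure\n",
    "\n",
    "timeout = {\"errorCategory\": \"API_TIMEOUT\", \"isRetryable\": True, \"message\": \"timed out\"}\n",
    "denied = {\"errorCategory\": \"INVALID_PERMISSIONS\", \"isRetryable\": False, \"message\": \"no access\"}\n",
    "print(next_action(timeout, attempt=1), next_action(timeout, attempt=3), next_action(denied, attempt=1))"
   ]
  },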
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Phase 4 — Synthesis & Scaled Outreach\n",
    "\n",
    "Pair the Sonnet 4.6 executor with the Opus 4.7 advisor inside one batch request. Use `output-300k-2026-03-24` for book-length intelligence reports, and constrain the final output to a JSON schema with nullable fields so missing data comes back as `null` instead of an invented placeholder."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from anthropic.types.message_create_params import MessageCreateParamsNonStreaming\n",
    "from anthropic.types.messages.batch_create_params import Request\n",
    "\n",
    "prospect_schema = {\n",
    "    \"type\": \"object\",\n",
    "    \"properties\": {\n",
    "        \"company_name\":   {\"type\": \"string\"},\n",
    "        \"contact_email\":  {\"type\": \"string\"},\n",
    "        \"annual_revenue\": {\"type\": [\"integer\", \"null\"]},  # nullable: may be unknown\n",
    "        \"budget_range\":   {\"type\": [\"string\", \"null\"]},   # nullable: may be unknown\n",
    "    },\n",
    "    \"required\": [\"company_name\", \"contact_email\", \"annual_revenue\", \"budget_range\"],\n",
    "    \"additionalProperties\": False,\n",
    "}\n",
    "\n",
    "message_batch = client.beta.messages.batches.create(\n",
    "    betas=[\"output-300k-2026-03-24\"],\n",
    "    requests=[\n",
    "        Request(\n",
    "            custom_id=\"outreach-campaign-001\",\n",
    "            params=MessageCreateParamsNonStreaming(\n",
    "                model=\"claude-sonnet-4-6\",\n",
    "                max_tokens=300000,\n",
    "                messages=[{\n",
    "                    \"role\": \"user\",\n",
    "                    \"content\": \"Generate a 100-page market intelligence report for Fintech CTOs.\",\n",
    "                }],\n",
    "                tools=[{\n",
    "                    \"type\": \"advisor_20260301\",\n",
    "                    \"name\": \"advisor\",\n",
    "                    \"model\": \"claude-opus-4-7\",\n",
    "                    \"effort\": \"xhigh\",\n",
    "                    \"caching\": {\"type\": \"ephemeral\", \"ttl\": \"1h\"},\n",
    "                }],\n",
    "                output_config={\"format\": {\"type\": \"json_schema\", \"schema\": prospect_schema}},\n",
    "            ),\n",
    "        )\n",
    "    ],\n",
    ")\n",
    "\n",
    "print(\"batch id:\", message_batch.id)  # always begins with msgbatch_"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Poll until the batch ends, then map results back via custom_id.\n",
    "while True:\n",
    "    snap = client.beta.messages.batches.retrieve(message_batch.id)  # same beta namespace used to create the batch\n",
    "    if snap.processing_status == \"ended\":\n",
    "        break\n",
    "    print(\"in progress:\", snap.request_counts)\n",
    "    time.sleep(60)\n",
    "\n",
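    "for result in client.beta.messages.batches.results(message_batch.id):\n",
    "    if result.result.type == \"succeeded\":\n",
    "        # Take the first text block; thinking or tool-use blocks may precede it.\n",
    "        body = next(b.text for b in result.result.message.content if b.type == \"text\")\n",
    "        record = json.loads(body)        # schema-conformant unless the output was truncated or refused\n",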
    "        print(result.custom_id, \"->\", record)\n",
    "    else:\n",
    "        print(result.custom_id, \"->\", result.result.type)"
   ]
  },
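  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The batch above carries a single hard-coded request. For a real campaign, the starter asks for one request per prospect with a unique `custom_id`. The sketch below builds those requests as plain dicts without calling the API. It assumes each prospect is a dict with a `company_name` key; `build_batch_requests`, the prompt text, and the `max_tokens` value are illustrative choices. IDs are kept short and limited to letters, digits, hyphens, and underscores as a conservative convention."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import re\n",
    "\n",
    "def build_batch_requests(prospects, schema, model=\"claude-sonnet-4-6\"):\n",
    "    \"\"\"Return one plain-dict batch request per prospect, each with a unique custom_id.\"\"\"\n",
    "    requests = []\n",
    "    for index, prospect in enumerate(prospects):\n",
    "        name = prospect[\"company_name\"]  # assumed key, see the note above\n",
    "        slug = re.sub(r\"[^a-zA-Z0-9_-]+\", \"-\", name).strip(\"-\").lower()\n",
    "        # The zero-padded index keeps IDs unique even when two names collide.\n",
    "        custom_id = f\"outreach-{index:04d}-{slug}\"[:64]\n",
    "        requests.append({\n",
    "            \"custom_id\": custom_id,\n",
    "            \"params\": {\n",
    "                \"model\": model,\n",
    "                \"max_tokens\": 2048,\n",
    "                \"messages\": [{\n",
    "                    \"role\": \"user\",\n",
    "                    \"content\": f\"Extract prospect facts for {name}. Use null for anything unknown.\",\n",
    "                }],\n",
    "                \"output_config\": {\"format\": {\"type\": \"json_schema\", \"schema\": schema}},\n",
    "            },\n",
    "        })\n",
    "    return requests\n",
    "\n",
    "demo_schema = {\n",
    "    \"type\": \"object\",\n",
    "    \"properties\": {\"company_name\": {\"type\": \"string\"}},\n",
    "    \"required\": [\"company_name\"],\n",
    "    \"additionalProperties\": False,\n",
    "}\n",
    "demo_requests = build_batch_requests(\n",
    "    [{\"company_name\": \"Acme Health\"}, {\"company_name\": \"Acme Health\"}],\n",
    "    demo_schema,\n",
    ")\n",
    "print([request[\"custom_id\"] for request in demo_requests])"
   ]
  },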
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Phase 5 — Production Reliability & Context Hygiene\n",
    "\n",
    "Server-side compaction at 50k tokens, with explicit instructions to preserve case facts verbatim. Place a `cache_control` breakpoint at the end of the system prompt so the prompt cache survives compaction.\n",
    "\n",
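    "After the call, count tokens on the next outgoing payload to check the effective context size. The demo conversation is far below the 50k trigger, so compaction will not fire here; expect a smaller count only once a real history has crossed the threshold."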
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "system_prompt = [\n",
    "    {\n",
    "        \"type\": \"text\",\n",
    "        \"text\": (\n",
    "            \"You are the Enterprise Research Coordinator. Maintain a <case_facts> \"\n",
    "            \"block at the top of every turn with non-negotiable transactional data.\"\n",
    "        ),\n",
    "        \"cache_control\": {\"type\": \"ephemeral\"},   # cache breakpoint at end of system prompt\n",
    "    }\n",
    "]\n",
    "\n",
    "case_facts = {\n",
    "    \"client_legal_entity\": \"Acme Health Holdings, Inc.\",\n",
    "    \"max_budget_usd\": 50000,\n",
    "    \"agreed_launch_date\": \"2026-09-01\",\n",
    "    \"baa_signed\": True,\n",
    "    \"no_ai_contact_optout\": False,\n",
    "}\n",
    "case_facts_block = {\n",
    "    \"role\": \"user\",\n",
    "    \"content\": \"<case_facts>\\n\" + \"\\n\".join(f\"- {k}: {v}\" for k, v in case_facts.items()) + \"\\n</case_facts>\",\n",
    "}\n",
    "\n",
    "conversation = [\n",
    "    case_facts_block,\n",
    "    {\"role\": \"user\", \"content\": \"Summarize the latest research findings and propose next outreach steps.\"},\n",
    "]\n",
    "\n",
    "response = client.messages.create(\n",
    "    model=\"claude-opus-4-7\",\n",
    "    max_tokens=16000,\n",
    "    system=system_prompt,\n",
    "    thinking={\"type\": \"adaptive\"},\n",
    "    effort=\"xhigh\",\n",
    "    inference_geo=\"us\",                         # 1.1x multiplier; ZDR-eligible\n",
    "    context_management={\n",
    "        \"edits\": [\n",
    "            {\n",
    "                \"type\": \"compact_20260112\",\n",
    "                \"trigger\": {\"input_tokens\": 50000},\n",
    "                \"instructions\": \"Summarize research findings but preserve 'case facts' verbatim.\",\n",
    "            }\n",
    "        ]\n",
    "    },\n",
    "    extra_headers={\"anthropic-beta\": \"compact-2026-01-12\"},\n",
    "    messages=conversation,\n",
    ")\n",
    "\n",
    "print(\"stop_reason:\", response.stop_reason)\n",
    "print(\"input/output tokens:\", response.usage.input_tokens, \"/\", response.usage.output_tokens)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Token accounting after compaction: count what would be sent on the next turn.\n",
    "next_turn = conversation + [\n",
    "    {\"role\": \"assistant\", \"content\": response.content},\n",
    "    {\"role\": \"user\", \"content\": \"Draft outreach for the top 3 Healthcare prospects.\"},\n",
    "]\n",
    "\n",
    "size = client.messages.count_tokens(\n",
    "    model=\"claude-opus-4-7\",\n",
    "    system=system_prompt,\n",
    "    messages=next_turn,\n",
    ")\n",
    "print(\"effective context tokens after compaction:\", size.input_tokens)"
   ]
  },
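  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Token counts show that the context shrank, but not that the case facts survived. A minimal sketch of a verbatim check follows. It assumes you already hold the compacted summary as a plain string (how you obtain it is not shown here), and the helper names `render_case_facts` and `missing_case_facts` are hypothetical."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def render_case_facts(facts):\n",
    "    \"\"\"Render facts in the same '- key: value' form used for the <case_facts> block.\"\"\"\n",
    "    return [f\"- {key}: {value}\" for key, value in facts.items()]\n",
    "\n",
    "def missing_case_facts(facts, text):\n",
    "    \"\"\"Return the keys whose rendered line is not present verbatim as a line of text.\"\"\"\n",
    "    # Compare whole lines so that 50000 does not match inside 500000.\n",
    "    lines = {line.strip() for line in text.splitlines()}\n",
    "    return [key for key, line in zip(facts, render_case_facts(facts)) if line not in lines]\n",
    "\n",
    "demo_facts = {\"max_budget_usd\": 50000, \"agreed_launch_date\": \"2026-09-01\"}\n",
    "demo_summary = \"<case_facts>\\n- max_budget_usd: 50000\\n</case_facts>\\nResearch summary\"\n",
    "print(missing_case_facts(demo_facts, demo_summary))"
   ]
  },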
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Deterministic safety: PreToolUse hook\n",
    "\n",
    "Block any `process_outreach` call until the prospect's `risk_score` has been verified. This is config the Claude Code harness reads, not a runtime API call."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "hooks_config = '''{\n",
    "  \"hooks\": {\n",
    "    \"PreToolUse\": [\n",
    "      {\n",
    "        \"matcher\": \"process_outreach\",\n",
    "        \"hooks\": [\n",
    "          {\n",
    "            \"type\": \"command\",\n",
    "            \"command\": \"python verify_risk_score.py\"\n",
    "          }\n",
    "        ]\n",
    "      }\n",
    "    ]\n",
    "  }\n",
    "}\n",
    "'''\n",
    "\n",
    "Path(\"module11_demo/hooks.json\").write_text(hooks_config, encoding=\"utf-8\")\n",
    "print(\"wrote module11_demo/hooks.json\")"
   ]
  },
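  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The config above points at `verify_risk_score.py`, which the notebook does not define. A minimal sketch of its core logic follows. To my understanding, Claude Code passes the pending tool call to a hook as JSON on stdin and treats exit code 2 as a block, with stderr fed back to the model; check the hooks reference for your version. The `risk_score` and `risk_verified` input fields and the 0.7 cutoff are assumptions made for illustration. In the real script file you would call `main()` at the bottom; it is left uncalled here so the cell does not wait on stdin."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "import sys\n",
    "\n",
    "def evaluate(payload, max_risk=0.7):\n",
    "    \"\"\"Return (exit_code, message) for a PreToolUse payload.\n",
    "\n",
    "    Assumed shape: payload[\"tool_input\"] holds a numeric risk_score and a\n",
    "    boolean risk_verified flag. Exit code 2 means block the tool call.\n",
    "    \"\"\"\n",
    "    tool_input = payload.get(\"tool_input\", {})\n",
    "    score = tool_input.get(\"risk_score\")\n",
    "    if not tool_input.get(\"risk_verified\") or not isinstance(score, (int, float)):\n",
    "        return 2, \"Blocked: risk_score has not been verified for this prospect.\"\n",
    "    if score > max_risk:\n",
    "        return 2, f\"Blocked: risk_score {score} exceeds {max_risk}.\"\n",
    "    return 0, \"\"\n",
    "\n",
    "def main():\n",
    "    \"\"\"Script entry point: read the hook payload from stdin and exit accordingly.\"\"\"\n",
    "    code, message = evaluate(json.load(sys.stdin))\n",
    "    if message:\n",
    "        print(message, file=sys.stderr)\n",
    "    sys.exit(code)\n",
    "\n",
    "print(evaluate({\"tool_name\": \"process_outreach\", \"tool_input\": {\"risk_score\": 0.2, \"risk_verified\": True}}))\n",
    "print(evaluate({\"tool_name\": \"process_outreach\", \"tool_input\": {}}))"
   ]
  },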
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Phase 6 — Claude Code & Repository Governance\n",
    "\n",
    "Two artifacts go into the repo so every developer (and Claude itself) follows the same standards: a path-scoped rule for outreach files, and a forked-context audit skill."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "rules_dir = Path(\"module11_demo/.claude/rules\")\n",
    "rules_dir.mkdir(parents=True, exist_ok=True)\n",
    "\n",
    "(rules_dir / \"outreach-template.md\").write_text(\n",
    "    \"\"\"---\n",
    "paths: [\"outreach/**/*.md\", \"outreach/**/*.html\"]\n",
    "---\n",
    "# Standard Outreach Template Rules\n",
    "- Open with the prospect's confirmed pain point from CRM, never a generic hook.\n",
    "- Include one quantified case study from the same industry vertical.\n",
    "- Close with a single, specific call to action (no menu of options).\n",
    "- Never reference compliance regimes the prospect has not opted into.\n",
    "\"\"\",\n",
    "    encoding=\"utf-8\",\n",
    ")\n",
    "\n",
    "skill_dir = Path(\"module11_demo/.claude/skills/generate-audit\")\n",
    "skill_dir.mkdir(parents=True, exist_ok=True)\n",
    "\n",
    "(skill_dir / \"SKILL.md\").write_text(\n",
    "    \"\"\"---\n",
    "name: generate-audit\n",
    "description: Compliance audit on generated outreach emails\n",
    "context: fork\n",
    "allowed-tools: Read, WebSearch\n",
    "---\n",
    "# Audit Skill\n",
    "1. Read the email under review.\n",
    "2. Cross-check claims against the prospect's stated regulated regimes.\n",
    "3. Flag any unsupported compliance claims (HIPAA, SOC 2, ISO 27001, etc.).\n",
    "4. Return a structured JSON report with findings and severity.\n",
    "\"\"\",\n",
    "    encoding=\"utf-8\",\n",
    ")\n",
    "\n",
    "print(\"wrote\", rules_dir / \"outreach-template.md\")\n",
    "print(\"wrote\", skill_dir / \"SKILL.md\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "## Final Audit Checklist\n",
    "\n",
    "- **ZDR check:** adaptive thinking, citations, structured outputs, and standard web search are ZDR-eligible. Message Batches and the MCP Connector are **not** — batch results sit server-side for 29 days.\n",
    "- **Cost check:** `inference_geo: \"us\"` adds a 1.1× multiplier on Opus 4.6+ (input, output, and cache); same multiplier hits Priority-Tier burndown.\n",
    "- **Token management:** call `client.messages.count_tokens` on the next outgoing payload after compaction fires to verify the effective context size shrank.\n",
    "- **Citations vs. structured outputs:** mutually exclusive. Sending both returns a 400.\n",
    "- **Subagents:** isolated context by default — the Coordinator must explicitly forward only the data each subagent needs."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python",
   "version": "3.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
