{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Module 10 — Advanced Reliability & Compaction\n",
    "\n",
    "Final module: keep long-running agents stable as the context window fills.\n",
    "\n",
    "- **Server-side compaction** (`compact_20260112`) replaces stale early history with a summary block once a token trigger fires.\n",
    "- **Case Facts pattern** keeps non-negotiable facts outside the compaction summary so they're never paraphrased away.\n",
    "- **Cache-System-Prompt pattern** preserves your prompt cache through compaction.\n",
    "- **Context-window awareness** lets you trigger compaction proactively at ~80% utilization.\n",
    "\n",
    "> **Beta header required:** `compact-2026-01-12`."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setup"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import anthropic\n",
    "from dotenv import load_dotenv\n",
    "\n",
    "load_dotenv()\n",
    "client = anthropic.Anthropic()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. Enable server-side compaction with cache-stable system prompt\n",
    "\n",
    "Place a `cache_control` breakpoint at the **end of the system prompt** so it stays cached even after compaction rewrites the message history. Set `pause_after_compaction=True` if you want to inspect or augment the summary before the model resumes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "system_prompt = [\n",
    "    {\n",
    "        \"type\": \"text\",\n",
    "        \"text\": \"You are an expert Marketing Strategist. Use thinking to ensure your plans are data-driven.\",\n",
    "        \"cache_control\": {\"type\": \"ephemeral\"},   # cache breakpoint at the end of system prompt\n",
    "    }\n",
    "]\n",
    "\n",
    "response = client.messages.create(\n",
    "    model=\"claude-sonnet-4-6\",\n",
    "    max_tokens=4096,\n",
    "    system=system_prompt,\n",
    "    context_management={\n",
    "        \"strategy\": \"compact_20260112\",\n",
    "        \"trigger\": 50000,                  # min trigger; raise for chattier sessions\n",
    "        \"pause_after_compaction\": False,\n",
    "    },\n",
    "    messages=[{\"role\": \"user\", \"content\": \"Continue the campaign analysis from yesterday's notes.\"}],\n",
    "    extra_headers={\"anthropic-beta\": \"compact-2026-01-12\"},\n",
    ")\n",
    "\n",
    "print(\"stop_reason:\", response.stop_reason)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Case Facts — preserve non-negotiable details\n",
    "\n",
    "Compaction summarizes — and summaries lose precise transactional details. Carve those out into a `case_facts` block placed **after** the compaction block. The cell below is a tiny store you update each turn with newly confirmed data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "case_facts = {\n",
    "    \"signed_contract_value_usd\": 125000,\n",
    "    \"client_legal_entity\": \"Acme Health Holdings, Inc.\",\n",
    "    \"agreed_launch_date\": \"2026-06-01\",\n",
    "    \"baa_signed\": True,\n",
    "}\n",
    "\n",
    "def case_facts_block(facts: dict) -> dict:\n",
    "    lines = [\"<case_facts>\"] + [f\"- {k}: {v}\" for k, v in facts.items()] + [\"</case_facts>\"]\n",
    "    return {\"role\": \"user\", \"content\": \"\\n\".join(lines)}\n",
    "\n",
    "print(case_facts_block(case_facts)[\"content\"])"
   ]
  },
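  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The placement described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the documented flow: `next_turn_with_facts` is a hypothetical helper, and the `compaction` block in the demo is a hand-written placeholder whose shape is assumed. In a real session you would pass `response.content` from the compacted turn instead. The facts and the new question go into a single user message so that roles keep alternating."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def next_turn_with_facts(history: list, assistant_content: list, facts: dict, user_text: str) -> list:\n",
    "    \"\"\"Hypothetical helper: append the assistant turn, then re-send the facts verbatim with the next question.\"\"\"\n",
    "    facts_text = case_facts_block(facts)[\"content\"]\n",
    "    return history + [\n",
    "        {\"role\": \"assistant\", \"content\": assistant_content},\n",
    "        {\n",
    "            \"role\": \"user\",\n",
    "            \"content\": [\n",
    "                {\"type\": \"text\", \"text\": facts_text},   # verbatim facts, placed after the compaction block\n",
    "                {\"type\": \"text\", \"text\": user_text},\n",
    "            ],\n",
    "        },\n",
    "    ]\n",
    "\n",
    "# Placeholder standing in for response.content after compaction (block shape is an assumption).\n",
    "demo_history = [{\"role\": \"user\", \"content\": \"Continue the campaign analysis from yesterday's notes.\"}]\n",
    "demo_content = [{\"type\": \"compaction\", \"content\": \"Summary of the earlier campaign analysis.\"}]\n",
    "demo_messages = next_turn_with_facts(demo_history, demo_content, case_facts, \"What budget remains for the launch?\")\n",
    "print([m[\"role\"] for m in demo_messages])"
   ]
  },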
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Proactive compaction at 80% utilization\n",
    "\n",
    "Sonnet 4.6 and Haiku 4.5 expose remaining context budget on the response. Trigger compaction yourself before you hit a hard limit, especially when adaptive thinking is on (thinking blocks now persist by default on Opus 4.5+ / Sonnet 4.6+ and accumulate fast)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "remaining = getattr(response.usage, \"context_window_remaining\", None)\n",
    "total = getattr(response.usage, \"context_window_total\", None)\n",
    "\n",
    "if remaining is not None and total:\n",
    "    used_pct = 100 * (1 - remaining / total)\n",
    "    print(f\"context used: {used_pct:.1f}% (remaining {remaining} / {total})\")\n",
    "    if used_pct >= 80:\n",
    "        print(\"-> at/over 80% — trigger compaction or hand off to a subagent.\")\n",
    "else:\n",
    "    print(\"context_window_remaining not exposed on this response; SDK/model may not surface it.\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python",
   "version": "3.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
