{
    "cells":  [
                  {
                      "cell_type":  "markdown",
                      "metadata":  {

                                   },
                      "source":  [
                                     "*Module 8 Self-Driven Lab*\r\n",
                                     "\r\n",
                                     "# Prompt Engineering for Precision: Criteria, Constraints, and Structured Output\r\n",
                                     "\r\n",
                                     "**Objective:** master categorical criteria, union-type constraints, prerequisite forcing, and validation loops.\r\n",
                                     "\r\n",
                                     "## Challenge Outline\r\n",
                                     "\r\n",
                                     "Build a complete notebook that demonstrates the following outcomes:\r\n",
                                     "\r\n",
                                     "- **Explicit classification:** create a Lead Quality classification tool using numeric thresholds such as `Revenue \u003e $10M`; include one-line examples for each category.\r\n",
                                     "- **Few-shot generalization:** provide 2-4 extraction examples. Each example must include a `Why:` line explaining why specific data was mapped to a field.\r\n",
                                     "- **Union-type constraints:** design a Prospect Profile schema with 5 nullable fields and verify missing data returns `null`, not placeholder text.\r\n",
                                     "- **Extensible categorization:** design a schema where `category` is an enum that includes `other`; require a separate `category_detail` string when `other` is selected.\r\n",
                                     "- **Prerequisite forcing:** use `tool_choice: {\"type\": \"tool\", \"name\": \"extract_metadata\"}` to ensure metadata extraction runs before enrichment.\r\n",
                                     "- **Validation-retry loop:** check for a semantic error, such as line items not summing to a total. If validation fails, send a follow-up request with the original document and the specific validation error to guide self-correction.\r\n",
                                     "\r\n",
                                     "Your solution should include enough code, output, or written observations to prove each outcome worked. Keep the notebook focused on final behavior and evidence rather than a guided walkthrough.\n"
                                 ]
                  },
                  {
                      "cell_type":  "markdown",
                      "metadata":  {

                                   },
                      "source":  [
                                     "## Student Workspace\n",
                                     "\n",
                                     "Use the sections below to build your solution. Each section maps to one required outcome from the challenge outline.\n"
                                 ]
                  },
                  {
                      "cell_type":  "markdown",
                      "metadata":  {

                                   },
                      "source":  [
                                     "### Part 1: Explicit classification\n",
                                     "\n",
                                     "create a Lead Quality classification tool using numeric thresholds such as `Revenue \u003e $10M`; include one-line examples for each category.\n"
                                 ]
                  },
                  {
                      "cell_type":  "code",
                      "execution_count":  null,
                      "metadata":  {

                                   },
                      "outputs":  [

                                  ],
                      "source":  [
                                     "# Part 1: Explicit classification\n",
                                     "# Add your implementation, outputs, or notes here.\n"
                                 ]
                  },
                  {
                      "cell_type":  "markdown",
                      "metadata":  {

                                   },
                      "source":  [
                                     "### Part 2: Few-shot generalization\n",
                                     "\n",
                                     "provide 2-4 extraction examples. Each example must include a `Why:` line explaining why specific data was mapped to a field.\n"
                                 ]
                  },
                  {
                      "cell_type":  "code",
                      "execution_count":  null,
                      "metadata":  {

                                   },
                      "outputs":  [

                                  ],
                      "source":  [
                                     "# Part 2: Few-shot generalization\n",
                                     "# Add your implementation, outputs, or notes here.\n"
                                 ]
                  },
                  {
                      "cell_type":  "markdown",
                      "metadata":  {

                                   },
                      "source":  [
                                     "### Part 3: Union-type constraints\n",
                                     "\n",
                                     "design a Prospect Profile schema with 5 nullable fields and verify missing data returns `null`, not placeholder text.\n"
                                 ]
                  },
                  {
                      "cell_type":  "code",
                      "execution_count":  null,
                      "metadata":  {

                                   },
                      "outputs":  [

                                  ],
                      "source":  [
                                     "# Part 3: Union-type constraints\n",
                                     "# Add your implementation, outputs, or notes here.\n"
                                 ]
                  },
                  {
                      "cell_type":  "markdown",
                      "metadata":  {

                                   },
                      "source":  [
                                     "### Part 4: Extensible categorization\n",
                                     "\n",
                                     "design a schema where `category` is an enum that includes `other`; require a separate `category_detail` string when `other` is selected.\n"
                                 ]
                  },
                  {
                      "cell_type":  "code",
                      "execution_count":  null,
                      "metadata":  {

                                   },
                      "outputs":  [

                                  ],
                      "source":  [
                                     "# Part 4: Extensible categorization\n",
                                     "# Add your implementation, outputs, or notes here.\n"
                                 ]
                  },
                  {
                      "cell_type":  "markdown",
                      "metadata":  {

                                   },
                      "source":  [
                                     "### Part 5: Prerequisite forcing\n",
                                     "\n",
                                     "use `tool_choice: {\"type\": \"tool\", \"name\": \"extract_metadata\"}` to ensure metadata extraction runs before enrichment.\n"
                                 ]
                  },
                  {
                      "cell_type":  "code",
                      "execution_count":  null,
                      "metadata":  {

                                   },
                      "outputs":  [

                                  ],
                      "source":  [
                                     "# Part 5: Prerequisite forcing\n",
                                     "# Add your implementation, outputs, or notes here.\n"
                                 ]
                  },
                  {
                      "cell_type":  "markdown",
                      "metadata":  {

                                   },
                      "source":  [
                                     "### Part 6: Validation-retry loop\n",
                                     "\n",
                                     "check for a semantic error, such as line items not summing to a total. If validation fails, send a follow-up request with the original document and the specific validation error to guide self-correction.\n"
                                 ]
                  },
                  {
                      "cell_type":  "code",
                      "execution_count":  null,
                      "metadata":  {

                                   },
                      "outputs":  [

                                  ],
                      "source":  [
                                     "# Part 6: Validation-retry loop\n",
                                     "# Add your implementation, outputs, or notes here.\n"
                                 ]
                  },
                  {
                      "cell_type":  "markdown",
                      "metadata":  {

                                   },
                      "source":  [
                                     "### Verification Notes\n",
                                     "\n",
                                     "Summarize the evidence that each part worked. Capture API signals, validation outcomes, errors, recovery behavior, cost observations, or comparisons required by this lab.\n"
                                 ]
                  },
                  {
                      "cell_type":  "code",
                      "execution_count":  null,
                      "metadata":  {

                                   },
                      "outputs":  [

                                  ],
                      "source":  [
                                     "# Verification notes\n",
                                     "# Record the evidence that proves each lab outcome worked.\n"
                                 ]
                  },
                  {
                      "cell_type":  "markdown",
                      "metadata":  {

                                   },
                      "source":  [
                                     "---\n",
                                     "\n",
                                     "## Answer Key\n",
                                     "\n",
                                     "The cells below contain the completed reference implementation/content for this module. Use this section only after attempting the self-driven lab."
                                 ]
                  },
                  {
                      "cell_type":  "markdown",
                      "metadata":  {

                                   },
                      "source":  [
                                     "*Module 8 of 12*\n",
                                     "\n",
                                     "# Prompt Engineering for Precision: Criteria, Constraints, and Structured Output\n",
                                     "\n",
                                     "This module provides the advanced prompting techniques required for production-grade extraction and classification, ensuring that Claude\u0027s routing and output match a target schema 100% of the time. The goal is to move from fuzzy instructions to deterministic precision: explicit criteria, few-shot reasoning, strict schema constraints, hallucination-resistant nullable fields, and validation-retry loops.\n",
                                     "\n",
                                     "\n",
                                     "## 1. Explicit Categorical Criteria vs. Vague Instructions\n",
                                     "\n",
                                     "Vague instructions like \"be conservative\" or \"only report high-confidence findings\" fail because they do not define objective thresholds. Replace adjectives with numeric or behavioral criteria.\n",
                                     "\n",
                                     "*anti-pattern*\n",
                                     "```text\n",
                                     "Classify lead quality as low, medium, or high. Be conservative.\n",
                                     "```\n",
                                     "\n",
                                     "*precise criteria*\n",
                                     "```text\n",
                                     "HIGH\n",
                                     "- Revenue is greater than $10M, AND\n",
                                     "- Company has an active AI initiative or open data/ML roles.\n",
                                     "Example: \"$25M revenue and hiring an ML platform lead.\"\n",
                                     "\n",
                                     "MEDIUM\n",
                                     "- Revenue is $2M-$10M, OR\n",
                                     "- Revenue is unknown but company has a clear AI adoption signal.\n",
                                     "Example: \"$6M revenue and recently published an AI case study.\"\n",
                                     "\n",
                                     "LOW\n",
                                     "- Revenue is below $2M, OR\n",
                                     "- No AI adoption signal is present.\n",
                                     "Example: \"$900k revenue and no technical hiring signal.\"\n",
                                     "\n",
                                     "If evidence supports two categories, choose the higher category.\n",
                                     "If revenue and AI signal are both missing, output needs_clarification.\n",
                                     "```\n",
                                     "\n",
                                     "### 1a. Behavioral Thresholds and Escape Hatches\n",
                                     "\n",
                                     "Behavioral thresholds make classification reproducible: \"production is down for all users\" is stronger than \"severe outage.\" Escape hatches like `needs_clarification` prevent the model from forcing a fuzzy input into a false category.\n",
                                     "\n",
                                     "## 2. Few-Shot Examples for Generalization\n",
                                     "\n",
                                     "Few-shot examples are the best way to handle ambiguous cases where two tools, fields, or categories look reasonable.\n",
                                     "\n",
                                     "### 2a. The Reasoning Column\n",
                                     "\n",
                                     "High-quality few-shot examples include a `Why:` line. The reasoning teaches the rule behind the pattern, so the model generalizes to novel cases rather than merely matching examples.\n",
                                     "\n",
                                     "*few-shot extraction examples*\n",
                                     "```text\n",
                                     "Example 1\n",
                                     "Input: \"Acme reports $14M ARR and is hiring a VP of AI Platform.\"\n",
                                     "Output: {\"lead_quality\": \"HIGH\", \"annual_revenue\": \"$14M\", \"ai_signal\": \"hiring VP of AI Platform\"}\n",
                                     "Why: Revenue is greater than $10M and the hiring signal confirms active AI investment.\n",
                                     "\n",
                                     "Example 2\n",
                                     "Input: \"BetaCo launched a chatbot pilot, but revenue is not disclosed.\"\n",
                                     "Output: {\"lead_quality\": \"MEDIUM\", \"annual_revenue\": null, \"ai_signal\": \"chatbot pilot\"}\n",
                                     "Why: Revenue is missing, but a clear AI adoption signal qualifies for MEDIUM rather than LOW.\n",
                                     "\n",
                                     "Example 3\n",
                                     "Input: \"Gamma LLC has $800k revenue and no AI-related hiring or projects.\"\n",
                                     "Output: {\"lead_quality\": \"LOW\", \"annual_revenue\": \"$800k\", \"ai_signal\": null}\n",
                                     "Why: Revenue is below $2M and there is no AI signal.\n",
                                     "```\n",
                                     "\n",
                                     "### 2b. Negative Triggers\n",
                                     "\n",
                                     "Include examples that distinguish acceptable patterns from genuine issues. Negative examples reduce false positives by teaching what *not* to flag or extract.\n",
                                     "\n",
                                     "## 3. Structured Output Pillars \u0026 Technical Constraints\n",
                                     "\n",
                                     "Structured outputs constrain Claude to a specific JSON schema, eliminating syntax errors such as trailing commas, missing fields, and surprise keys.\n",
                                     "\n",
                                     "- **JSON outputs (`output_config.format`):** constrain the final answer to a JSON schema.\n",
                                     "- **Strict tool use (`strict: true`):** guarantees tool inputs follow your schema exactly through grammar-constrained sampling.\n",
                                     "\n",
                                     "### 3a. Critical API Limits\n",
                                     "\n",
                                     "Design schemas within the hard limits for efficient grammar compilation:\n",
                                     "\n",
                                     "- Maximum **20** strict tools per request.\n",
                                     "- Maximum **24** optional parameters across all strict schemas in a single request.\n",
                                     "- Maximum **16** parameters using union types such as `anyOf` or `[\"string\", \"null\"]`.\n",
                                     "- Compiled grammars are cached for **24 hours** from last use.\n",
                                     "- Changing schema structure invalidates the grammar cache and reintroduces initial latency.\n",
                                     "\n",
                                     "## 4. Advanced Schema Design \u0026 Safety\n",
                                     "\n",
                                     "Safe schemas prevent hallucinations and API errors in multi-step workflows.\n",
                                     "\n",
                                     "### 4a. Hallucination Prevention with Nullable Fields\n",
                                     "\n",
                                     "Use nullable fields, such as `[\"string\", \"null\"]`, for information that may legitimately be missing from a source document. This lets the model return `null` instead of fabricating placeholder values to satisfy a required field.\n",
                                     "\n",
                                     "*Prospect Profile schema excerpt*\n",
                                     "```json\n",
                                     "{\n",
                                     "  \"type\": \"object\",\n",
                                     "  \"properties\": {\n",
                                     "    \"company_name\": {\"type\": \"string\"},\n",
                                     "    \"annual_revenue\": {\"type\": [\"string\", \"null\"]},\n",
                                     "    \"contact_email\": {\"type\": [\"string\", \"null\"]},\n",
                                     "    \"ai_signal\": {\"type\": [\"string\", \"null\"]},\n",
                                     "    \"budget_range\": {\"type\": [\"string\", \"null\"]},\n",
                                     "    \"decision_date\": {\"type\": [\"string\", \"null\"]}\n",
                                     "  },\n",
                                     "  \"required\": [\"company_name\", \"annual_revenue\", \"contact_email\", \"ai_signal\", \"budget_range\", \"decision_date\"],\n",
                                     "  \"additionalProperties\": false\n",
                                     "}\n",
                                     "```\n",
                                     "\n",
                                     "### 4b. The Incompatibility Rule\n",
                                     "\n",
                                     "Citations and Structured Outputs are fundamentally incompatible. Sending both in one request returns a 400 error because citations require interleaved citation blocks, which violate strict JSON schema constraints.\n",
                                     "\n",
                                     "### 4c. Extensible Categorization with `other`\n",
                                     "\n",
                                     "Enums are reliable, but production categories evolve. Use a bounded enum with an `other` escape hatch plus a required detail field so the system stays extensible without losing structure.\n",
                                     "\n",
                                     "```json\n",
                                     "{\n",
                                     "  \"type\": \"object\",\n",
                                     "  \"properties\": {\n",
                                     "    \"category\": {\n",
                                     "      \"type\": \"string\",\n",
                                     "      \"enum\": [\"billing\", \"technical\", \"security\", \"other\"]\n",
                                     "    },\n",
                                     "    \"category_detail\": {\n",
                                     "      \"type\": [\"string\", \"null\"],\n",
                                     "      \"description\": \"Required when category is other; otherwise null.\"\n",
                                     "    }\n",
                                     "  },\n",
                                     "  \"required\": [\"category\", \"category_detail\"],\n",
                                     "  \"additionalProperties\": false\n",
                                     "}\n",
                                     "```\n",
                                     "\n",
                                     "Validation rule: if `category` is `\"other\"`, `category_detail` must explain the specific category. If `category` is a known enum value, `category_detail` should be `null`.\n",
                                     "\n",
                                     "## Lab Exercise: Designing for Deterministic Extraction\n",
                                     "\n",
                                     "**Objective:** master categorical criteria, union-type constraints, prerequisite forcing, and validation loops.\n",
                                     "\n",
                                     "1. **Explicit classification:** create a Lead Quality classification tool using numeric thresholds such as `Revenue \u003e $10M`; include one-line examples for each category.\n",
                                     "2. **Few-shot generalization:** provide 2-4 extraction examples. Each example must include a `Why:` line explaining why specific data was mapped to a field.\n",
                                     "3. **Union-type constraints:** design a Prospect Profile schema with 5 nullable fields and verify missing data returns `null`, not placeholder text.\n",
                                     "4. **Extensible categorization:** design a schema where `category` is an enum that includes `other`; require a separate `category_detail` string when `other` is selected.\n",
                                     "5. **Prerequisite forcing:** use `tool_choice: {\"type\": \"tool\", \"name\": \"extract_metadata\"}` to ensure metadata extraction runs before enrichment.\n",
                                     "6. **Validation-retry loop:** check for a semantic error, such as line items not summing to a total. If validation fails, send a follow-up request with the original document and the specific validation error to guide self-correction.\n",
                                     "\n",
                                     "\u003e **Tip.** Architect Tip for the Exam\n",
                                     "Precision is not \"more prompt.\" It is objective criteria, examples with reasoning, schemas that allow legitimate missingness, strict tool constraints, and validation feedback that tells the model exactly what semantic invariant failed."
                                 ]
                  }
              ],
    "metadata":  {
                     "kernelspec":  {
                                        "display_name":  "Python 3",
                                        "language":  "python",
                                        "name":  "python3"
                                    },
                     "language_info":  {
                                           "codemirror_mode":  {
                                                                   "name":  "ipython",
                                                                   "version":  3
                                                               },
                                           "file_extension":  ".py",
                                           "mimetype":  "text/x-python",
                                           "name":  "python",
                                           "nbconvert_exporter":  "python",
                                           "pygments_lexer":  "ipython3",
                                           "version":  "3.11.0"
                                       }
                 },
    "nbformat":  4,
    "nbformat_minor":  5
}
