Module 8

Prompt Engineering for Precision: Criteria, Constraints, and Structured Output

This module covers the advanced prompting techniques needed for production-grade extraction and classification, so that Claude's tool routing is predictable and its output conforms to a target schema. The goal is to move from fuzzy instructions to reproducible precision: explicit criteria, few-shot reasoning, strict schema constraints, hallucination-resistant nullable fields, and validation-retry loops.

Answer key: Module8_Complete.ipynb

1. Explicit Categorical Criteria vs. Vague Instructions

Vague instructions like "be conservative" or "only report high-confidence findings" fail because they do not define objective thresholds. Replace adjectives with numeric or behavioral criteria.

Anti-pattern
Classify lead quality as low, medium, or high. Be conservative.

Precise criteria
HIGH
- Revenue is greater than $10M, AND
- Company has an active AI initiative or open data/ML roles.
Example: "$25M revenue and hiring an ML platform lead."

MEDIUM
- Revenue is $2M-$10M, OR
- Revenue is unknown but company has a clear AI adoption signal.
Example: "$6M revenue and recently published an AI case study."

LOW
- Revenue is below $2M, OR
- No AI adoption signal is present.
Example: "$900k revenue and no technical hiring signal."

If evidence supports two categories, choose the higher category.
If revenue and AI signal are both missing, output needs_clarification.

1a. Behavioral Thresholds and Escape Hatches

Behavioral thresholds make classification reproducible: "production is down for all users" is a condition anyone can check, whereas "severe outage" is a judgment call. Escape hatches like needs_clarification prevent the model from forcing a fuzzy input into a false category.
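
One way to keep the section 1 criteria honest is to mirror them in code and compare the result against the model's answer. The following is a minimal sketch, not part of the course notebooks: the function name classify_lead and its parameters are hypothetical, revenue is assumed to be already parsed into dollars, and None is assumed to mean "not found in the source."

```python
from typing import Optional


def classify_lead(revenue_usd: Optional[float], ai_signal: Optional[str]) -> str:
    """Apply the section 1 criteria. None means the value was not found."""
    has_signal = bool(ai_signal)

    # Escape hatch: with no revenue and no AI signal there is nothing to judge.
    if revenue_usd is None and not has_signal:
        return "needs_clarification"

    matches = set()
    # HIGH: revenue > $10M AND an AI signal.
    if revenue_usd is not None and revenue_usd > 10_000_000 and has_signal:
        matches.add("HIGH")
    # MEDIUM: revenue $2M-$10M, OR revenue unknown with an AI signal.
    if revenue_usd is not None and 2_000_000 <= revenue_usd <= 10_000_000:
        matches.add("MEDIUM")
    if revenue_usd is None and has_signal:
        matches.add("MEDIUM")
    # LOW: revenue below $2M, OR no AI signal.
    if revenue_usd is not None and revenue_usd < 2_000_000:
        matches.add("LOW")
    if not has_signal:
        matches.add("LOW")

    # Tie-break: if evidence supports two categories, choose the higher one.
    for level in ("HIGH", "MEDIUM", "LOW"):
        if level in matches:
            return level
    return "needs_clarification"
```

Writing the rules out this way also surfaces edge cases worth reviewing. For instance, under the criteria as written, a company with $25M revenue and no AI signal matches only LOW.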

2. Few-Shot Examples for Generalization

Few-shot examples are among the most effective ways to handle ambiguous cases where two tools, fields, or categories look reasonable.

2a. The Reasoning Column

High-quality few-shot examples include a Why: line. The reasoning teaches the rule behind the pattern, so the model generalizes to novel cases rather than merely matching examples.

Few-shot extraction examples
Example 1
Input: "Acme reports $14M ARR and is hiring a VP of AI Platform."
Output: {"lead_quality": "HIGH", "annual_revenue": "$14M", "ai_signal": "hiring VP of AI Platform"}
Why: Revenue is greater than $10M and the hiring signal confirms active AI investment.

Example 2
Input: "BetaCo launched a chatbot pilot, but revenue is not disclosed."
Output: {"lead_quality": "MEDIUM", "annual_revenue": null, "ai_signal": "chatbot pilot"}
Why: Revenue is missing, but a clear AI adoption signal qualifies for MEDIUM rather than LOW.

Example 3
Input: "Gamma LLC has $800k revenue and no AI-related hiring or projects."
Output: {"lead_quality": "LOW", "annual_revenue": "$800k", "ai_signal": null}
Why: Revenue is below $2M and there is no AI signal.
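
Keeping examples as data rather than hand-typed text makes it harder to forget the Why: line. The sketch below renders examples in the Input / Output / Why layout shown above; the names EXAMPLES and render_few_shot are hypothetical, and the layout is simply the one used in this module, not a required format.

```python
import json

# Two of the examples above, stored as data so every entry carries a "why".
EXAMPLES = [
    {
        "input": "Acme reports $14M ARR and is hiring a VP of AI Platform.",
        "output": {"lead_quality": "HIGH", "annual_revenue": "$14M",
                   "ai_signal": "hiring VP of AI Platform"},
        "why": "Revenue is greater than $10M and the hiring signal "
               "confirms active AI investment.",
    },
    {
        "input": "BetaCo launched a chatbot pilot, but revenue is not disclosed.",
        "output": {"lead_quality": "MEDIUM", "annual_revenue": None,
                   "ai_signal": "chatbot pilot"},
        "why": "Revenue is missing, but a clear AI adoption signal "
               "qualifies for MEDIUM rather than LOW.",
    },
]


def render_few_shot(examples: list) -> str:
    """Render examples as Input / Output / Why blocks separated by blank lines."""
    blocks = []
    for number, example in enumerate(examples, start=1):
        blocks.append(
            f"Example {number}\n"
            f"Input: {json.dumps(example['input'])}\n"
            # json.dumps turns Python None into JSON null.
            f"Output: {json.dumps(example['output'])}\n"
            f"Why: {example['why']}"
        )
    return "\n\n".join(blocks)


if __name__ == "__main__":
    print(render_few_shot(EXAMPLES))
```

Accessing example['why'] directly means an example without reasoning raises a KeyError instead of silently producing a weaker prompt.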

2b. Negative Triggers

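Include examples that distinguish acceptable patterns from genuine issues. Negative examples reduce false positives by teaching what not to flag or extract. For instance, an input that mentions AI only in a generic marketing tagline could be paired with an output where ai_signal is null and a Why: line stating that no initiative, pilot, or hiring signal is described.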

3. Structured Output Pillars & Technical Constraints

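Structured outputs constrain Claude to a specific JSON schema, eliminating malformed JSON (such as trailing commas) and schema violations (such as missing required fields and unexpected keys). They guarantee the shape of the output, not the correctness of the values in it.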

  • JSON outputs (output_config.format): constrain the final answer to a JSON schema.
  • Strict tool use (strict: true): guarantees tool inputs follow your schema exactly through grammar-constrained sampling.
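
As an illustration of the second bullet, the sketch below assembles a strict tool definition as plain Python data. It assumes the Messages API tool format (name, description, input_schema) with the strict flag set on the tool definition itself; the tool name extract_metadata is borrowed from the lab, while the description and the two fields are placeholders. No request is sent here.

```python
import json

# A strict tool definition. Field names inside input_schema are illustrative.
extract_metadata_tool = {
    "name": "extract_metadata",
    "description": "Record metadata extracted from the source document.",
    "strict": True,  # assumption: the strict flag lives on the tool definition
    "input_schema": {
        "type": "object",
        "properties": {
            "company_name": {"type": "string"},
            "annual_revenue": {"type": ["string", "null"]},
        },
        "required": ["company_name", "annual_revenue"],
        "additionalProperties": False,
    },
}

# Request fragment: the tool list plus a tool_choice that forces this tool.
request_kwargs = {
    "tools": [extract_metadata_tool],
    "tool_choice": {"type": "tool", "name": extract_metadata_tool["name"]},
}

if __name__ == "__main__":
    print(json.dumps(request_kwargs, indent=2))
```

Keeping the definition as data makes it easy to reuse the same schema for offline validation of the tool input the model returns.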

3a. Critical API Limits

  • Maximum 20 strict tools per request.
  • Maximum 24 optional parameters across all strict schemas in a single request.
  • Maximum 16 parameters using union types such as anyOf or ["string", "null"].
  • Compiled grammars are cached for 24 hours from last use.
  • Changing schema structure invalidates the grammar cache and reintroduces initial latency.
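
A pre-flight check can catch limit violations before a request is sent. The sketch below is an approximation under stated assumptions: the helper names are hypothetical, an "optional parameter" is taken to be any property absent from required, a "union parameter" is any property whose type is a list or that uses anyOf, and only nested objects are traversed. The API's exact counting rules may differ.

```python
MAX_STRICT_TOOLS = 20
MAX_OPTIONAL_PARAMS = 24
MAX_UNION_PARAMS = 16


def count_params(schema: dict) -> tuple:
    """Return (optional_count, union_count) for an object schema."""
    optional = union = 0
    required = set(schema.get("required", []))
    for name, spec in schema.get("properties", {}).items():
        if name not in required:
            optional += 1
        if isinstance(spec.get("type"), list) or "anyOf" in spec:
            union += 1
        if spec.get("type") == "object":  # recurse into nested objects only
            nested_optional, nested_union = count_params(spec)
            optional += nested_optional
            union += nested_union
    return optional, union


def check_limits(strict_schemas: list) -> list:
    """Return human-readable problems; an empty list means within limits."""
    problems = []
    if len(strict_schemas) > MAX_STRICT_TOOLS:
        problems.append(f"{len(strict_schemas)} strict tools (max {MAX_STRICT_TOOLS})")
    counts = [count_params(schema) for schema in strict_schemas]
    optional_total = sum(optional for optional, _ in counts)
    union_total = sum(union for _, union in counts)
    if optional_total > MAX_OPTIONAL_PARAMS:
        problems.append(f"{optional_total} optional parameters (max {MAX_OPTIONAL_PARAMS})")
    if union_total > MAX_UNION_PARAMS:
        problems.append(f"{union_total} union-typed parameters (max {MAX_UNION_PARAMS})")
    return problems


# Example: one required string plus five nullable (union-typed) fields.
nullable = ["annual_revenue", "contact_email", "ai_signal", "budget_range", "decision_date"]
prospect_schema = {
    "type": "object",
    "properties": {"company_name": {"type": "string"},
                   **{name: {"type": ["string", "null"]} for name in nullable}},
    "required": ["company_name", *nullable],
    "additionalProperties": False,
}
```

Note that every nullable field counts toward the union-type budget, so a request carrying four schemas like this one would exceed the limit of 16 under this counting rule.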

4. Advanced Schema Design & Safety

Safe schemas reduce hallucinations and API errors in multi-step workflows.

4a. Hallucination Prevention with Nullable Fields

Use nullable fields, such as ["string", "null"], for information that may legitimately be missing from a source document. This lets the model return null instead of fabricating placeholder values to satisfy a required field.

JSON (Prospect Profile schema excerpt)
{
  "type": "object",
  "properties": {
    "company_name": {"type": "string"},
    "annual_revenue": {"type": ["string", "null"]},
    "contact_email": {"type": ["string", "null"]},
    "ai_signal": {"type": ["string", "null"]},
    "budget_range": {"type": ["string", "null"]},
    "decision_date": {"type": ["string", "null"]}
  },
  "required": ["company_name", "annual_revenue", "contact_email", "ai_signal", "budget_range", "decision_date"],
  "additionalProperties": false
}
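
A nullable schema permits null but does not stop the model from writing filler text into a string field, so it helps to check for it. The sketch below flags placeholder strings in the five nullable fields of the Prospect Profile schema. The function name and the placeholder list are hypothetical and should be tuned to the filler values actually observed in a given workload.

```python
NULLABLE_FIELDS = [
    "annual_revenue", "contact_email", "ai_signal", "budget_range", "decision_date",
]

# Hypothetical list of filler strings that should have been null.
PLACEHOLDERS = {
    "", "n/a", "na", "none", "null", "unknown", "not provided", "not disclosed", "tbd",
}


def find_placeholder_values(profile: dict) -> list:
    """Return the nullable fields whose value is filler text instead of null."""
    flagged = []
    for field in NULLABLE_FIELDS:
        value = profile.get(field)
        if isinstance(value, str) and value.strip().lower() in PLACEHOLDERS:
            flagged.append(field)
    return flagged


if __name__ == "__main__":
    profile = {
        "company_name": "BetaCo",
        "annual_revenue": None,        # correct: missing data is null
        "contact_email": "N/A",        # filler text that should be null
        "ai_signal": "chatbot pilot",
        "budget_range": "Unknown",     # filler text that should be null
        "decision_date": None,
    }
    print(find_placeholder_values(profile))
```

A non-empty result is a natural trigger for the validation-retry step in the lab.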

4b. The Incompatibility Rule

Citations and structured outputs cannot be combined. Enabling both in one request returns a 400 error, because citations require citation blocks interleaved with the text, which conflicts with strict JSON schema constraints.

4c. Extensible Categorization with other

Enums are reliable, but production categories evolve. Use a bounded enum with an other escape hatch plus a required detail field so the system stays extensible without losing structure.

JSON (category schema)
{
  "type": "object",
  "properties": {
    "category": {
      "type": "string",
      "enum": ["billing", "technical", "security", "other"]
    },
    "category_detail": {
      "type": ["string", "null"],
      "description": "Required when category is other; otherwise null."
    }
  },
  "required": ["category", "category_detail"],
  "additionalProperties": false
}

Validation rule: if category is "other", category_detail must explain the specific category. If category is a known enum value, category_detail should be null.
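
The schema above does not itself express this conditional rule, so it has to be enforced after the response arrives. A minimal sketch, with a hypothetical function name, that returns a list of error messages suitable for feeding back to the model:

```python
KNOWN_CATEGORIES = {"billing", "technical", "security"}


def validate_category(record: dict) -> list:
    """Check the category / category_detail rule; return error messages."""
    category = record.get("category")
    detail = record.get("category_detail")
    errors = []
    if category == "other":
        # "other" must come with a non-empty explanation.
        if not isinstance(detail, str) or not detail.strip():
            errors.append(
                "category_detail must describe the category when category is 'other'"
            )
    elif category in KNOWN_CATEGORIES:
        # Known categories should not carry a detail string.
        if detail is not None:
            errors.append(
                "category_detail must be null when category is a known value"
            )
    else:
        errors.append(f"unexpected category: {category!r}")
    return errors
```

Logging the category_detail values that arrive with "other" also gives a running list of candidates for promotion into the enum.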

Lab Exercise: Designing for Deterministic Extraction

Self-driven lab: Module8_Self_Driven_Lab.ipynb

Objective: master categorical criteria, union-type constraints, prerequisite forcing, and validation loops.

  1. Explicit classification: create a Lead Quality classification tool using numeric thresholds such as Revenue > $10M; include one-line examples for each category.
  2. Few-shot generalization: provide 2-4 extraction examples. Each example must include a Why: line explaining why specific data was mapped to a field.
  3. Union-type constraints: design a Prospect Profile schema with 5 nullable fields and verify missing data returns null, not placeholder text.
  4. Extensible categorization: design a schema where category is an enum that includes other; require a separate category_detail string when other is selected.
  5. Prerequisite forcing: use tool_choice: {"type": "tool", "name": "extract_metadata"} to ensure metadata extraction runs before enrichment.
  6. Validation-retry loop: check for a semantic error, such as line items not summing to a total. If validation fails, send a follow-up request with the original document and the specific validation error to guide self-correction.
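
Step 6 can be sketched as follows. This is one possible shape, not the notebook's solution: the model call is passed in as a plain callable (call_model, hypothetical) that takes a prompt and returns a JSON string, the invoice field names are illustrative, and the prompt wording is a placeholder.

```python
import json
from typing import Callable, Optional


def validate_invoice(invoice: dict) -> Optional[str]:
    """Return a specific error message, or None if line items sum to the total."""
    # Compare in whole cents to avoid floating-point rounding noise.
    total_cents = round(invoice["total"] * 100)
    items_cents = sum(round(item["amount"] * 100) for item in invoice["line_items"])
    if items_cents != total_cents:
        return (f"line_items sum to {items_cents / 100:.2f} "
                f"but total is {total_cents / 100:.2f}")
    return None


def extract_with_retry(document: str,
                       call_model: Callable[[str], str],
                       max_attempts: int = 3) -> dict:
    """Extract, validate, and retry with the exact validation error as feedback."""
    base_prompt = f"Extract the invoice as JSON.\n\nDocument:\n{document}"
    prompt = base_prompt
    error = None
    for _ in range(max_attempts):
        invoice = json.loads(call_model(prompt))
        error = validate_invoice(invoice)
        if error is None:
            return invoice
        # Follow-up request: original document plus the specific failure.
        prompt = (
            f"{base_prompt}\n\n"
            f"Your previous answer was:\n{json.dumps(invoice)}\n\n"
            f"It failed validation: {error}. "
            "Re-read the document and return corrected JSON."
        )
    raise ValueError(f"validation still failing after {max_attempts} attempts: {error}")
```

Capping the number of attempts matters: if the source document itself is inconsistent, no retry can satisfy the check, and the failure should be surfaced rather than looped on.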

Architect Tip for the Exam

Precision is not "more prompt." It is objective criteria, examples with reasoning, schemas that allow legitimate missingness, strict tool constraints, and validation feedback that tells the model exactly what semantic invariant failed.