Prompt Engineering for Precision: Criteria, Constraints, and Structured Output
This module provides the advanced prompting techniques required for production-grade extraction and classification, ensuring that Claude's routing and output match a target schema 100% of the time. The goal is to move from fuzzy instructions to deterministic precision: explicit criteria, few-shot reasoning, strict schema constraints, hallucination-resistant nullable fields, and validation-retry loops.
1. Explicit Categorical Criteria vs. Vague Instructions
Vague instructions like "be conservative" or "only report high-confidence findings" fail because they do not define objective thresholds. Replace adjectives with numeric or behavioral criteria.
Classify lead quality as low, medium, or high. Be conservative.
HIGH
- Revenue is greater than $10M, AND
- Company has an active AI initiative or open data/ML roles.
Example: "$25M revenue and hiring an ML platform lead."
MEDIUM
- Revenue is $2M-$10M, OR
- Revenue is unknown but company has a clear AI adoption signal.
Example: "$6M revenue and recently published an AI case study."
LOW
- Revenue is below $2M, OR
- No AI adoption signal is present.
Example: "$900k revenue and no technical hiring signal."
If evidence supports two categories, choose the higher category.
If revenue and AI signal are both missing, output needs_clarification.
1a. Behavioral Thresholds and Escape Hatches
Behavioral thresholds make classification reproducible: "production is down for all users" is stronger than "severe outage." Escape hatches like needs_clarification prevent the model from forcing a fuzzy input into a false category.
2. Few-Shot Examples for Generalization
Few-shot examples are the best way to handle ambiguous cases where two tools, fields, or categories look reasonable.
2a. The Reasoning Column
High-quality few-shot examples include a Why: line. The reasoning teaches the rule behind the pattern, so the model generalizes to novel cases rather than merely matching examples.
Example 1
Input: "Acme reports $14M ARR and is hiring a VP of AI Platform."
Output: {"lead_quality": "HIGH", "annual_revenue": "$14M", "ai_signal": "hiring VP of AI Platform"}
Why: Revenue is greater than $10M and the hiring signal confirms active AI investment.
Example 2
Input: "BetaCo launched a chatbot pilot, but revenue is not disclosed."
Output: {"lead_quality": "MEDIUM", "annual_revenue": null, "ai_signal": "chatbot pilot"}
Why: Revenue is missing, but a clear AI adoption signal qualifies for MEDIUM rather than LOW.
Example 3
Input: "Gamma LLC has $800k revenue and no AI-related hiring or projects."
Output: {"lead_quality": "LOW", "annual_revenue": "$800k", "ai_signal": null}
Why: Revenue is below $2M and there is no AI signal.
2b. Negative Triggers
Include examples that distinguish acceptable patterns from genuine issues. Negative examples reduce false positives by teaching what not to flag or extract.
3. Structured Output Pillars & Technical Constraints
Structured outputs constrain Claude to a specific JSON schema, eliminating syntax errors such as trailing commas, missing fields, and surprise keys.
- JSON outputs (
output_config.format): constrain the final answer to a JSON schema. - Strict tool use (
strict: true): guarantees tool inputs follow your schema exactly through grammar-constrained sampling.
3a. Critical API Limits
- Maximum 20 strict tools per request.
- Maximum 24 optional parameters across all strict schemas in a single request.
- Maximum 16 parameters using union types such as
anyOfor["string", "null"]. - Compiled grammars are cached for 24 hours from last use.
- Changing schema structure invalidates the grammar cache and reintroduces initial latency.
4. Advanced Schema Design & Safety
Safe schemas prevent hallucinations and API errors in multi-step workflows.
4a. Hallucination Prevention with Nullable Fields
Use nullable fields, such as ["string", "null"], for information that may legitimately be missing from a source document. This lets the model return null instead of fabricating placeholder values to satisfy a required field.
{
"type": "object",
"properties": {
"company_name": {"type": "string"},
"annual_revenue": {"type": ["string", "null"]},
"contact_email": {"type": ["string", "null"]},
"ai_signal": {"type": ["string", "null"]},
"budget_range": {"type": ["string", "null"]},
"decision_date": {"type": ["string", "null"]}
},
"required": ["company_name", "annual_revenue", "contact_email", "ai_signal", "budget_range", "decision_date"],
"additionalProperties": false
}
4b. The Incompatibility Rule
Citations and Structured Outputs are fundamentally incompatible. Sending both in one request returns a 400 error because citations require interleaved citation blocks, which violate strict JSON schema constraints.
4c. Extensible Categorization with other
Enums are reliable, but production categories evolve. Use a bounded enum with an other escape hatch plus a required detail field so the system stays extensible without losing structure.
{
"type": "object",
"properties": {
"category": {
"type": "string",
"enum": ["billing", "technical", "security", "other"]
},
"category_detail": {
"type": ["string", "null"],
"description": "Required when category is other; otherwise null."
}
},
"required": ["category", "category_detail"],
"additionalProperties": false
}
Validation rule: if category is "other", category_detail must explain the specific category. If category is a known enum value, category_detail should be null.
Lab Exercise: Designing for Deterministic Extraction
Self-driven lab Module8_Self_Driven_Lab.ipynbObjective: master categorical criteria, union-type constraints, prerequisite forcing, and validation loops.
- Explicit classification: create a Lead Quality classification tool using numeric thresholds such as
Revenue > $10M; include one-line examples for each category. - Few-shot generalization: provide 2-4 extraction examples. Each example must include a
Why:line explaining why specific data was mapped to a field. - Union-type constraints: design a Prospect Profile schema with 5 nullable fields and verify missing data returns
null, not placeholder text. - Extensible categorization: design a schema where
categoryis an enum that includesother; require a separatecategory_detailstring whenotheris selected. - Prerequisite forcing: use
tool_choice: {"type": "tool", "name": "extract_metadata"}to ensure metadata extraction runs before enrichment. - Validation-retry loop: check for a semantic error, such as line items not summing to a total. If validation fails, send a follow-up request with the original document and the specific validation error to guide self-correction.
Precision is not "more prompt." It is objective criteria, examples with reasoning, schemas that allow legitimate missingness, strict tool constraints, and validation feedback that tells the model exactly what semantic invariant failed.