Module 3

Tool Design & Tool Choice Strategy

Once you decide to build an agent (Module 2), the model is the one picking which tool to call. This module gives you the levers that control that decision: disambiguating tool descriptions, the tool_choice parameter for forcing prerequisites, strict-mode schema validation, and the agentic loop that ties it all together.

Answer key Module3_Complete.ipynb

1. Concepts: Model-Driven Routing

In an agentic system, the model picks which tool to call, that decision is driven by the tool's name, description, schema, and the surrounding system prompt. Workflows hard-code the call site; agents read your tool surface and route. Four levers shape that routing:

  • Tool descriptions are the primary selection mechanism. Claude reads the name and description on each tool to choose. If two tools have overlapping purposes (e.g. analyze_content vs. analyze_document), Claude will misroute, write descriptions that disambiguate exactly when each tool is appropriate.
  • Functional overlap must be removed, not merely explained. If two tools do the same job with different names, rename or split them so each one owns a distinct input shape and outcome.
  • System prompts can override good descriptions. Keyword-sensitive instructions like "for any document, call analyze_document" can drag routing away from a better tool description. Audit prompt rules for accidental keyword traps.
  • tool_choice forces or constrains routing on a per-request basis: require any tool call, require a specific tool, or let the model decide.
  • Strict mode (strict: True) uses grammar-constrained sampling to force tool inputs to match your JSON schema 100%. No trailing commas, no missing required fields, no surprise keys.

1a. Eliminating Functional Overlap

"Tool description" is the most under-engineered part of most agentic systems. A bad description is one that tells Claude what the tool does; a good description tells Claude when to pick it over its siblings. A better design removes the sibling conflict entirely. Compare:

Python (anti-pattern: ambiguous)
# BAD: descriptions overlap, Claude will guess.
tools = [
    {"name": "analyze_content",  "description": "Analyzes content."},
    {"name": "analyze_document", "description": "Analyzes a document."},
]
Python (good: disambiguated)
# GOOD: each description names what the OTHER tool is NOT for.
tools = [
    {
        "name": "analyze_content",
        "description": (
            "Analyzes a single piece of inline text or HTML passed as a string "
            "(blog post body, email draft, social copy). Use this when the caller "
            "has the content in-hand. DO NOT use for files on disk or document IDs, "
            "use analyze_document for those."
        ),
    },
    {
        "name": "analyze_document",
        "description": (
            "Analyzes a stored document referenced by document_id (PDF, Word, "
            "uploaded file). Use this when the caller has only an identifier or "
            "a file path. DO NOT use for raw inline text, use analyze_content."
        ),
    },
]

The pattern: state the input shape, the canonical use case, and an explicit "DO NOT use for X, use sibling_tool instead" clause. That last clause is what eliminates misrouting between similar tools.

Python (lab task: rename the overlapping tool)
# STARTING POINT: both tools sound like generic analysis.
tools = [
    {"name": "analyze_content", "description": "Analyze content and return key points."},
    {"name": "analyze_document", "description": "Analyze a document and return key points."},
]

# TARGET DESIGN: split by function and input source.
tools = [
    {
        "name": "analyze_uploaded_document",
        "description": (
            "Extracts claims and entities from an uploaded PDF/DOCX by document_id. "
            "Use only when the file already exists in document storage. Do not use "
            "for web search results or raw HTML."
        ),
    },
    {
        "name": "extract_web_results",
        "description": (
            "Extracts title, URL, snippet, publication date, and source credibility "
            "from web_search results. Use only after web_search/web_fetch. Do not "
            "use for uploaded files or inline drafts."
        ),
    },
]

1b. Prompt Audit for Keyword-Sensitive Instructions

Even a clean toolbox can misroute if the system prompt contains broad keyword rules. Review instructions for phrases that bind routing to words in the user request instead of the tool's true preconditions.

Prompt audit
Risky: "Whenever the user mentions a document, call analyze_document."
Better: "When the user provides a stored document_id or file path, use analyze_uploaded_document.
When the user asks about search output or fetched web pages, use extract_web_results."

1c. Controlling Routing with tool_choice

Descriptions guide the model's preference; tool_choice overrides it. Three values matter for the exam:

  • {"type": "auto"} (default), Claude decides whether to use a tool, which one, and may also reply with plain text.
  • {"type": "any"}, Claude must emit a tool call (no plain-text reply allowed). Use this when you want a structured response and any tool is acceptable.
  • {"type": "tool", "name": "verify_identity"}, Claude must call the named tool on this turn. Use this to enforce a prerequisite, e.g., identity verification before any account action.
Python (forced prerequisite, then any-tool)
import anthropic
from dotenv import load_dotenv

load_dotenv()  # reads ANTHROPIC_API_KEY from your .env file

client = anthropic.Anthropic()

# Turn 1: force identity verification before anything else can happen.
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=2048,
    tools=tools,
    tool_choice={"type": "tool", "name": "verify_identity"},
    messages=[{"role": "user", "content": "Refund my last order."}],
)

# ... append assistant tool_use and the tool_result for verify_identity ...

# Turn 2: identity is confirmed. Force a tool-based response (no chit-chat).
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=2048,
    tools=tools,
    tool_choice={"type": "any"},   # must call SOME tool, model picks which
    messages=messages,
)
Architect Tip for the Exam

The exam loves the prerequisite pattern: tool_choice={"type":"tool","name":...} on the first turn forces a specific call (verify identity, fetch customer, look up account); on subsequent turns you switch to {"type":"any"} or {"type":"auto"}. Combine this with Module 9's hooks for defense in depth, tool_choice nudges the model, hooks guarantee the prerequisite at execution time.

1c. Dynamic Filtering & Strict Mode

  • Dynamic filtering with web_search_20260209 alongside code_execution_20260120 lets Claude write Python that post-processes search results before they enter the context window. Lower token cost, higher signal.
  • Strict mode (strict: True) uses grammar-constrained sampling to force tool inputs to match your JSON schema 100%. No trailing commas, no missing required fields, no surprise keys.
ZDR Compliance Warning

Standard web_search alone is ZDR-eligible. Adding code_execution_20260120 for dynamic filtering makes the entire request non-ZDR because execution state is held server-side. Communicate this trade-off to regulated clients before enabling it.

Architect Tip for the Exam

The web_search_20260209 tool version activates dynamic filtering, the older version does not. Always check the tool type version string, it is a direct exam signal for which capabilities are available.

2. Defining the Toolbox

Both web_search and code_execution are server-side built-ins, Anthropic runs them and feeds the results back to Claude. add_lead_to_crm is a client-side custom tool: when Claude calls it, your code is responsible for executing the action and returning the result.

Apply the disambiguation rule from Section 1a: the add_lead_to_crm description states the canonical use case ("a validated prospect") and the precondition ("after industry research is complete"), so Claude won't fire it before web search has run. Note also the explicit name field on every entry, code_execution_20260120 requires name even though it's a built-in (omitting it returns a 400 BadRequestError).

Python
# DEFINE THE TOOLS
# Note: Use 'strict: True' for the CRM tool to guarantee valid JSON.
# Note: 'code_execution' must have a name even if it's a built-in server tool.

tools = [
    {
        "type": "web_search_20260209",
        "name": "web_search",
        "max_uses": 5,
        "allowed_domains": ["gartner.com", "forrester.com", "hbr.org"],
    },
    {
        "type": "code_execution_20260120",
        "name": "code_execution"   # CRITICAL: required alongside type
    },
    {
        "name": "add_lead_to_crm",
        "description": "Adds a validated prospect to the enterprise CRM. Use only after industry research is complete.",
        "strict": True,
        "input_schema": {
            "type": "object",
            "properties": {
                "company": {"type": "string"},
                "contact_email": {"type": "string", "format": "email"},
                "budget_range": {"type": ["string", "null"]},   # Nullable to prevent hallucinations
            },
            "required": ["company", "contact_email", "budget_range"],
            "additionalProperties": False,
        },
    },
]

3. The Initial Request & stop_reason

Send the user's prospecting goal with the toolbox attached. Don't print just the final text, inspect stop_reason and walk the content blocks. On the first turn you should see tool_use, not end_turn.

Python
# THE INITIAL REQUEST
# On Opus 4.7, 'thinking' is required for complex tasks like this.
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=8192,
    thinking={"type": "adaptive"},
    tools=tools,
    messages=[{
        "role": "user",
        "content": "Find the top 5 AI consulting prospects in EMEA finance and add them to my CRM.",
    }],
)

# SHOW OUTPUT
print(f"Stop Reason: {response.stop_reason}")   # Expected: 'tool_use'
for block in response.content:
    if block.type == "thinking":
        print("Reasoning Signature Detected.")
    if block.type == "tool_use":
        print(f"Claude requested tool: {block.name}")
        print(f"Parameters: {block.input}")

4. Handling the Agentic Loop

Web search runs server-side, Anthropic executes it and feeds the results back to Claude inside the same request. Your CRM write is client-side, your code must execute it and return a tool_result block before Claude can finish its turn.

  • Turn 1: Claude calls web_search. The API runs it server-side and silently injects results into Claude's context.
  • Turn 2: Claude reasons over the results and emits a tool_use block for add_lead_to_crm.
  • Turn 3: your application performs the database write and sends a tool_result block back. Claude generates the final end_turn response.
Python (client-side leg)
def handle_tool_use(block):
    if block.name == "add_lead_to_crm":
        # crm.write(block.input)  # your client-side action
        return {
            "type": "tool_result",
            "tool_use_id": block.id,
            "content": "ok: lead written",
        }
    raise ValueError(f"unhandled client-side tool: {block.name}")

# Loop until Claude is done.
while response.stop_reason == "tool_use":
    tool_results = [handle_tool_use(b) for b in response.content if b.type == "tool_use"]
    response = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=8192,
        thinking={"type": "adaptive"},
        tools=tools,
        messages=[
            {"role": "user", "content": "Find the top 5 AI consulting prospects ..."},
            {"role": "assistant", "content": response.content},
            {"role": "user", "content": tool_results},
        ],
    )

5. Final Verification & Citations

Once the loop terminates with end_turn, inspect the final text blocks. Web search automatically attaches citations to grounded claims, verify them so you can audit the sources Claude relied on.

Python
# FINAL OUTPUT VERIFICATION
# Web search automatically provides citations.
for block in response.content:
    if block.type == "text":
        print("Final Answer:", block.text)
        if hasattr(block, "citations") and block.citations:
            print("Sources:", [c.url for c in block.citations])
Architectural Summary
  • Safety: nullable types in the schema (["string", "null"]) prevent Claude from inventing a budget figure when one isn't in the search results.
  • Efficiency: code_execution_20260120 is free when bundled with web_search or web_fetch, so dynamic filtering costs nothing extra in token generation.
  • Compliance: that same bundling makes the request non-ZDR. Standard web search alone stays ZDR-eligible.
  • Selection: tool descriptions, not names alone, drive Claude's routing. Disambiguate any tools whose purposes overlap.

Lab Exercise: Disambiguation & Forced Prerequisites

Self-driven lab Module3_Self_Driven_Lab.ipynb

Objective: eliminate functional overlap between tools and enforce mandatory sequences using tool_choice.

  1. Tool disambiguation: define two tools with overlapping descriptions, such as get_user and query_database. Rewrite their descriptions using the [Input Shape] + [Canonical Use Case] + [Boundary] pattern to eliminate misrouting.
  2. Forced selection: use tool_choice: {"type": "tool", "name": "verify_identity"} to ensure identity verification happens on the first turn.
  3. Strict mode: enable strict: true on a complex JSON schema tool. Intentionally send a malformed input and observe the grammar-constrained sampling enforcement.
  4. Citations vs. JSON: attempt to send a request with both Citations enabled and a strict: true tool. Confirm it returns a 400 BadRequestError and explain why they are mutually exclusive.