Module 2

Agentic Loops & SDK Foundations

This module covers the lower-level orchestration required for production Claude systems: the Messages API and how tool results re-enter conversation history, the Agent SDK for model-driven workflows, Zero Data Retention (ZDR) eligibility per feature, and the agentic loop pattern that governs every multi-turn workflow.

Answer key: Module2_Complete.ipynb
Prerequisites: If you haven't installed VS Code, Jupyter, and the Anthropic SDK yet, complete Module 1: Dev Environment & Foundations before continuing.

1. Setting Up the Messages API (Stateless Design)

The Claude API is stateless: it does not persist history between calls. To build a conversation, you must maintain a local array of messages and send the full history back to the API with every request. The same pattern carries through tool use: when Claude requests a tool, you append the assistant's tool_use block, then append a tool_result block on the next turn so the model can reason about what the tool returned.

Implementation Task: Local Message History

Start by initializing your conversation locally and capturing the assistant's reply.

Python
import anthropic
from dotenv import load_dotenv

load_dotenv()  # reads ANTHROPIC_API_KEY from your .env file

client = anthropic.Anthropic()

# Start with your local message history
messages = [
    {"role": "user", "content": "Analyze the target audience for an AI consulting firm."}
]

# Create the first request
# Note: Adaptive thinking is required for Opus 4.7
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    thinking={"type": "adaptive"},  # Required for Opus 4.7; replaces budget_tokens
    inference_geo="us",              # Restrict compute to US infrastructure (1.1x pricing)
    messages=messages
)

# Build the synthetic conversation locally by adding the assistant's response
messages.append({"role": "assistant", "content": response.content})

# Add a manual follow-up to the local history
messages.append({"role": "user", "content": "Now, generate three LinkedIn post ideas for this audience."})
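The append-then-resend cycle above can be wrapped in one small helper. The following is a minimal sketch, not part of the SDK: send_turn and EchoClient are hypothetical names, "demo-model" is a placeholder, and EchoClient is an offline stand-in that only mimics the shape of client.messages.create so the bookkeeping can be checked without an API key.

```python
from types import SimpleNamespace


def send_turn(client, messages, user_text, model, max_tokens=1024):
    """Append a user turn, send the FULL history, then append the reply."""
    messages.append({"role": "user", "content": user_text})
    response = client.messages.create(
        model=model,
        max_tokens=max_tokens,
        messages=messages,  # the API sees only what is in this list
    )
    messages.append({"role": "assistant", "content": response.content})
    return response


class EchoClient:
    """Offline stand-in for anthropic.Anthropic(); records what it was sent."""

    def __init__(self):
        self.history_sizes = []
        self.messages = SimpleNamespace(create=self._create)

    def _create(self, model, max_tokens, messages):
        # Record how many messages arrived with this request.
        self.history_sizes.append(len(messages))
        text = f"reply to turn {len(self.history_sizes)}"
        return SimpleNamespace(
            content=[{"type": "text", "text": text}],
            stop_reason="end_turn",
        )


demo_client = EchoClient()
history = []
send_turn(demo_client, history, "Analyze the target audience.", model="demo-model")
send_turn(demo_client, history, "Now draft three post ideas.", model="demo-model")
print(demo_client.history_sizes)  # [1, 3]
```

The second request carries three messages because the first user turn and the first assistant reply are resent along with the new question. Passing a real anthropic.Anthropic() client and a real model name in place of the stand-ins should work the same way.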

Implementation Task: Feeding Tool Results Back Into History

When stop_reason == "tool_use", the assistant's last message contains one or more tool_use content blocks. You execute each tool locally, then send the output back as a tool_result block on a user turn. Re-calling messages.create with the updated history lets the model read the tool output and decide the next action. This is the spine of every agentic loop.

Python
# Assume the prior `response` came back with stop_reason == "tool_use"
# 1. Append the assistant turn verbatim, including its tool_use block(s).
messages.append({"role": "assistant", "content": response.content})

# 2. Locate every tool_use block; a single turn may request several tools.
tool_uses = [b for b in response.content if b.type == "tool_use"]

# 3. Append ALL tool_results on one USER turn, each keyed by its tool_use_id.
messages.append({
    "role": "user",
    "content": [
        {
            "type": "tool_result",
            "tool_use_id": tool_use.id,
            "content": run_local_tool(tool_use.name, tool_use.input),  # your dispatcher
        }
        for tool_use in tool_uses
    ],
})

# 4. Re-call the API so Claude can reason about the tool output and decide next steps.
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    thinking={"type": "adaptive"},
    messages=messages,
)

Architect Tip for the Exam

Tool results are user-role messages, not a separate role. The tool_use_id on the result must match the id from the assistant's tool_use block, and you must include the assistant's full content (including any thinking blocks) in history before the result, or the next call will reject the conversation.
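The id-matching rule can be checked locally before you resend. The following is a minimal sketch that assumes the content blocks are plain dicts (SDK response objects expose the same fields as attributes instead); unmatched_tool_use_ids and the toolu_demo_* ids are hypothetical names used only for illustration.

```python
def unmatched_tool_use_ids(assistant_content, user_content):
    """Return tool_use ids from the assistant turn that have no tool_result."""
    requested = [
        block["id"] for block in assistant_content if block.get("type") == "tool_use"
    ]
    answered = {
        block["tool_use_id"]
        for block in user_content
        if block.get("type") == "tool_result"
    }
    return [tool_id for tool_id in requested if tool_id not in answered]


assistant_turn = [
    {"type": "text", "text": "Let me look that up."},
    {"type": "tool_use", "id": "toolu_demo_1", "name": "search", "input": {"q": "AI consulting"}},
    {"type": "tool_use", "id": "toolu_demo_2", "name": "search", "input": {"q": "LinkedIn"}},
]
user_turn = [
    {"type": "tool_result", "tool_use_id": "toolu_demo_1", "content": "3 results"},
]

missing = unmatched_tool_use_ids(assistant_turn, user_turn)
print(missing)  # ['toolu_demo_2']
```

An empty list means every requested tool has a result on the following user turn; anything else is worth raising locally rather than sending to the API.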

2. Designing Agentic Workflows with the Agent SDK

Before writing any code, decide which type of system you actually need. The exam draws a sharp line between two architectures:

  • Workflows (pre-configured decision trees): You hard-code the control flow. The model fills in the leaves (classify, summarize, extract), but humans pick the branches. Predictable, cheap, easy to test, no autonomy.
  • Agents (model-driven decision-making): You expose tools and a goal. The model decides which tool to call, when to stop, and how to recover. Higher cost and variance, but able to handle open-ended tasks where you can't enumerate the branches in advance.

Pick the workflow when the steps are knowable in advance. Reach for an agent only when the model genuinely needs to choose its own path.
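To make the workflow side of that distinction concrete, here is a minimal sketch in which the code owns every branch and the model is reduced to a single labeling step. The ticket-routing scenario, route_ticket, and keyword_classifier are all hypothetical; keyword_classifier is an offline stand-in for what would be one model call returning a label.

```python
def route_ticket(ticket_text, classify):
    """Workflow: the code owns the branches; the model only labels the input."""
    label = classify(ticket_text)  # the model-filled leaf
    if label == "billing":
        return "billing_queue"
    if label == "bug":
        return "engineering_queue"
    # Unknown labels never reach an automated branch.
    return "human_review"


def keyword_classifier(text):
    """Offline stand-in for a model call that returns exactly one label."""
    lowered = text.lower()
    if "invoice" in lowered:
        return "billing"
    if "crash" in lowered:
        return "bug"
    return "other"


print(route_ticket("My invoice is wrong", keyword_classifier))   # billing_queue
print(route_ticket("The app crashes on load", keyword_classifier))  # engineering_queue
print(route_ticket("Can we partner?", keyword_classifier))       # human_review
```

Because every branch is enumerated in code, each path can be unit-tested with a fake classifier. An agent gives up that property in exchange for choosing its own path.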

Implementation Task: Stand Up the Agent SDK

The Claude Agent SDK (claude-agent-sdk) is Anthropic's reference harness for the agentic loop. It owns history bookkeeping, tool dispatch, and termination so you can focus on the system prompt, allowed tools, and permission policy.

Bash
pip install claude-agent-sdk
Python
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions

# Configure the agent: persona, model, and the tool surface it may use.
options = ClaudeAgentOptions(
    model="claude-opus-4-7",
    system_prompt=(
        "You are an expert Marketing Strategist. Use thinking to ensure "
        "your plans are data-driven, and cite sources when you research."
    ),
    allowed_tools=["WebSearch", "WebFetch"],
    permission_mode="acceptEdits",  # Auto-accepts file edits; other tools follow the normal permission flow
)

async def main():
    # The SDK runs the agentic loop for you: tool dispatch, history, stop_reason.
    async for message in query(
        prompt="Generate three LinkedIn post ideas for AI consulting buyers.",
        options=options,
    ):
        print(message)

asyncio.run(main())

Architect Tip for the Exam

The Agent SDK does not replace the Messages API; it sits on top of it. Knowing how to drop down to raw messages.create with your own tool_use / tool_result bookkeeping (Section 1) is what lets you debug an SDK-driven agent when it misbehaves.

3. Data Residency & ZDR Eligibility

The inference_geo parameter controls where model computation runs on a per-request basis.

  • "us": Inference is restricted to US-based infrastructure only.
  • "global": Default. Runs in any available geography for best performance.

Setting inference_geo to "us" incurs a 1.1x pricing multiplier on all token categories (input, output, and cache) for Claude Opus 4.6 and newer models.
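A quick worked example shows how the multiplier applies across categories. This is a sketch under stated assumptions: the per-million-token prices below are placeholders chosen for easy arithmetic, NOT real list prices, and request_cost is a hypothetical helper.

```python
US_ONLY_MULTIPLIER = 1.1  # the multiplier stated in this section


def request_cost(tokens, price_per_mtok, inference_geo="global"):
    """Sum the cost over every token category, applying the geo multiplier."""
    multiplier = US_ONLY_MULTIPLIER if inference_geo == "us" else 1.0
    return sum(
        tokens[category] / 1_000_000 * price_per_mtok[category] * multiplier
        for category in tokens
    )


# Placeholder prices in USD per million tokens (assumed, not real pricing).
placeholder_prices = {"input": 10.0, "output": 50.0, "cache_read": 1.0}
usage = {"input": 200_000, "output": 50_000, "cache_read": 1_000_000}

global_cost = request_cost(usage, placeholder_prices)
us_cost = request_cost(usage, placeholder_prices, inference_geo="us")
print(round(global_cost, 2), round(us_cost, 2))  # 5.5 6.05
```

The multiplier scales the whole bill by 10%, not just input or output, so the surcharge grows with cache traffic as well.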

ZDR Eligibility, Feature by Feature

Zero Data Retention is not an account-level switch; it is decided per feature. The exam expects you to recognize which features keep a workload ZDR-compliant and which ones break it.

ZDR-eligible:
  • Adaptive Thinking
  • Structured Outputs
  • Data Residency (inference_geo)
  • Standard Messages API calls

NOT ZDR-eligible:
  • MCP Connector
  • Message Batches API
  • Files API
  • Prompt Caching (server-stored)

Architect Tip for the Exam

If a regulated workload must stay ZDR, you cannot reach for the MCP Connector or Message Batches just because they are convenient; you have to fall back to direct tool calls and synchronous Messages requests. Conversely, Adaptive Thinking and Structured Outputs are safe to layer on without breaking ZDR.
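One way to enforce this in a codebase is a pre-flight check over the features a workload uses. The following is a minimal sketch: the feature keys are hypothetical labels (not API identifiers), and the split between the two sets follows this section's list as read here, so it should be verified against current Anthropic documentation before being relied on.

```python
# Assumed classification, taken from this section's eligibility list.
ZDR_ELIGIBLE = {
    "adaptive_thinking",
    "structured_outputs",
    "data_residency",
    "messages_api",
}
NOT_ZDR_ELIGIBLE = {
    "mcp_connector",
    "message_batches",
    "files_api",
    "prompt_caching",
}


def zdr_violations(features_in_use):
    """Return the features that would break ZDR, sorted for stable output."""
    requested = set(features_in_use)
    unknown = requested - ZDR_ELIGIBLE - NOT_ZDR_ELIGIBLE
    if unknown:
        # Refuse to guess: an unclassified feature must be reviewed by a human.
        raise ValueError(f"Unclassified features: {sorted(unknown)}")
    return sorted(requested & NOT_ZDR_ELIGIBLE)


print(zdr_violations(["messages_api", "adaptive_thinking"]))  # []
print(zdr_violations(["messages_api", "message_batches", "mcp_connector"]))
# ['mcp_connector', 'message_batches']
```

Raising on unknown features, rather than treating them as eligible, keeps a newly adopted feature from silently slipping into a regulated workload.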

4. Orchestration Pattern: The Agentic Loop

For the exam, you must understand that the stop_reason field determines the loop's behavior:

  • "tool_use": The loop continues. Execute the tool, append the tool_result, and re-call the API.
  • "end_turn": The loop terminates. This is the only reliable signal that the task is complete.
  • "max_tokens" / "pause_turn": Recoverable interruptions. Continue the request rather than treating them as completion.
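The three cases above map naturally onto a small dispatch function. This is a minimal sketch: LoopAction and next_action are hypothetical names, and raising on any stop_reason outside this list is a design choice of the sketch, not something the API prescribes.

```python
from enum import Enum


class LoopAction(Enum):
    RUN_TOOLS = "run_tools"        # execute tools, append tool_result, re-call
    FINISH = "finish"              # task complete, exit the loop
    CONTINUE = "continue_request"  # recoverable interruption, keep going


def next_action(stop_reason):
    """Translate a stop_reason string into what the loop should do next."""
    if stop_reason == "tool_use":
        return LoopAction.RUN_TOOLS
    if stop_reason == "end_turn":
        return LoopAction.FINISH
    if stop_reason in ("max_tokens", "pause_turn"):
        return LoopAction.CONTINUE
    # Anything not covered above is surfaced instead of being read as "done".
    raise ValueError(f"Unhandled stop_reason: {stop_reason!r}")


print(next_action("tool_use"))    # LoopAction.RUN_TOOLS
print(next_action("pause_turn"))  # LoopAction.CONTINUE
```

Keeping this mapping in one place means the loop body never compares raw strings, and an unexpected value fails loudly instead of ending the run early.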

Anti-Patterns to Avoid

  • Parsing natural-language completion signals: Do not check whether the assistant text contains "I'm done", "task complete", or similar phrases. Those strings are not part of the API contract; they will drift across model versions and prompts and silently break your loop.
  • Using an iteration cap as the primary stopping mechanism: Writing for _ in range(10): as the whole loop and exiting whenever the counter runs out hides bugs (the agent looked complete but was actually cut off) and bills you for tool calls that never finish a task. Iteration caps belong in the loop as a safety guardrail against runaway agents, not as the signal that the work is done.

Architect Tip for the Exam

The correct pattern: drive the loop off stop_reason == "end_turn", keep an iteration cap as a defensive backstop, and log a hard error (rather than silently returning) if the cap is what stopped the run. That separation (a structured signal for completion, a counter for safety) is what the exam is testing.
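That pattern can be sketched as a loop that returns only on end_turn and raises if the cap is what ended the run. This is a minimal illustration, not a reference implementation: run_agent_loop, IterationCapReached, and scripted_model are hypothetical names, call_model stands in for a messages.create call, and the scripted responses are offline stand-ins shaped like SDK response objects. For brevity the sketch handles only tool_use, end_turn, and pause_turn.

```python
import logging
from types import SimpleNamespace

logger = logging.getLogger("agent_loop")


class IterationCapReached(RuntimeError):
    """Raised when the safety cap, not end_turn, stopped the run."""


def run_agent_loop(call_model, run_tool, messages, max_iterations=10):
    for _ in range(max_iterations):  # the cap is a guardrail, not the exit signal
        response = call_model(messages)
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason == "end_turn":
            return response  # the only successful exit
        if response.stop_reason == "tool_use":
            results = [
                {
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": run_tool(block.name, block.input),
                }
                for block in response.content
                if block.type == "tool_use"
            ]
            messages.append({"role": "user", "content": results})
        elif response.stop_reason != "pause_turn":
            # pause_turn simply re-calls; other reasons are out of scope here.
            raise RuntimeError(f"Unhandled stop_reason: {response.stop_reason!r}")
    logger.error("Iteration cap of %d reached before end_turn", max_iterations)
    raise IterationCapReached(f"stopped after {max_iterations} iterations")


def scripted_model(responses):
    """Offline stand-in: returns the given responses one per call."""
    queue = list(responses)
    return lambda messages: queue.pop(0)


tool_call = SimpleNamespace(type="tool_use", id="toolu_demo_1", name="lookup", input={"q": "buyers"})
needs_tool = SimpleNamespace(stop_reason="tool_use", content=[tool_call])
finished = SimpleNamespace(stop_reason="end_turn", content=[SimpleNamespace(type="text", text="done")])

history = [{"role": "user", "content": "Research AI consulting buyers."}]
final = run_agent_loop(scripted_model([needs_tool, finished]), lambda name, args: "42 results", history)
print(final.stop_reason, len(history))  # end_turn 4
```

A model that keeps requesting tools never reaches the return statement, so the caller gets an IterationCapReached exception and a logged error instead of a result that merely looks complete.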

Lab Exercise: Manual History Management vs. Agent SDK

Self-driven lab: Module2_Self_Driven_Lab.ipynb

Objective: master manual history bookkeeping against a stateless API, then make the transition to model-driven orchestration.

  1. History bookkeeping: create a manual message array. Send a three-turn conversation where you manually append each user and assistant message. Then drop one assistant turn from the array and resend it. Verify that the model no longer has that context, because the API only knows what you send.
  2. Tool result mapping: simulate a tool_use event. Manually append the tool_use block to history, followed by a tool_result block. Ensure the tool_use_id matches exactly.
  3. SDK transition: refactor the same loop using the Claude Agent SDK. Observe how the SDK handles history, tool dispatch, and termination automatically.
  4. Loop safety: implement an iteration cap of 5 turns as a defensive guardrail. Log a hard error if the cap is reached before stop_reason: "end_turn".