Module 4

Multi-Agent Orchestration (Agent SDK)

Module 2 set up a single agent. This module scales it out: a hub-and-spoke architecture in which a coordinator agent spawns specialized subagents through the Claude Agent SDK's Task tool. You'll learn how to spawn subagents in parallel, why their context is isolated by default, and how to compose their outputs back into the coordinator's plan.

Answer key Module4_Complete.ipynb

1. Hub-and-Spoke Architecture

A single agent gets brittle as the task grows: the context window fills, instructions interfere with each other, and one specialist persona has to do the work of five. The hub-and-spoke pattern fixes this by splitting the work across processes:

Coordinator (the hub), owns the user-facing goal, decides what work to delegate, and composes subagent results into the final answer. Usually the higher-capability model (Opus).
Subagents (the spokes), each spawned for a single specialized job (research, drafting, validation), with their own system prompt, tool surface, and isolated message history. Often a cheaper model (Sonnet or Haiku).

The Claude Agent SDK exposes this via the built-in Task tool, when the coordinator emits a Task call, the SDK starts a new subagent process, runs it to completion, and returns its final message back to the coordinator as a tool result.

2. Enabling the `Task` Tool on the Coordinator

The coordinator can only spawn subagents if Task is in its allowed_tools list. Without it, the SDK silently has no way to delegate, the coordinator will try to do everything in its own context.

Python (coordinator setup)

from claude_agent_sdk import query, ClaudeAgentOptions, AgentDefinition

# Define the subagent personas the coordinator is allowed to spawn.
agents = {
    "researcher": AgentDefinition(
        description="Web research specialist. Use for market intel, prospect lookups, source gathering.",
        prompt=(
            "You are a research analyst. Use WebSearch and WebFetch to gather "
            "primary sources. Return a structured findings document with source attribution."
        ),
        tools=["WebSearch", "WebFetch"],
        model="haiku",
    ),
    "copywriter": AgentDefinition(
        description="Marketing copy specialist. Use for drafting outreach, social posts, landing pages.",
        prompt=(
            "You are a B2B marketing copywriter. Given research findings in your "
            "prompt, produce three distinct copy variants in the brand voice."
        ),
        tools=[],   # no tools, pure generation
        model="sonnet",
    ),
    "synthesizer": AgentDefinition(
        description="Synthesis specialist. Use after research is complete to merge findings and flag gaps.",
        prompt=(
            "You synthesize cited findings into a concise account plan. You may use "
            "verify_fact only for simple high-frequency checks such as public dates "
            "or company names; route complex research gaps back to the coordinator."
        ),
        tools=["verify_fact"],
        model="sonnet",
    ),
}

coordinator_options = ClaudeAgentOptions(
    model="claude-opus-4-7",
    system_prompt=(
        "You are a marketing coordinator. Decompose the user goal into independent "
        "research and drafting jobs, delegate each to the appropriate subagent via "
        "the Task tool, then synthesize their outputs into the final deliverable."
    ),
    # CRITICAL: Task must be in allowed_tools or the coordinator cannot delegate.
    allowed_tools=["Task"],
    agents=agents,
)

3. Spawning Subagents in Parallel

The coordinator can emit multiple Task calls in a single response. The SDK starts each subagent concurrently and returns their results together, so independent work runs in parallel rather than serially. This is the single biggest performance win the pattern offers, two researchers running in parallel finish in roughly the time of one.

Python (driving the coordinator)

import asyncio

USER_GOAL = (
    "Build an outreach kit for AI consulting prospects in EMEA fintech: "
    "research the top 5 firms, then draft tailored intro emails for each."
)

async def run_coordinator():
    async for message in query(prompt=USER_GOAL, options=coordinator_options):
        # The SDK surfaces each Task spawn, each subagent message, and
        # the coordinator's final synthesis through the same async stream.
        print(message)

asyncio.run(run_coordinator())

To make parallelism happen, prompt the coordinator explicitly: "Spawn the research subagents for all five firms in a single response so they execute concurrently." Without that nudge the model often defaults to one-at-a-time delegation.

4. Context Isolation, and Why It's a Feature

Each subagent starts with a fresh, isolated message history. It does not see the coordinator's transcript, prior tool calls, or the user's original goal unless you put that information into its prompt explicitly. This is intentional, it keeps each subagent's context window small and on-task, and it prevents one specialist's noise from polluting another's reasoning.

The consequence: anything the subagent needs to know, you must pass into its prompt as part of the Task call. The most common bug in early hub-and-spoke implementations is the coordinator delegating "draft the email" without including the research findings, the subagent has nothing to work from and hallucinates.

Architect Tip for the Exam

The hand-off contract is: findings → prompt. After a research subagent returns, the coordinator must summarize the relevant fields (company name, recent news, decision maker) into the next Task call's prompt, not assume the drafting subagent can "see" what the researcher found. The exam tests this directly.

Coordinator system-prompt excerpt

When delegating to the copywriter subagent:
1. Quote the relevant research findings verbatim in the Task prompt
   (company name, recent funding, named contact, public AI signal).
2. Restate the brand voice constraints in every Task prompt; do not
   assume the subagent inherits them from your system prompt.
3. Include the original user goal as one line of context, the subagent
   does not see the user's message.

5. Cost Routing Across the Hub and Spokes

Hub-and-spoke is also a cost-control pattern. Route by capability needed:

Coordinator (Opus 4.7), decomposition and synthesis are the high-leverage steps; spend the tokens here.
Research / extraction subagents (Haiku 4.5), fast, cheap, and well-suited to parallel fan-out where you want many concurrent runs.
Drafting / writing subagents (Sonnet 4.6), balance of quality and cost when output length matters more than reasoning depth.

Architect Tip for the Exam

If the coordinator's allowed_tools omits "Task", no delegation can happen, the coordinator will silently absorb all the work into its own context. Conversely, granting subagents access to "Task" lets them spawn their own subagents (recursive delegation), powerful but harder to reason about. Default to a flat hub-and-spoke unless you genuinely need depth.

6. Scoped Cross-Role Tools

Subagents should not inherit the coordinator's whole toolbox, but "no tools" is often too strict. A synthesis subagent frequently needs to verify a date, confirm a company name, or check whether a cited page still exists. Routing every tiny check back through the coordinator adds latency and bloats the coordinator's context.

The pattern is scoped cross-role tooling: give the subagent one narrow tool for the repetitive need, and keep broad discovery tools with the coordinator or research agents.

Python (limited verification tool)

def verify_fact(claim: str, source_url: str) -> dict:
    """Return whether a simple claim is supported by one cited source."""
    # Implementation would fetch exactly source_url and check the claim.
    # It must not perform open-ended web search.
    return {"supported": True, "checked_url": source_url}

synthesizer = AgentDefinition(
    description="Synthesizes research findings and verifies simple cited facts.",
    prompt=(
        "Use verify_fact for simple date/name checks against an already cited URL. "
        "Do not use it for new research. If a claim needs open-ended investigation, "
        "return a gap for the coordinator."
    ),
    tools=["verify_fact"],
    model="sonnet",
)

6a. Tool Count Reliability

More tools are not automatically better. A subagent with 18 loosely related tools has a larger routing surface, more overlapping descriptions, and more chances to pick a plausible but wrong tool. A subagent with 4 role-specific tools is easier for the model to reason about and easier for you to test.

Python (lab task: compare tool surfaces)

too_many_tools = [
    "web_search", "web_fetch", "query_crm", "update_crm", "delete_lead", "send_email",
    "verify_fact", "summarize_pdf", "extract_table", "run_sql", "forecast_pipeline",
    "lookup_contact", "lookup_account", "calendar_check", "create_ticket",
    "read_repo", "write_file", "run_tests",
]

scoped_synthesis_tools = ["verify_fact", "read_resource", "write_report", "escalate_gap"]

Lab Exercise: Parallel Hub-and-Spoke Orchestration

Self-driven lab Module4_Self_Driven_Lab.ipynb

Objective: implement concurrent subagent execution and the findings-to-prompt context isolation pattern.

The Task tool: configure a Coordinator agent with the Task tool in its allowed_tools list.
Parallel spawning: prompt the Coordinator to spawn three research subagents, such as for three different companies, in a single turn. Observe the SDK executing these concurrently.
Context isolation: verify that a subagent cannot see the coordinator's history. Explicitly pass a Case Fact from the coordinator into the subagent's prompt.
Scoped tooling: provide a subagent with a narrow verify_fact tool. Compare its performance and latency against a subagent that must route all fact-checks back through the coordinator.