Module 11

Advanced Decomposition: Durability, Context Management, and Provenance

This module covers the advanced logic needed to keep long-running research sessions accurate as they fill the model's context window: managing and summarizing context, recovering from crashes with state manifests, handling subagent failures intelligently, and preserving provenance so that findings stay attributable to their sources.

Answer key Module11_Complete.ipynb

1. Managing Long-Horizon Context

As sessions fill the model's context window, they can suffer from context rot, where the model loses focus on early instructions or misses information in the middle of long inputs, the "lost in the middle" effect.

Recognize the symptoms before they compound. A degraded session starts answering from typical patterns instead of the specific classes, files, or facts it discovered earlier in the conversation: it invents a plausible-sounding method name instead of the one it read in turn 4, or gives inconsistent answers to the same question late in the session.

Apply the remedies in this order:

Trim verbose tool outputs before they enter history. A 40-field order lookup where only 5 fields matter should be reduced to those 5 fields at the tool boundary. Every irrelevant field you keep out of the transcript delays degradation.
Summarize completed phases and carry the summary forward in place of the full transcript of that phase. In Claude Code, use the /compact command when exploration output has filled the window: it replaces the conversation with a summary and continues from there.
Start a fresh session with a structured summary when prior tool results are stale or the session has already degraded. Injecting a clean, structured summary of decisions and findings into a new session is more reliable than continuing to push a degraded one forward.
Use scratchpad files. Have agents write key findings to a file (for example findings.md) as they work, and reference that file for subsequent questions. Findings written to disk survive context boundaries, summarization, and even session restarts.

Exam tip: the ordering matters. Trimming is preventive and cheap; summarization recovers space mid-session; a fresh session with an injected summary is the reset button when the session has already degraded; scratchpad files make all three safer because the facts no longer live only in the transcript.

1a. Position-Aware Input Ordering

To mitigate the "lost in the middle" effect within a single large input, place key findings summaries at the beginning of aggregated inputs, organize detailed results with explicit section headers, and repeat the most critical facts at the end.

Aggregated input layout

<key_findings>
- Max Budget: $50k
- Current Competitor: ContosoAI
- Decision Date: 2026-09-15
</key_findings>

## Research Detail: Budget
...

## Research Detail: Competitor
...

Do not bury critical facts between long raw excerpts. Put the summary first, then the detailed evidence, and restate the key facts at the end when the input is very long.

2. The Case Facts Pattern

Progressive summarization erodes precise, non-negotiable details: confirmed budget figures, meeting dates, order numbers, and the customer's stated expectations get paraphrased away one summary at a time.

Extract those critical facts into a case_facts block that is re-included in every prompt, outside any summarized history. Because it travels with each request rather than living inside the transcript, it is never subject to summarization.

Case facts block (re-sent with every prompt)

<case_facts>
- Max Budget: $50k
- Renewal Decision Date: 2026-09-15
- Order Number: ORD-33412
- CRM Account ID: acct_7781
- Stated Expectation: refund processed before renewal date
</case_facts>

Rule: case_facts is never summarized. However aggressively the conversational narrative is compressed, the hard data arrives verbatim on every turn.

3. Decomposition: Fixed Pipelines vs. Adaptive Plans

Orchestrators must choose between predictable efficiency and dynamic durability.

Prompt chaining (fixed sequential): a hard-coded sequence such as research → draft → review. It is predictable and fast but brittle. If research surfaces a deal-breaker, the system may still attempt to draft a useless report.
Adaptive decomposition (dynamic): the coordinator inspects each subagent's output and re-plans the next steps based on findings. If research finds a prospect already uses a competitor, the coordinator skips the standard pitch and spawns a competitor-positioning subagent instead.

Adaptive systems cost more coordinator turns, but they are the durable pattern for open-ended work.

4. The Subagent Failure-Report Contract

A coordinator's ability to recover is bounded by the quality of the failure report it receives from a subagent.

Subagents should return a structured contract instead of generic prose errors.

JSON (coordinator-level failure report)

{
  "status": "failed",
  "failure_type": "TRANSIENT",
  "partial_results": [
    "Found pricing page",
    "Confirmed enterprise plan exists"
  ],
  "recommended_next_step": "Retry competitor review after rate limit clears."
}

failure_type: an enum such as TRANSIENT or VALIDATION, allowing the coordinator to branch without re-reading prose.
partial_results: lets the coordinator salvage what worked instead of wasting the entire run.
recommended_next_step: the subagent's local assessment of what it learned about the failure.

Outer-loop signal: isRetryable from Module 9 is a tool-level signal. This structured report is the coordinator-level signal for orchestrating complex recoveries.

4a. Coverage Annotations in the Final Synthesis

Salvaging partial results is only half the contract. The final synthesized report must state which findings are well-supported and which topic areas have gaps because a source was unavailable or a subagent failed.

A report that silently omits a failed subtopic reads as complete when it isn't. The reader has no way to distinguish "we checked and found nothing" from "we never checked." Require the coordinator to carry each failure report forward into an explicit coverage annotation, for example: "Competitor pricing could not be verified; the pricing source timed out. Findings in this section are based on cached data from the prospect's own materials."

5. Crash Recovery with State Manifests

A long-running multi-agent workflow that keeps all of its state in conversation history loses everything when the process crashes. The fix is the same discipline as the scratchpad pattern, made structured: each agent exports its state to a known location as it works: findings so far, files analyzed, and open questions.

JSON (state manifest)

{
  "agent": "pricing-researcher",
  "updated_at": "2026-07-06T14:32:00Z",
  "findings": [
    "Enterprise tier confirmed at $48k/yr",
    "Discount requires 2-year commitment"
  ],
  "files_analyzed": ["pricing.html", "terms.pdf"],
  "open_questions": ["Does the discount apply to renewals?"]
}

On crash or restart, the coordinator loads each manifest and injects the summaries into fresh agent prompts. The new agents resume from recorded findings and open questions instead of re-running every search and re-reading every file.

Exam tip: manifests are written continuously as the agent works, not at shutdown; a crash by definition gives you no shutdown hook.

6. Provenance: Claim-Source Mappings

When subagent findings are compressed into summaries without claim-source mappings, attribution is lost and the final report can no longer say where any number came from. Require every research subagent to output structured mappings (claim, evidence excerpt, source, and publication date) and require the synthesis step to preserve and merge them rather than flattening findings into prose.

JSON (claim-source mapping)

{
  "claim": "Enterprise plan costs $48k/yr",
  "evidence_excerpt": "Enterprise: $4,000/mo billed annually",
  "source": "https://contoso.example.com/pricing",
  "source_name": "ContosoAI pricing page",
  "publication_date": "2026-05-14"
}

Conflicting statistics from credible sources: never arbitrarily pick one. Annotate the conflict with both values and their attribution, and let the coordinator decide how to reconcile them before synthesis.
Temporal data: require publication or collection dates on every claim. A 2023 figure next to a 2026 figure with dates attached reads as a time series; without dates it reads as a contradiction.
Separate established from contested: the report should distinguish well-established findings (multiple agreeing sources) from contested ones (annotated conflicts), so readers can weigh them appropriately.
Render content appropriately: financial data as tables, news and narrative context as prose, technical findings as structured lists. Format is part of faithful reporting.

Exam tip: a synthesis that quietly resolves a source conflict has destroyed information. The correct behavior is to surface both values with attribution and dates, and escalate the reconciliation decision to the coordinator.

Lab Exercise: Managing a Long-Horizon Research Session

Self-driven lab Module11_Self_Driven_Lab.ipynb

Objective: master tool-output trimming, the case facts pattern, scratchpad continuity, failure-report contracts, and provenance preservation.

Trim: intercept a verbose tool result (a 40-field order lookup) and keep only the relevant fields before it enters history. Late in a long session, compare the agent's answers with and without trimming. Observe when the untrimmed session starts answering from typical patterns instead of the actual data.
Case facts: extract amounts, dates, and IDs into a case_facts block that is included in every prompt. Verify a detail stated in turn 2 (such as Max Budget: $50k) survives verbatim to turn 30, even after the surrounding history has been summarized.
Scratchpad: have the agent maintain a findings.md file while exploring. Kill the session, resume fresh with the scratchpad injected into the new prompt, and verify the agent continues the work without re-running its earlier exploration.
Failure report: simulate a subagent timeout. Verify the coordinator receives the failure type, the attempted query, and partial_results, and that the final report carries a coverage annotation marking the affected topic area as a gap rather than silently omitting it.
Provenance: run two sources with conflicting statistics through the pipeline. Verify both values survive to the final report with attribution and publication dates, annotated as a conflict rather than arbitrarily reconciled.

Exam tip: summarization and fresh-session-with-summary preserve narrative continuity; case_facts preserves exact facts; scratchpads and manifests preserve state across crashes; failure reports with coverage annotations preserve recovery options; claim-source mappings preserve attribution.