Handoff continuity

Subtitle: A technical whitepaper on rich operational handoffs, compaction recovery, and continuity across agent state boundaries

Status: Technical whitepaper draft
Date: 2026-06-21
Author: Clint Bodungen
Project context: MindStone Agent, MindStone-Agent, MS4CC, MS4PI
Companion to: Layered Continuity Architecture for Persistent LLM Agents

Executive summary

Long-running LLM agents eventually cross state boundaries. A context window may compact. A session may end. A role may transfer. A different harness may resume the work. A future model state may need to continue from a point where the previous one had far more live context than the next one will receive.

Most systems handle this with summarization. They compress prior context into a shorter narrative and continue. Summaries are useful, but they are not enough for persistent operational continuity. A summary usually says what happened. A handoff says what the next agent state needs in order to act responsibly.

MindStone uses rich handoffs as operational continuity prompts.

A handoff is not the transcript. It is not durable memory. It is not identity. It is not a checkpoint. It is a navigation aid over preserved history, written for the next agent state at a discontinuity boundary.

A good handoff captures:

current objective;
open threads;
files and projects touched;
decisions made;
active role state;
immediate next actions;
risks, constraints, and non-obvious context;
verified commands and evidence;
explicit non-claims and untested assumptions;
anything the next agent would regret losing.

The core principle:

A summary compresses the past.
A handoff prepares the future.

In MindStone, handoff is the bridge between live context and the next working state. It is especially important when compaction is unavoidable, but it also applies to role transfers, session endings, agent-to-agent coordination, and recovery from interruption.

1. Problem statement

LLM agents do not carry internal state across arbitrary context boundaries. If a model’s live prompt is compacted, truncated, replaced, or restarted, the future model state only knows what the harness supplies.

That creates a continuity hazard.

Before the boundary, the agent may have rich local context:

recent turns;
unresolved tradeoffs;
exact file paths;
partial conclusions;
failed attempts;
subtle user preferences;
constraints discovered through tools;
active role obligations;
warnings about what has not been tested;
local repo status;
private or public context boundaries;
task rhythm and priorities.

After the boundary, much of that texture can disappear.

If the only bridge is a generic summary, the next agent state may resume with a flattened view of the work. It may know the broad story while losing the operational details that prevent mistakes.

MindStone handoff exists to preserve those details in a form the next agent can use immediately.

2. What a handoff is

A handoff is a rich continuity prompt for a future agent state.

It is written at or before a transition boundary:

before compaction;
at a role transfer;
at session end;
before an agent handover;
before a risky interruption;
when context pressure makes loss likely.

A handoff is not just a shorter version of the transcript. It is a task-oriented operational briefing.

It answers:

If I were the next agent state, what would I need to know so I do not waste time, repeat mistakes, overclaim, or lose the thread?

A useful handoff should include enough detail for the next agent to resume work without re-discovering everything, but it should not pretend to replace source records.

In MindStone terms:

The transcript is authoritative.
The handoff is navigational.

3. What a handoff is not

3.1 Not the transcript

The transcript is the full operational record. A handoff is a curated bridge.

If the handoff conflicts with the transcript, source files, or verified local evidence, the source record wins.

3.2 Not durable memory

A handoff may contain durable facts, but it is not the memory store.

Durable memory should be written through checkpoint or consolidation flows, with source review and approval where appropriate.

A handoff can point to memories or identify memories that should be written, but it should not silently become permanent memory by itself.

3.3 Not identity

A handoff can say what role or posture is active, but it is not the agent’s identity file.

Identity belongs in standing context such as IDENTITY.md, USER.md, role files, or equivalent governed artifacts.

3.4 Not a generic summary

A generic summary compresses what happened.

A handoff is operational. It preserves:

current objective;
next actions;
open loops;
constraints;
evidence;
non-claims;
risks;
role state;
“do not lose this” context.

3.5 Not proof of completion

A handoff may describe completed work, but it should be precise about verification.

It should distinguish:

done
built
tested
verified live
pushed
not tested
inferred
pending
blocked

This distinction is central to MindStone claim discipline.
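
One way to keep these states from blurring together is to attach one to every claim in the handoff. The following is a minimal sketch, not MindStone's implementation: the names ClaimState, Claim, and EVIDENCE_BACKED are hypothetical, and the choice of which states count as evidence-backed is an assumption made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class ClaimState(Enum):
    """The verification states listed above."""
    DONE = "done"
    BUILT = "built"
    TESTED = "tested"
    VERIFIED_LIVE = "verified live"
    PUSHED = "pushed"
    NOT_TESTED = "not tested"
    INFERRED = "inferred"
    PENDING = "pending"
    BLOCKED = "blocked"


# Assumption: only these states imply that direct evidence should exist.
EVIDENCE_BACKED = {ClaimState.TESTED, ClaimState.VERIFIED_LIVE, ClaimState.PUSHED}


@dataclass(frozen=True)
class Claim:
    text: str
    state: ClaimState
    evidence: str = ""  # e.g. a command, commit hash, or transcript pointer

    def render(self) -> str:
        """Format one handoff line with its state made explicit."""
        suffix = f" (evidence: {self.evidence})" if self.evidence else ""
        return f"[{self.state.value}] {self.text}{suffix}"

    def needs_reverification(self) -> bool:
        """True when the state is not evidence-backed or no evidence is cited."""
        return self.state not in EVIDENCE_BACKED or not self.evidence
```

A claim rendered as "[built] docs CSS fix" then cannot be mistaken for a live-verified one, and the next agent can filter on needs_reverification() before repeating any claim.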

4. Why summaries are not enough

Summarization is optimized for compression. Handoff is optimized for resumption.

A compaction summary may say:

The website docs rendering issue was fixed by changing CSS.

A useful handoff says:

The production docs rendering bug was not just a logo issue. Root cause was Starlight loading marketing global.css, which imports Tailwind. Local dev masked the production CSS order problem. Final fix was to use tokens.css + docs.css in Starlight customCss and keep global.css only in MarketingLayout. Verify deployed docs HTML does not reference /_astro/global.*.css. Do not claim fixed from local build alone.

The difference is operational value.

A summary preserves the headline. A handoff preserves the failure mode, the correction, the verification rule, and the future guardrail.

Common summary failure modes:

omits why a decision was made;
loses uncertainty;
hides what was not tested;
erases failed attempts;
drops exact paths and commands;
compresses open threads into completed-sounding prose;
collapses role obligations;
loses user corrections;
gives stale context false authority.

A handoff should be designed specifically to resist those failures.

5. Handoff in Layered Continuity Architecture

Handoff is a boundary mechanism in Layered Continuity Architecture.

It sits between live context management and future context reconstruction:

current live context
+ transcript history
+ structured memory
+ open task state
+ verification state
        ↓
rich handoff
        ↓
post-boundary agent state
        ↓
resume with continuity

It works with other layers:

Transcript provides the authoritative history.
Structured memory preserves durable facts and lessons.
Recall brings relevant sources back later.
Sliding window reduces the need for emergency compaction.
Compaction may force a boundary where handoff is critical.
Checkpoint governs durable memory/log updates.
Consolidation cycle integrates experience after the boundary.

Handoff is not a replacement for any of those layers. It is the operational bridge that helps the next prompt assemble them correctly.

6. Handoff compared with compaction summarization

Compaction summarization and handoff are often confused because both produce shorter text near a context boundary.

They have different purposes.

6.1 Compaction summarization

Compaction summarization is usually generated by the harness or model to reduce context size.

Typical flow:

large live context
→ generated summary
→ summary replaces older context
→ session continues

Useful properties:

automatic or semi-automatic;
reduces prompt size;
enables continuation under context limits;
often good enough for simple tasks.

Risks:

lossy;
generic;
may omit task-critical details;
may flatten uncertainty;
may not preserve exact evidence;
may not know what the future agent state will regret losing;
can become false authority if treated as history.

6.2 Rich handoff

A rich handoff is intentionally written as an operational resumption artifact.

Typical flow:

current objective + open state + evidence + constraints
→ rich handoff
→ compaction/restart/transfer
→ handoff replayed into future context
→ future agent resumes responsibly

Useful properties:

task-oriented;
source-aware;
preserves open loops;
records verification state;
distinguishes done from untested;
names risks and constraints;
includes immediate next actions;
can be reviewed or approved.

Risks:

can become stale;
may be too long;
may omit the recent tail if not mechanically refreshed;
may include sensitive details if policy is poor;
may be over-trusted if source authority is not clear.

6.3 Practical comparison

Capability	Compaction summary	Rich handoff
Reduces context	Yes	Yes, indirectly
Main purpose	Compression	Resumption
Authorship	Often harness/model-generated	Agent-authored or policy-authored
Captures next actions	Sometimes	Required
Captures open risks	Usually weak	Required
Captures verification state	Usually weak	Required
Replaces transcript	Must not, but often acts that way	No
Durable memory	No	No
Best fit	Harness compaction mechanics	Continuity across boundary

The short distinction:

Compaction asks: what can we compress?
Handoff asks: what must the next agent not lose?

7. Recommended handoff structure

MindStone handoffs should be predictable. A consistent structure helps future agents find what matters quickly.

Recommended sections:

## Current objective

## Current state summary

## Projects and files touched recently

## Decisions made

## Open threads

## Active role state

## Immediate next actions

## Things post-compaction Slate would regret losing

## RECENT TAIL (since rich handoff)

The names can vary by substrate, but the functions should remain.
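
Because the structure is meant to be predictable, it can be checked mechanically before a handoff is written. The sketch below is illustrative only: REQUIRED_HEADINGS copies the section names listed above, and missing_headings is a hypothetical helper. A substrate that renames sections would pass its own list.

```python
import re

# Section names copied from the recommended structure above.
REQUIRED_HEADINGS = [
    "Current objective",
    "Current state summary",
    "Projects and files touched recently",
    "Decisions made",
    "Open threads",
    "Active role state",
    "Immediate next actions",
    "Things post-compaction Slate would regret losing",
    "RECENT TAIL (since rich handoff)",
]


def missing_headings(body: str, required=REQUIRED_HEADINGS) -> list[str]:
    """Return the required headings that never appear as a '## <name>' line."""
    found = {
        match.group(1).strip()
        for match in re.finditer(r"^##[ \t]+(.+)$", body, re.MULTILINE)
    }
    return [name for name in required if name not in found]
```

An empty result means every function named in the template is present. It says nothing about whether the content under each heading is useful.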

7.1 Current objective

What is the agent trying to accomplish right now?

This should be short and action-oriented.

Bad:

We discussed the website and MindStone-Agent.

Better:

Resume MindStone-Agent MVP validation after website docs/video fixes; next likely task is live Pi-session prompt/stream validation.

7.2 Current state summary

What has already been done, and what is the current verified state?

This section should distinguish:

implemented;
built;
tested;
live verified;
pushed;
checkpointed;
still pending.

7.3 Projects and files touched

List exact paths and repos.

Examples:

/Users/clint/Projects/MindStone-Website/astro.config.mjs
/Users/clint/Projects/MindStone-Agent/README.md
/Users/clint/.pi/agent/mindstone/orchestrator/transcripts/.handoff.md

This prevents the next agent from wasting context rediscovering where the work happened.

7.4 Decisions made

Capture decisions separately from events.

Events say what happened. Decisions say what should constrain future work.

Example:

Starlight docs must not import marketing global.css/Tailwind. Use docs.css only.

7.5 Open threads

List unfinished work and drift.

This is where a handoff should resist the common summary failure of making everything sound complete.

7.6 Active role state

If a role is adopted, name it. If no role is active, say so.

Role state matters because standards and expected artifacts may differ.

7.7 Immediate next actions

Provide concrete commands or file paths when useful.

Example:

MINDSTONE_PI_SESSION_LIVE=1 \
MINDSTONE_PI_SESSION_LIVE_MODEL='openai-codex/openai-codex/gpt-5.4-mini' \
  npm run smoke:pi-session-live

7.8 Things the next agent would regret losing

This is the most important section.

It should capture the non-obvious details most likely to prevent future mistakes:

root causes;
verification rules;
untracked files not to commit;
user corrections;
exact caveats;
sensitive boundaries;
false leads;
known bad assumptions;
live validation gaps.

7.9 Recent tail

A rich handoff can become stale between the moment it is written and the moment compaction actually fires.

MindStone solves this with a mechanically managed recent-tail section:

## RECENT TAIL (since rich handoff)

The rich body is curated. The recent tail can be refreshed by a hook from the live transcript shortly before compaction.

This combines:

human/agent-authored continuity structure
+ mechanical last-minute transcript tail

The result is more robust than either one alone.
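
The mechanical half of that combination can be sketched as a pure text operation. This is a hedged illustration, not the MindStone hook: refresh_recent_tail is a hypothetical name, and the sketch assumes the recent-tail section is the last section of the handoff, as in the recommended structure.

```python
TAIL_HEADING = "## RECENT TAIL (since rich handoff)"


def refresh_recent_tail(handoff: str, tail: str) -> str:
    """Replace everything under the tail heading; leave the curated body untouched.

    Assumes the tail section is the final section of the handoff.
    """
    body, heading, _old_tail = handoff.partition(TAIL_HEADING)
    if not heading:
        # Heading missing: append the section rather than dropping the tail.
        body = handoff.rstrip("\n") + "\n\n"
    return f"{body}{TAIL_HEADING}\n\n{tail.strip()}\n"
```

The function only ever rewrites text after the heading, so a hook can run it repeatedly without touching the agent-authored sections.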

8. Runtime sequence in MS4PI

MS4PI uses Pi’s native compaction lifecycle plus MindStone handoff discipline.

Default policy:

checkpoint warning: 85%
compaction target: 92%
keep recent tokens: 20000
emergency auto handoff: false
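
As a rough illustration of how these defaults could drive behavior, the sketch below maps context utilization to an action. CompactionPolicy and boundary_action are hypothetical names, and reading "compaction target" as the utilization at which compaction runs is an assumption, not a statement about Pi's lifecycle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompactionPolicy:
    """Defaults mirror the policy values listed above."""
    checkpoint_warning: float = 0.85
    compaction_target: float = 0.92  # assumed: utilization at which compaction runs
    keep_recent_tokens: int = 20_000
    emergency_auto_handoff: bool = False


def boundary_action(
    used_tokens: int,
    window_tokens: int,
    policy: CompactionPolicy = CompactionPolicy(),
) -> str:
    """Return 'continue', 'warn', or 'compact' for the current utilization."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    utilization = used_tokens / window_tokens
    if utilization >= policy.compaction_target:
        return "compact"
    if utilization >= policy.checkpoint_warning:
        return "warn"
    return "continue"
```

The gap between the two thresholds is the time the agent has to draft, get approval for, and write the handoff.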

High-level sequence:

1. Monitor context usage.
2. At warning threshold, prompt Slate to draft checkpoint + rich handoff.
3. Slate asks Clint for approval before writing durable memory/log/handoff content.
4. Approved LOG and memory updates are written.
5. Approved .handoff.md is written.
6. Transcript archive/backfill/status verification runs.
7. Before native compaction, a hook archives transcript and refreshes RECENT TAIL.
8. Native Pi compaction runs.
9. After compaction, the handoff is replayed once as critical continuity context.
10. Recall backfill/maintenance can run after the boundary.

Important policy point:

MS4PI does not write rich checkpoint/handoff content without approval by default.

The system can mechanically refresh the recent tail and archive transcript at compaction boundaries, but durable memory and rich handoff content remain governed unless emergency auto-write semantics are explicitly enabled and tested.

9. Runtime sequence in MS4CC

MS4CC had to account for a hard substrate fact:

The agent cannot trigger /compact itself in Claude Code.

Compaction is human-initiated or harness-auto-initiated. Therefore the continuity design cannot depend on the agent firing compaction at exactly the right moment.

The MS4CC pattern became:

1. Configure a warning threshold below compaction.
2. At warning threshold, write a rich handoff and perform checkpoint judgment.
3. Let human or harness compaction happen later.
4. Use PreCompact to archive transcript and refresh RECENT TAIL mechanically.
5. On post-compact session start, replay .handoff.md as critical continuity context.
6. Run deferred embedding/backfill after compaction rather than during the danger zone.

Key design insight:

Continuity rides on handoff + replay, not on who triggers compaction.

That insight generalizes beyond Claude Code. In any substrate where the agent cannot control the boundary, the handoff must be fresh enough and replayable regardless of who or what initiates the transition.

10. Runtime sequence in MindStone-Agent

MindStone-Agent supports both:

sliding_window
auto_compact

Sliding window is the preferred primary mode when MindStone-Agent owns prompt assembly.

Handoff remains important for auto_compact mode and for any substrate boundary that requires summarization or restart.

Current MindStone-Agent behavior includes:

auto_compact_warning transcript events;
auto_compact_required transcript events;
threshold metadata including utilization and reserve-token mapping;
optional emergency handoff writing when policy enables it;
current handoff status reporting in status/doctor surfaces;
one-shot handoff replay by handoff hash/session;
handoff_replayed transcript event;
post_compact_maintenance scaffold event;
explicit substrate compaction coordination result.

A current handoff lives at:

transcripts/.handoff.md

It is current-only and may be overwritten at the next boundary. Durable continuity belongs in:

LOG.md
structured memory
journals
transcripts
indexed sources

The handoff is replayed ephemerally. It should not be automatically indexed or promoted to durable memory.

11. Handoff lifecycle

A robust handoff lifecycle has four phases.

11.1 Draft

The agent drafts a rich handoff from the current work state.

Inputs:

current transcript;
active task;
project files touched;
repo status;
checkpoint state;
memory and LOG state;
user corrections;
verification results;
open threads.

11.2 Approval or policy gate

Depending on substrate and sensitivity, the handoff may require user approval before writing.

MindStone’s default stance is conservative:

Do not write durable or rich continuity artifacts without approval unless policy explicitly allows emergency behavior.

11.3 Write and refresh

The rich handoff is written to the current handoff path.

Before compaction, a hook may mechanically refresh the recent tail from the live transcript so the handoff is not stale.

This separates two jobs:

rich handoff body = curated operational continuity
recent tail = mechanical last-minute freshness

11.4 Replay

After compaction or restart, the handoff is injected once into the next agent state as critical continuity context.

Replay should be tracked by hash/session so the same handoff is not repeatedly injected forever.

The replay should be transcript-visible as an event, but not treated as durable memory by itself.
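
One-shot replay keyed by hash and session can be sketched as follows. ReplayLedger is a hypothetical, in-memory stand-in. A real system would presumably persist the seen set and write the event to the transcript rather than a list.

```python
import hashlib


class ReplayLedger:
    """Tracks which (session, handoff hash) pairs have already been replayed."""

    def __init__(self) -> None:
        self._seen: set[tuple[str, str]] = set()
        self.events: list[dict] = []  # stand-in for transcript events

    def replay_once(self, session_id: str, handoff_text: str) -> bool:
        """Return True the first time this handoff is replayed in this session."""
        digest = hashlib.sha256(handoff_text.encode("utf-8")).hexdigest()
        key = (session_id, digest)
        if key in self._seen:
            return False
        self._seen.add(key)
        # Visible as an event, but explicitly not durable memory.
        self.events.append(
            {"event": "handoff_replayed", "sha256": digest, "durable": False}
        )
        return True
```

Keying on the content hash means an edited handoff is replayed again, while an unchanged one is not re-injected into the same session.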

12. Source authority and handoff trust

Handoffs are useful but fallible.

A handoff should be trusted as a navigation aid, not as final source authority.

Recommended trust order:

current verified local evidence
explicit current user instruction
authoritative transcript/source files
approved structured memory
LOG/checkpoint records
handoff
compaction summary
inference

This order is situational, but the principle is stable:

Handoff helps you find and understand the record.
It is not the record.

If the handoff says a build passed, but no build output is available, the next agent should not overclaim. It should say the handoff reports that a build passed, then verify if the claim matters.
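
The trust order can be applied mechanically when two sources disagree. The sketch below is illustrative: resolve and phrase are hypothetical helpers, and the fixed ranking ignores the situational judgment the text calls for.

```python
# Most trusted first, copied from the recommended trust order above.
TRUST_ORDER = [
    "current verified local evidence",
    "explicit current user instruction",
    "authoritative transcript/source files",
    "approved structured memory",
    "LOG/checkpoint records",
    "handoff",
    "compaction summary",
    "inference",
]
RANK = {source: index for index, source in enumerate(TRUST_ORDER)}


def resolve(claims: list[tuple[str, str]]) -> tuple[str, str]:
    """Given (source, statement) pairs, return the pair from the most trusted source."""
    return min(claims, key=lambda claim: RANK[claim[0]])


def phrase(source: str, statement: str) -> str:
    """Attribute statements from the handoff or weaker sources instead of asserting them."""
    if RANK[source] >= RANK["handoff"]:
        return f"The {source} reports: {statement} (not independently verified)"
    return statement
```

For example, if the handoff says the build passed but a fresh local run fails, resolve returns the local result, and phrase keeps the handoff's version clearly attributed.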

13. Safety and privacy

Handoffs can contain concentrated sensitive context. They may mention files, credential boundaries, private user context, internal workplace context, operational risks, and uncommitted local state.

A safe handoff system should consider:

where the handoff is stored;
who or what can read it;
whether it will be replayed into public channels;
whether sensitive details need redaction;
whether private context should be marked “do not disclose”;
whether source paths reveal sensitive information;
whether handoff replay should differ by channel.

Important rule:

A handoff may be appropriate for private continuity and inappropriate for public disclosure.

In MindStone, handoff replay is an internal continuity mechanism. It should not be posted to shared channels or external systems unless explicitly authorized.

14. Failure modes and safeguards

14.1 Stale handoff

The handoff was accurate when written but misses recent work before compaction.

Safeguard:

Refresh RECENT TAIL mechanically before compaction.

14.2 Generic summary instead of operational handoff

The artifact says what happened but not what to do next.

Safeguard:

Require sections for the objective, open threads, next actions, risks, and what the next agent would regret losing.

14.3 Handoff treated as durable memory

The handoff gets indexed or promoted as if it were a governed memory.

Safeguard:

Replay ephemerally. Durable memory goes through checkpoint/consolidation.

14.4 Handoff treated as authoritative history

The future agent trusts the handoff over transcript or local evidence.

Safeguard:

Prompt rule: the handoff is navigation; transcript and source evidence win.

14.5 Overlong handoff

The handoff becomes so large that it consumes too much post-compact context.

Safeguards:

use headings;
prioritize active objective and next actions;
move durable facts to memory;
keep exact evidence pointers rather than full dumps;
rely on recent tail for only the latest delta.

14.6 Sensitive context leakage

Private details are replayed into the wrong context.

Safeguards:

private storage;
channel-aware replay;
sensitivity markers;
redaction policy;
explicit user authorization for external sharing.
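
These safeguards can be combined into a single gate in front of replay. The following is a sketch under stated assumptions: the "[do not disclose]" marker and the text_for_channel helper are hypothetical, and a line-level filter is only the simplest possible redaction policy.

```python
from typing import Optional

PRIVATE_MARKER = "[do not disclose]"  # hypothetical sensitivity marker


def text_for_channel(
    handoff: str,
    channel_is_private: bool,
    authorized: bool = False,
) -> Optional[str]:
    """Return the text that may be replayed into a channel, or None if replay is blocked."""
    if channel_is_private:
        return handoff  # internal continuity: replay in full
    if not authorized:
        return None  # shared or external channel without explicit authorization
    # Authorized external sharing still drops lines marked as private.
    kept = [line for line in handoff.splitlines() if PRIVATE_MARKER not in line]
    return "\n".join(kept)
```

Blocking by default and redacting even when authorized keeps the failure mode conservative: a misconfigured channel gets nothing rather than everything.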

14.7 False completion claims

The handoff says or implies something was done when it was only attempted.

Safeguard:

Separate done, tested, verified, pushed, and pending.

14.8 Boundary race

Compaction fires before the agent writes a handoff.

Safeguards:

warning threshold below compaction threshold;
mechanical PreCompact tail refresh;
current handoff maintained before danger zone;
deferred embed/backfill after compaction;
explicit threshold calibration where possible.

15. Operational guidance

Write a handoff when

compaction is likely;
the current task is complex;
many files or decisions are in flight;
another role/agent will resume;
the session may end before work finishes;
recent context contains important caveats;
live validation status matters;
the next agent would otherwise rediscover too much.

Do not write a handoff when

there is no meaningful state to preserve;
it would duplicate an existing fresh handoff without changes;
the content is too sensitive for the replay context;
the user has not approved a required write;
a checkpoint/memory update is actually the right artifact.

Keep it operational

Use concrete details:

paths
commands
verification results
commit hashes
known warnings
untracked files
model/auth status
thresholds
approval state

Avoid vague prose that sounds polished but does not help the next agent act.

Mark uncertainty

A good handoff should say:

verified
not verified
inferred
reported by user
observed locally
live deployed check passed
not tested
blocked

This is one of the main differences between handoff and generic summary.

16. Evaluation checklist

A handoff system should be evaluated by whether the next agent state can resume responsibly.

Completeness

Does it state the current objective?
Does it include open threads?
Does it name files/projects touched?
Does it capture decisions separately from events?
Does it include immediate next actions?

Accuracy

Are verification claims backed by evidence?
Are untested items clearly marked?
Are paths and commands exact?
Are stale claims avoided?

Resumption quality

Can a fresh agent continue without asking basic rediscovery questions?
Does it avoid repeating known failed approaches?
Does it preserve user corrections and constraints?
Does it identify what not to commit, disclose, or overclaim?

Boundary robustness

Does PreCompact or equivalent refresh the recent tail?
Is handoff replay one-shot or bounded?
Is replay visible in transcript events?
Does post-boundary maintenance run or at least get scaffolded?

Source discipline

Does it distinguish handoff claims from authoritative sources?
Does it point to transcripts, memory files, repos, or commands where needed?
Does it avoid becoming ungoverned durable memory?

Safety

Is sensitive content protected?
Is external sharing controlled?
Is channel-specific replay considered?

17. Reference implementation sketch

A handoff lifecycle can be implemented as follows:

function maybePrepareHandoff(contextUsage, thresholds):
    if contextUsage < thresholds.warning:
        return

    injectDirective(
        "Draft rich handoff and checkpoint bundle. Ask for approval before writing."
    )

After approval:

function writeApprovedHandoff(body):
    assert body includes required headings
    write transcripts/.handoff.md
    record handoff status: path, bytes, sha256, token estimate
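
A runnable approximation of the write step, offered as a sketch rather than the actual implementation: write_approved_handoff and its parameters are hypothetical, the heading check is a plain substring test, and the token estimate uses an assumed heuristic of roughly four characters per token.

```python
import hashlib
from pathlib import Path


def write_approved_handoff(body: str, path: Path, required_headings: list[str]) -> dict:
    """Write an approved handoff and return its status record."""
    missing = [name for name in required_headings if f"## {name}" not in body]
    if missing:
        raise ValueError(f"handoff missing required headings: {missing}")

    data = body.encode("utf-8")
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)  # current-only: overwrites any previous handoff

    return {
        "path": str(path),
        "bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
        "token_estimate": len(body) // 4,  # assumption: ~4 characters per token
    }
```

The returned sha256 is the same value a replay step can later use to decide whether this exact handoff has already been injected.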

Before compaction:

function beforeCompact():
    archiveLiveTranscript()
    tail = extractRecentTailSinceRichHandoff()
    replaceSection("## RECENT TAIL (since rich handoff)", tail)

After compaction or restart:

function beforeNextAgentStart(session):
    handoff = read transcripts/.handoff.md
    if handoff exists and not replayedHash(session, handoff.sha256):
        injectCriticalContext(handoff.text)
        recordTranscriptEvent("handoff_replayed", {
            sha256: handoff.sha256,
            durable: false
        })

Post-boundary maintenance:

function afterHandoffReplay():
    recordTranscriptEvent("post_compact_maintenance", {
        archive: status,
        backfill: status,
        dreamCycle: status
    })

The details vary by substrate, but the design invariants remain:

write before boundary
refresh at boundary
replay after boundary
preserve transcript separately
never treat handoff as durable memory by default

18. Conclusion

Handoff is MindStone’s answer to a specific continuity problem: what must survive when the live agent state crosses a boundary?

A compaction summary compresses old context. A handoff prepares the next state to act.

That difference is central. Persistent agents do not only need to know what happened. They need to know what matters now, what remains unresolved, what was verified, what was not tested, what constraints govern the next action, and what the next agent would regret losing.

MindStone handoffs work because they are embedded in a larger continuity architecture:

append-only transcript
+ structured memory
+ recall
+ sliding window or compaction policy
+ checkpoint/consolidation
+ rich handoff/replay
= recoverable operational continuity

The handoff is not the source of truth. It is the bridge that helps the future agent find and use the source of truth without losing the thread.

Short version

A handoff is a rich operational continuity prompt for the next agent state.

It is not a generic summary.

It works like this:

capture current objective
name open threads
list files/projects touched
record decisions and verification state
identify immediate next actions
preserve risks and details the next agent would regret losing
refresh recent tail at compaction
replay once after the boundary

Key distinction:

A summary compresses the past.
A handoff prepares the future.

In MindStone, handoff is the bridge across compaction, restart, role transfer, and other discontinuity boundaries, while transcript, memory, recall, and consolidation remain the durable continuity substrate.