← Writing · No 02 · Operations

When to restart a long-running Claude Code session

By Felix Hellström · Stockholm · 1240 words

When to restart a long-running Claude Code session

Most posts about long-running Claude Code sessions assume the question is how long can I keep this going. After running production agents for the last year, I've come to think the better question is when should I kill this and start fresh. The two questions look similar from the outside. Operationally they couldn't be more different.

A session that has been running for six hours but is healthy and idle is preserving warm context that compounds across the next round of work. A session that has been running for thirty minutes but is retrying the same operation three times is producing degraded output and burning your token budget on its own confusion. The first you keep. The second you kill. Wall clock time doesn't separate them.

Here's the framework I use.

The signal isn't wall clock time

The instinct most people start with is a duration heuristic: "every two hours, restart." This is wrong on both ends.

It's wrong on the long end because idle sessions are free. A long-running manager that has been quiet for an hour isn't burning anything. The conversation tokens are paid for; the model isn't running until you prompt it. Killing it throws away the warm cache of workspace context, file layouts, in-flight specs, partial reasoning about the current arc, that makes the next dispatch fast. Restarting "because it's been a while" is a tax with no benefit.

It's wrong on the short end because a fresh session can drift fast. Three rounds of confused tool use can do more damage to output quality than three days of idle. The signal is what the session is producing, not how long it has been running.

The discipline is to watch for the actual signals of degradation, not to schedule maintenance against the clock.

Four real restart triggers

These are the four states that mean kill and start fresh:

1. Token budget is getting tight. When the conversation is approaching the model's context limit, every new turn is paying a cache miss penalty as old context gets shuffled out. Output quality drops first; cost spikes second. The threshold I use is roughly 70% of the context budget, at that point, summarize the work-in-progress to a markdown file, kill the session, and start a new one with that file as the boot context.

2. Same error three or more times. A session retrying the same failing command is in a loop. The fix is rarely "try again with a tweak." Save the error to your knowledge store, kill the session, and start fresh with the failing state already documented as input. The fresh session sees the failure as data; the drifted session sees it as a personal grudge.

3. The agent is contradicting its own instructions. This one takes practice to spot. The session was given a clear scope (e.g., "only edit files in src/components/") and is now reaching for files outside that scope, or producing output that disagrees with its own earlier reasoning. This is context drift: the agent's model of the task has degraded relative to the instructions. Fresh boot wins; debugging drift loses.

4. You as the operator are the one looping. If you find yourself re-explaining the same constraint, undoing the same kind of mistake, or correcting the same drift across three turns, the agent's effective context has degraded enough that you've become the babysitter. Restart. The fresh session inherits a clean slate; you inherit your time back.

Note what's not on the list: the session has been running a long time, the session has done a lot of work, the session looks idle. Those are not triggers. They are background facts that may or may not coincide with the four signals above.

Why fresh boot beats debugging drift

The math is short.

Restarting a Claude Code session takes about 30 seconds: kill, redeploy, hand it the boot doc. Debugging a drifted session takes 30 minutes if you're lucky, several hours if you aren't, and produces a session that's still drifted at the end of it.

The cost ratio is between 60-to-1 and 600-to-1 in favor of restart. There is essentially no scenario where debugging drift is the right move once the signals are clear.

The only reason this isn't more obvious is that restart feels like throwing away work. It doesn't, if the work is captured properly. Which brings us to the actual hard part of the pattern: the handoff.

The handoff document as the unit of continuity

If the session is the ephemeral thing, the handoff is what's permanent. A good handoff doc lets a fresh session pick up where the dead one left off without re-deriving the context from scratch. A bad handoff means restart is, in fact, throwing away work, and the operator learns to avoid restarting, which compounds the drift.

The shape that works:

## Session summary
- Rounds completed: N
- Key commits: <list>

## In-flight state
- What's running, what's blocked, what's waiting on a decision

## Next actions
- The concrete next steps the fresh session should take

## Context the fresh session needs
- Workarounds discovered, gotchas encountered, decisions made that aren't in the specs yet

The four sections matter. The summary anchors the fresh session in what already happened so it doesn't repeat. The in-flight state surfaces anything the dead session was holding in working memory. The next actions reduce the cold-start tax. The context section is where the surprise lives, discoveries the dead session made that aren't yet written into permanent specs.

I write the handoff as the last action of any session that's about to be restarted, before the kill. Treating the handoff as a checkpoint rather than a wrap-up changes its quality: it captures live state instead of post-hoc summary.

Anti-patterns

Three patterns I've watched people fall into and had to unlearn myself:

Restart on a timer. "Every two hours" or "after 50 turns" sounds disciplined. It throws away healthy idle sessions and waits too long on actively drifting ones. The signals above are the discipline.

Don't restart at all, just patch around it. "It's drifted but I can work around it." You can, for an hour. Then you've absorbed the drift into your own workflow and the next session inherits your patches as the new baseline. Restart has a half-life; not-restarting has compound interest.

Restart without writing the handoff. This is the one that makes restarts feel expensive. A restart with a good handoff is a checkpoint. A restart without one is a memory wipe. They are not the same operation, even though they look identical from the outside.

The pattern, compressed: don't restart on the clock, restart on the signal. Treat the handoff as the unit of work that survives. Fresh boot is cheap; debugging drift is not. Once the workflow internalizes that, long-running sessions become a tool rather than a liability.

Next up