Writing · No 04 · Architecture

Fractal context: how long-running agents stay coherent

By Felix Hellström · Stockholm · 1310 words


Most attempts to make AI agents work well over long horizons start from the same intuition: the agent needs more context. Bigger context windows. More documents. Larger embeddings. More tools to fetch more memory.

This intuition is wrong, and you can watch it fail in real time. Past a certain point, more context degrades agent quality. The phenomenon has a name now, context rot, and the published research keeps confirming what anyone operating long-running agents already knew empirically. Bigger windows don't make the agent smarter; they make the agent's attention thinner.

The frame that does work, I've come to believe, is fractal: context organized not as a flat pool to be expanded but as nested spheres that compose at different scales. This isn't a metaphor I'm forcing onto the problem. The pattern emerged from running fleets of agents and watching what worked.

The problem fractal context solves

The naive design has every agent loading the same context: project docs, the current task spec, the recent conversation. As the project grows, the context grows. The agent has more to attend to and produces worse output for any particular question, because each token now competes with thousands of others for the model's finite attention budget.

The naive fix is selective retrieval: only load the documents most relevant to the current question. This works partially but introduces a different failure: the agent loses the situational awareness that made it useful in the first place. It answers the local question well and the global question terribly. It can't tell when the local answer contradicts the global plan.

Fractal context is the resolution. The agent doesn't have one context; it has several, organized hierarchically, and each sphere knows its scope. The local sphere has the tactical detail. The middle sphere has the project frame. The outer sphere has the strategic intent. The agent moves between spheres, drilling in for tactical work, zooming out for strategic decisions, without ever loading the union of all three.

Why "more context" is the wrong frame

The deeper claim under fractal context is that the right amount of context for any particular decision is small. A senior engineer doesn't keep the entire codebase in their head when refactoring a function. They hold the function's contract, its callers, and the local invariants. They zoom out when something forces them to (a refactor that crosses the contract, a bug whose origin is upstream) and zoom back in once the global question is answered.

LLMs work the same way, but worse: their attention budget is tighter than ours, and their ability to ignore irrelevant context is weaker. Loading the entire project into every prompt is the equivalent of asking that senior engineer to think about the function while continuously reading the rest of the codebase aloud. Performance drops.

The bigger-window arms race obscures this. The implicit promise is that with enough capacity, the agent will figure out what to attend to. The empirical reality is that capacity outruns attention. The model has the room; it doesn't have the focus.

The three spheres

In my own setup, the structure that has worked is three nested spheres:

Local sphere: the task. The current spec, the immediate file scope, the most recent few turns of conversation. Small. Dense. Highest fidelity. This is where actual work happens. Most decisions never leave this sphere.

Project sphere: the workspace. The active arc of the project, the relevant adjacent specs, the catalogued decisions of the project's history. Loaded only when a task crosses its bounds, when a question implicates more than the local scope. The agent zooms out to this sphere consciously, not by default.

System sphere: the canon. The cross-project rules, the architectural principles, the strategic intent of the broader system. Loaded only when the project sphere isn't enough, when the question is "should this project even exist in this form?" or "what would the broader system want here?" The agent rarely needs this sphere; when it does, the question is foundational.

Each sphere has its own vocabulary, its own pacing, its own kind of artifact. Local sphere artifacts are commits and patches. Project sphere artifacts are specs and arc updates. System sphere artifacts are rules and canon docs. The fractal isn't just nested context; it's nested kinds of work.
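To make the structure concrete, here is a minimal sketch of how the three spheres might be represented in a Python agent harness. The names (Scope, Sphere) and fields are mine, not from any particular framework; the point is that each sphere is a separate, self-contained unit of context, never a slice of one big pool.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Scope(IntEnum):
    """Nesting order: lower values are smaller, inner spheres."""
    LOCAL = 0    # the task: spec, immediate file scope, recent turns
    PROJECT = 1  # the workspace: arc, adjacent specs, decision history
    SYSTEM = 2   # the canon: cross-project rules, strategic intent


@dataclass
class Sphere:
    scope: Scope
    documents: list[str] = field(default_factory=list)  # context loaded at this scope
    artifacts: list[str] = field(default_factory=list)  # what work at this scope produces

    def render(self) -> str:
        """Flatten this sphere, and only this sphere, into prompt text."""
        return "\n\n".join(self.documents)


# One agent, three spheres; a prompt is built from one sphere at a time, never the union.
spheres = {s: Sphere(scope=s) for s in Scope}
```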

How the spheres compose

The hard part of any nested system is the composition rule: when does the agent move between spheres, and how does it carry information across the boundary?

The pattern that has held up:

Default to the smallest sphere. Every task starts in the local sphere. The agent zooms out only when the local sphere is genuinely insufficient: when a decision needs information it doesn't have, when a constraint surfaces that's larger than the task scope, when an inconsistency between scopes appears.

Zoom out for resolution, not exploration. Going to the project sphere is a deliberate move triggered by a specific need. It's not "let me load everything just in case." That habit is the failure mode. The trigger has to be a concrete question the local sphere can't answer.

Carry decisions back, not context. When the agent zooms out and gets an answer, it carries the decision back to the local sphere, not the entire context that produced the decision. The local sphere stays small. The project sphere is consulted, not absorbed.

Surface friction across boundaries. When a local task discovers something that should change the project sphere (a missing rule, a wrong spec section, a stale assumption), it doesn't fix it from the local sphere. It surfaces it. The project sphere is changed by project-sphere-level work, not by local edits creeping upward.

This last rule is what keeps the system honest. Without it, the local sphere bleeds into the project sphere over time, and you end up with the same flat-context failure mode wearing fractal clothes.
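Here is a sketch of the composition rule, building on the Scope and Sphere types above. The ask_model callable stands in for whatever LLM call the harness makes, and is assumed to return None when the loaded context can't resolve the question; that stand-in is my assumption, not a real API. The escalation logic is the point: start local, zoom out only when forced, and carry the decision back down rather than the context that produced it.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    question: str
    answer: str | None  # what gets carried back down; None if even the canon couldn't resolve it
    scope: Scope        # the sphere that resolved it


def resolve(question: str, spheres: dict[Scope, Sphere],
            ask_model, scope: Scope = Scope.LOCAL) -> Decision:
    """Answer a question at the smallest sufficient scope."""
    prompt = f"{spheres[scope].render()}\n\nQuestion: {question}"
    answer = ask_model(prompt)
    if answer is not None or scope == Scope.SYSTEM:
        return Decision(question, answer, scope)
    # Zoom out one level for resolution, not exploration: the trigger is a
    # concrete question the current sphere could not answer.
    return resolve(question, spheres, ask_model, Scope(scope + 1))


# Carry decisions back, not context: only the answer re-enters the local sphere.
# decision = resolve("Does this change break the public contract?", spheres, ask_model)
# spheres[Scope.LOCAL].documents.append(f"Decided: {decision.answer}")
```

The "surface friction" rule doesn't appear in the sketch because it isn't a return value: a local task that discovers a project-sphere problem records it as an artifact for project-sphere work to pick up, rather than editing upward.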

When the pattern doesn't apply

The fractal frame is right for long-running, high-stakes, multi-scope work. It's overkill for short, bounded, single-scope tasks.

If you're prompting an agent to fix one bug in one file with no architectural implications, the fractal model is overhead. Load the relevant code, load the bug report, do the fix. There's no project sphere to consult; the local sphere is the entire problem.

The pattern earns its weight when the work runs across days, projects, or domains. When the agent has to remember not just what it's doing now but what the broader system is for. When questions can change scope mid-task, when "fix this small thing" turns out to be "this small thing is a symptom of a wrong premise three layers up." That's where flat context breaks and fractal context holds.

Practical implications for builders

Three concrete moves that fall out of taking fractal context seriously:

Don't put everything in CLAUDE.md. The temptation is to load the entire system into every session via the project's instructions file. Resist. CLAUDE.md belongs to the project sphere; loading it into every session means every local task is paying the project sphere tax. Keep CLAUDE.md focused on what the agent needs to know to start work, not everything the agent might want to know about the project.

Make the boundaries visible. When a task crosses spheres, write it down. The decision to zoom out is more important than the answer it produces. Logging the boundary crossings lets future sessions (and future-you) see when the system is working and when it isn't.

Build retrieval around scope, not similarity. Vector similarity search retrieves what's most similar to the query. Fractal context wants what's most relevant at the right scope. The retrieval system has to know which sphere it's serving, not just which document is closest. This is harder to build than vanilla RAG and worth it for any agent that runs longer than a single task.
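As a sketch of the difference, assuming chunks have already been embedded and tagged at indexing time with the sphere they belong to (the scope tag and helper names are illustrative, reusing the Scope type from earlier): vanilla retrieval ranks everything by similarity, while scope-aware retrieval filters to the sphere being served first and only then ranks.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec: list[float], chunks: list[tuple[list[float], str, Scope]],
             serving_scope: Scope, k: int = 5) -> list[str]:
    """Scope first, similarity second: serve only the sphere the agent is in."""
    in_scope = [c for c in chunks if c[2] == serving_scope]
    in_scope.sort(key=lambda c: cosine_similarity(query_vec, c[0]), reverse=True)
    return [text for _, text, _ in in_scope[:k]]
```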

The compounding insight under all three: the agent's context isn't a pool, it's a hierarchy. Designing the hierarchy is the work. The window size is a constraint, not a strategy.
