A context vault for AI agents: persistent knowledge that compounds
There's a layer of knowledge in every project that doesn't fit anywhere. It isn't documentation: too granular, too tactical. It isn't code: too abstract, too narrative. It isn't a chat log: too lossy, too unstructured. And it doesn't survive a session restart, which is exactly when you need it most.
A context vault is the layer that captures it. After eighteen months of running agents that produce more useful insights per week than I can re-derive in a year, the vault has become the most load-bearing piece of my setup. This is what it is, what it isn't, and what makes it work.
The gap between session memory and project docs
Every agent setup has two levels of memory by default. The session itself, which is everything the agent knows in this conversation. And the project docs (READMEs, specs, ARCHITECTURE.md), which are everything anyone who reads the codebase can learn.
There's a third layer in between, and most teams don't have a place for it: the things you discovered while doing the work that don't belong in the docs but shouldn't be lost. The undocumented quirk of the API. The reason the auth middleware was rewritten. The fact that the production migration uses different concurrency settings than staging. The fact that "fix the bug" in one corner of the codebase usually means "rebuild from spec because debugging takes longer."
These are the entries that, if you wrote them as documentation, would make the docs unreadable: too tactical, too contingent, too many of them. But if you don't write them at all, every fresh session re-discovers them, often by re-experiencing the original failure.
The vault is for these.
What a vault is (and what it isn't)
In its simplest form, a vault is a database of small text entries with structured metadata. Each entry has a title, a body, a kind (insight, decision, reference, event), some tags, and an ID. There's a CLI to write and search; there's an MCP server so agents can read and write directly.
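The storage layer really is that small. A minimal sketch, assuming SQLite built with FTS5 for keyword search; the database name, column set, and function names here are illustrative, not a spec:

```python
import sqlite3
import uuid

# Illustrative schema: one FTS5 table holding every entry.
# UNINDEXED columns are stored but not searchable; tags, title,
# and body are what keyword search runs against.
con = sqlite3.connect(":memory:")
con.execute(
    """CREATE VIRTUAL TABLE entries USING fts5(
           id UNINDEXED,
           kind UNINDEXED,
           tags,
           title,
           body
       )"""
)

def save(kind: str, title: str, body: str, tags: str = "") -> str:
    """Write one entry; kind is insight | decision | reference | event."""
    entry_id = uuid.uuid4().hex[:8]
    con.execute(
        "INSERT INTO entries VALUES (?, ?, ?, ?, ?)",
        (entry_id, kind, tags, title, body),
    )
    return entry_id

def search(query: str, limit: int = 5) -> list[tuple[str, str]]:
    """Keyword search over tags, title, and body, best matches first."""
    rows = con.execute(
        "SELECT id, title FROM entries WHERE entries MATCH ? "
        "ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()

save("insight", "Stripe webhooks need the raw request body",
     "Signature verification fails if any middleware parses the body first.",
     tags="stripe webhooks express")
print(search("stripe"))
```

The CLI and the MCP server are thin wrappers over exactly these two calls; everything else in this piece is about when to invoke them.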
That's all the technical surface. The work is in the discipline.
What the vault is not:
Not documentation. Documentation is the answer; the vault holds the journey. If an entry would be improved by being moved into a README, move it. Don't keep both.
Not a journal. A journal records what happened. The vault records what was learned. The distinction matters: an entry titled "spent two hours debugging the auth flow" is journal noise; an entry titled "Express 5 raw body parser breaks Stripe webhook signature verification" is vault gold.
Not a replacement for git. The fix is in the commit. The vault explains why the fix was non-obvious, what category of problem it represents, and what to look for next time. Git records the what; the vault records what made it hard.
Not a substitute for memory. The agent's session memory is still where the live conversation lives. The vault is the bridge between sessions, not the session itself.
The two-axis taxonomy
The single design choice that makes the vault scale instead of rotting is separating what kind of thing it is from how much retrieval infrastructure it earns.
The first axis is the category: knowledge, entity, or event. Knowledge is atemporal: facts, patterns, frameworks, rules. Entities are slowly-changing stateful objects: people, projects, tools, organizations. Events are time-bound: sessions, commits, incidents, meetings. Every entry belongs to exactly one.
The second axis is the curation tier: bulk, semi, or curated. Bulk means stored raw, no embedding, minimal indexing: emails, prompt history, transcripts. Semi means keyword-indexed but not embedded: meeting notes, partial findings. Curated means full retrieval infrastructure: embeddings, semantic search, optional LLM rerank.
Most one-axis vaults force a bad choice: index everything (cost explodes; signal-to-noise crashes) or be stingy on storage (lose the long-term pattern signal). The two-axis design lets you store everything you might one day want to look at while only paying retrieval costs on the slice that needs them.
The mapping in practice: knowledge is small and curated. Events are unbounded and bulk. Entities sit in the middle. As the vault grows, the curated layer stays manageable; the bulk layer is where the volume goes.
Save triggers that actually fire
The biggest failure mode of a vault isn't bad architecture; it's failing to get the writes to happen at the right moment. "I'll capture that later" is a lie. The insight is densest right when it's discovered; an hour later it's already half-decayed.
The save triggers that have proven reliable for me, in roughly priority order:
On user correction. If a human corrects an agent ("no, don't do it that way"), save immediately. Title: what got corrected. Body: what the correction was, what the agent had been doing instead, why the correction is generalizable. This is the highest-value class of save.
On non-obvious bug. If the root cause wasn't apparent from the error message, save. Future-you (or future-agent) will hit the same shape of bug and benefit from the writeup.
On undocumented behavior. If you found the answer by reading source code or trial and error rather than docs, save. The docs gap is real; your writeup is now part of the de-facto docs.
On architectural decision with tradeoffs. Decisions get made and then forgotten. The "why we picked X over Y" reasoning is the part future-you most wants to retrieve, and the part most likely to evaporate.
On pattern recognition. Twice is a coincidence; three times is a pattern. When you notice the third instance, the entry isn't about the third instance, it's about the pattern. Save the pattern; reference the instances.
The trigger I deliberately exclude: anything derivable from reading the current code or git history. The vault is for things you can't grep your way to.
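The trigger list above is concrete enough to express as a checklist an agent could evaluate at the end of a work unit. A sketch; the trigger names mirror the text, but the checklist structure and the `should_save` helper are illustrative assumptions:

```python
# Each trigger pairs a stable name with the question the agent asks itself.
SAVE_TRIGGERS = {
    "user_correction": "Did a human correct the approach?",
    "non_obvious_bug": "Was the root cause invisible in the error message?",
    "undocumented_behavior": "Did the answer come from source code or trial and error?",
    "decision_tradeoff": "Was an architectural choice made between real options?",
    "pattern_third_instance": "Is this the third sighting of the same shape?",
}

def should_save(fired: set[str]) -> bool:
    """Save if any recognized trigger fired.

    Anything derivable from code or git history is deliberately not a
    trigger: unrecognized names never qualify, so grep-able facts fall
    through to nothing.
    """
    return bool(fired & SAVE_TRIGGERS.keys())
```

Running the checklist costs one pass over five questions; the expensive part, as the section argues, is remembering to run it while the insight is still dense.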
Search triggers that surface knowledge in time
Saving without searching is a graveyard. The discipline that makes the vault load-bearing is searching before doing the kind of work the vault was supposed to inform.
The triggers I run every session:
Starting a session. Search for the project context. What did past-me already learn here?
Hitting an error. Search for the error message or root cause before debugging.
Making a decision. Search the area to see if it was already made.
Integrating with a third-party API. Search for the service. Quirks compound.
Entering an unfamiliar module. Search by the module name and adjacent concepts.
Each search takes milliseconds. Each piece of work informed by a relevant past entry saves minutes-to-hours. The asymmetry is the whole point.
What the vault doesn't replace
A vault doesn't replace the discipline of writing good code, good specs, or good docs. It supplements all three by giving the contingent stuff a place to live so the canonical artifacts can stay clean.
It also doesn't replace human judgment. The vault is retrieved knowledge: facts that were true at some point. Whether they're still true now is a check the agent (or human) has to make every time. Old entries decay. Trust but verify.
Built that way, the vault becomes the layer where individual sessions become collective intelligence. Each agent that writes is contributing to the next one's context. Each session that reads is shorter, sharper, and less prone to the kinds of mistakes that the previous session already made and learned from.
That's the compounding loop. Without it, every fresh session starts from zero. With it, every fresh session starts from everything.