Technology Apr 17, 2026 · 9 min read

Claude Code forgot my architecture 3 times last week. I fixed it with one SQLite file.

DEV Community
by thestack_ai

Claude Code forgot my entire architecture decision log three times last week. After the third time, I stopped cursing at it and shipped a tool that does what it won't. Here's the whole thing.

TL;DR: Waypath 0.1.1 is a local-first CLI and MCP server that gives coding agents (Claude Code, Codex, Cursor, Aider) persistent memory through a single SQLite file at ~/.waypath/waypath.db. Four independent kernels sit behind a thin facade — truth (structured facts), archive (FTS5 + reciprocal rank fusion), ontology (graph traversal), and promotion (human review gate). TypeScript, Node 22, MIT, 131 passing tests, zero cloud dependencies. npm install -g waypath.

## The problem: coding agents have no memory between sessions

Coding agents are brilliant within a session and completely amnesiac between them. In 18 months of building AI tooling and coding-agent infrastructure, I watched every workaround I tried eventually break:

  • CLAUDE.md files balloon to 3,000+ lines, then get silently truncated by context-window compaction. The agent loses the bottom third without warning.
  • Cloud-hosted "memory" services send your proprietary code to a third party and bill per token — $20–$50/month in practice. Hard no.
  • Vector databases handle similarity search fine but can't answer "what did I decide about auth two weeks ago?" without serious glue code.
  • Prompt stuffing works until it doesn't — then it poisons the context window with stale decisions.

What I actually wanted was simple. A local place to store facts, decisions, and session transcripts. Keyword search that works on code-heavy text. A graph layer for entity relationships. And a gate so the agent can't silently pollute memory with hallucinated "decisions."

That's Waypath.

## Architecture: four independent kernels over one SQLite file

Waypath stores everything in a single SQLite file — no separate vector DB, no remote service, no background daemon. On top of that file sit four independent kernels:

| Kernel | Responsibility | Backed by |
| --- | --- | --- |
| truth | Structured facts, decisions, preferences | SQLite tables with JSON1 |
| archive | Session transcripts and long-form notes | FTS5 + RRF hybrid retrieval |
| ontology | Entity graph: people, projects, tools | Recursive CTE traversal |
| promotion | Human-in-the-loop review gate | Status tables + review CLI |

Each kernel is a standalone TypeScript module with its own test suite. A thin facade in src/core/facade.ts exposes the public API. The kernels don't import each other. Promotion drives cross-kernel state via explicit events, not direct calls.

This is intentionally crude. I can rewrite the archive kernel tomorrow without touching truth or ontology. For a 0.1.x tool, "boring and swappable" beats clever every time.
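The "kernels don't import each other, promotion drives state via events" arrangement can be sketched roughly as follows. This is a minimal illustration, not Waypath's code: the class names, the event shape, and the `reject` / `isRetrievable` methods are all hypothetical.

```typescript
// Hypothetical sketch: kernel names follow the article, but every type,
// event name, and method here is an assumption, not Waypath's real API.
type PromotionEvent =
  | { type: "promoted"; itemId: string }
  | { type: "rejected"; itemId: string };

type Listener = (event: PromotionEvent) => void;

// The bus is the only object the kernels share.
class EventBus {
  private listeners: Listener[] = [];

  subscribe(listener: Listener): void {
    this.listeners.push(listener);
  }

  publish(event: PromotionEvent): void {
    for (const listener of this.listeners) listener(event);
  }
}

// Archive kernel: reacts to events, never imports the promotion kernel.
class ArchiveKernel {
  private hidden = new Set<string>();

  constructor(bus: EventBus) {
    bus.subscribe((event) => {
      if (event.type === "rejected") this.hidden.add(event.itemId);
    });
  }

  isRetrievable(itemId: string): boolean {
    return !this.hidden.has(itemId);
  }
}

// Promotion kernel: publishes decisions instead of calling other kernels.
class PromotionKernel {
  private bus: EventBus;

  constructor(bus: EventBus) {
    this.bus = bus;
  }

  reject(itemId: string): void {
    this.bus.publish({ type: "rejected", itemId });
  }
}

// Thin facade: wires the kernels to one bus and routes public calls.
class Facade {
  private archive: ArchiveKernel;
  private promotion: PromotionKernel;

  constructor() {
    const bus = new EventBus();
    this.archive = new ArchiveKernel(bus);
    this.promotion = new PromotionKernel(bus);
  }

  reject(itemId: string): void {
    this.promotion.reject(itemId);
  }

  isRetrievable(itemId: string): boolean {
    return this.archive.isRetrievable(itemId);
  }
}
```

Because the archive only sees events, swapping its implementation means re-subscribing a new class to the same bus; the promotion kernel does not change.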

## Archive kernel: FTS5 + reciprocal rank fusion, no GPU required

Waypath hits sub-40ms hybrid retrieval with no embeddings, no GPU, no network calls — just SQLite FTS5 plus recency-weighted reciprocal rank fusion (RRF).

The archive kernel stores session transcripts and paged-in documents. Retrieval needs two things: keyword search that actually works on code (function names, error messages, stack traces), and semantic-ish relevance without shipping a local embedding model.

FTS5 handles the keyword side. For ranking, BM25 scores combine with a recency prior and fuse via RRF:

```typescript
// src/archive/retrieval.ts (simplified)
const rrf = (rank: number, k = 60) => 1 / (k + rank);

const fused = new Map<string, number>();
for (const [i, row] of bm25Rows.entries()) {
  fused.set(row.id, (fused.get(row.id) ?? 0) + rrf(i + 1));
}
for (const [i, row] of recencyRows.entries()) {
  fused.set(row.id, (fused.get(row.id) ?? 0) + rrf(i + 1));
}
```

On an M2 MacBook Air with a 120MB archive (~4,000 session entries), `recall` p95 stays under 40ms. Less sophisticated than a dense retriever, sure. **It's also free, deterministic, and ships in a sub-1MB install with zero native compile steps.** That tradeoff felt right for a daily-driver tool.
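To see how the fusion orders results, here is a self-contained version of the simplified snippet above with made-up input. The `fuse` helper, the `Row` shape, and the sample IDs are illustrative assumptions; only the `rrf` formula and `k = 60` come from the snippet.

```typescript
interface Row {
  id: string;
}

// Reciprocal rank fusion term for a 1-based rank; k = 60 as in the snippet above.
const rrf = (rank: number, k = 60): number => 1 / (k + rank);

// Sums the RRF term of every ranking an id appears in, then sorts by score, highest first.
function fuse(rankings: Row[][]): Array<[string, number]> {
  const fused = new Map<string, number>();
  for (const rows of rankings) {
    rows.forEach((row, i) => {
      fused.set(row.id, (fused.get(row.id) ?? 0) + rrf(i + 1));
    });
  }
  return Array.from(fused.entries()).sort((a, b) => b[1] - a[1]);
}

// Hypothetical result lists: one ordered by BM25, one ordered by recency.
const bm25Rows: Row[] = [{ id: "auth-decision" }, { id: "db-choice" }, { id: "old-note" }];
const recencyRows: Row[] = [{ id: "db-choice" }, { id: "new-note" }, { id: "auth-decision" }];

const ranked = fuse([bm25Rows, recencyRows]);
console.log(ranked.map(([id]) => id));
// → [ 'db-choice', 'auth-decision', 'new-note', 'old-note' ]
```

`db-choice` wins because it ranks near the top of both lists (1/62 + 1/61), while `auth-decision` is first by BM25 but only third by recency (1/61 + 1/63). Items that appear in a single list trail behind.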

## Ontology kernel: a knowledge graph with no Neo4j

**The ontology kernel builds a lightweight entity graph directly on SQLite — two tables and a recursive CTE, no graph database required.**

Once facts accumulate, "what do I know about entity X?" becomes a useful query. The ontology kernel walks `entities` and `edges` with a depth-capped recursive CTE:


```sql
WITH RECURSIVE related(id, depth) AS (
  SELECT target_id, 1 FROM edges WHERE source_id = ?
  UNION ALL
  SELECT e.target_id, r.depth + 1
  FROM edges e JOIN related r ON e.source_id = r.id
  WHERE r.depth < 3
)
SELECT DISTINCT e.*, related.depth FROM entities e
JOIN related ON e.id = related.id
ORDER BY related.depth;
```

For a personal knowledge graph of a few thousand nodes, SQLite is plenty. The depth cap (default 3) keeps queries bounded and stops runaway traversal on densely connected graphs.
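For readers who think in code rather than CTEs, the depth-capped walk can be mirrored in plain TypeScript over an in-memory edge list. This is an illustrative sketch, not the kernel's implementation: the `Edge` shape and `related` function are hypothetical, and unlike the SQL above it keeps only the shallowest depth at which each entity is found.

```typescript
interface Edge {
  source: string;
  target: string;
}

// Breadth-first walk from `start`, recording the smallest depth at which each
// entity is reached and stopping at maxDepth (3 mirrors the cap in the query).
function related(edges: Edge[], start: string, maxDepth = 3): Map<string, number> {
  const adjacency = new Map<string, string[]>();
  for (const edge of edges) {
    const targets = adjacency.get(edge.source) ?? [];
    targets.push(edge.target);
    adjacency.set(edge.source, targets);
  }

  const depthOf = new Map<string, number>();
  let frontier = [start];
  for (let depth = 1; depth <= maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const id of frontier) {
      for (const target of adjacency.get(id) ?? []) {
        if (!depthOf.has(target)) {
          depthOf.set(target, depth);
          next.push(target);
        }
      }
    }
    frontier = next;
  }
  return depthOf;
}

// Hypothetical graph: a -> b -> c -> d -> e, plus a shortcut a -> c.
const sampleEdges: Edge[] = [
  { source: "a", target: "b" },
  { source: "b", target: "c" },
  { source: "c", target: "d" },
  { source: "d", target: "e" },
  { source: "a", target: "c" },
];
console.log(related(sampleEdges, "a"));
```

Tracking visited entities is what keeps this version cheap on dense graphs. The SQL relies on the depth cap alone to bound the work.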

## Promotion kernel: the review gate that stops memory drift

The single most valuable feature of Waypath is a human review gate that stops agents from silently writing hallucinated decisions into permanent memory.

Every item in Waypath has a status: draft, promoted, or rejected. Agents write freely to draft, but a draft is never surfaced or cited as promoted memory until a human explicitly approves it. The review CLI makes this a 30-second daily habit:

$ waypath review
[1/7] draft/decision: "Use Postgres 16 for the event store"
      source: session 2026-04-12T14:03
      proposed by: claude-code
      (p)romote / (r)eject / (e)dit / (s)kip:

**Nothing becomes "knowledge" without an explicit human decision.** The friction is annoying for about two days — then it becomes the feature. Memory stops drifting because you watch what enters.

The governance rules are deliberately simple:

- Agents propose via `waypath promote <id>`; **nothing auto-promotes**.
- Rejected entries stay in the archive but are excluded from retrieval.
- A whole source can be flagged `untrusted` via `waypath source-status`, which soft-deletes everything indexed from it.

I considered trust-scored auto-promotion. Then I watched an agent confidently "remember" a decision I never made. **The gate stays.**
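The gate boils down to a small state machine. The sketch below is one way to express it, under stated assumptions: the `Item` shape, the `Actor` type, and the function names are hypothetical, and it assumes drafts are hidden from retrieval until promoted.

```typescript
type Status = "draft" | "promoted" | "rejected";
type Actor = "agent" | "human";

interface Item {
  id: string;
  text: string;
  status: Status;
}

// Anyone, including an agent, may propose; a proposal always starts as a draft.
function propose(store: Map<string, Item>, id: string, text: string): Item {
  const item: Item = { id, text, status: "draft" };
  store.set(id, item);
  return item;
}

// Only a human may move a draft to promoted or rejected.
function review(
  store: Map<string, Item>,
  id: string,
  decision: "promoted" | "rejected",
  actor: Actor,
): Item {
  if (actor !== "human") throw new Error("only a human reviewer can promote or reject");
  const item = store.get(id);
  if (!item) throw new Error(`unknown item: ${id}`);
  if (item.status !== "draft") throw new Error(`item ${id} was already reviewed`);
  item.status = decision;
  return item;
}

// Retrieval surfaces promoted items only; rejected rows stay stored but hidden.
function recallable(store: Map<string, Item>): Item[] {
  return Array.from(store.values()).filter((item) => item.status === "promoted");
}
```

The important property is that no code path reachable by the agent ends in `promoted`. A trust-scored variant would have to add one, which is exactly the path the gate is designed to rule out.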

## MCP server: six tools, any MCP-compatible agent

**Waypath 0.1.1 ships a native MCP server that exposes six tools and works with any MCP-compatible coding agent** — Claude Code, Codex, Cursor, or Aider with MCP enabled.

0.1.1 ships three integration paths:

1. **Native MCP server binary** (`waypath-mcp-server`) exposing six tools: `recall`, `page`, `promote`, `review`, `graph-query`, `source-status`.
2. **Claude Code host shim** — pre-baked config and slash commands.
3. **Codex host shim** — equivalent for Codex.

Claude Code config:


```json
{
  "mcpServers": {
    "waypath": {
      "command": "waypath-mcp-server",
      "args": ["--db", "~/.waypath/waypath.db"]
    }
  }
}
```

The tool surface is intentionally narrow. `recall` runs hybrid retrieval. `page` loads a full document by ID. `graph-query` walks the ontology up to depth 3. `promote` and `review` flow through the governance gate: the agent can propose but cannot approve. `source-status` lets you quarantine a whole source if one session went off the rails.
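One detail in the config above is worth a note: the `--db` argument uses `~`, and an MCP client that spawns the command without a shell will pass that string through literally. Whether `waypath-mcp-server` expands it itself is not stated here. If you are writing a similar server, a minimal sketch of the expansion looks like this (the `expandHome` helper is hypothetical; in Node you would pass `os.homedir()` as `home`):

```typescript
// Expands a leading "~" or "~/" to the given home directory.
// Other forms, such as "~otheruser/", are returned unchanged.
function expandHome(p: string, home: string): string {
  const base = home.replace(/\/+$/, ""); // drop any trailing slash on home
  if (p === "~") return base;
  if (p.startsWith("~/")) return base + "/" + p.slice(2);
  return p;
}

console.log(expandHome("~/.waypath/waypath.db", "/home/dev"));
// → /home/dev/.waypath/waypath.db
```

If the server does not handle this, an absolute path in `args` sidesteps the question entirely.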

## Stack and technical choices

  • TypeScript 5.7, strict mode, zero `any`.
  • Node 22 target — native SQLite runtime where available; better-sqlite3 as fallback for older Node.
  • Vitest, 131 passing tests across all four kernels.
  • MIT license, single-binary npm distribution, no native compile step unless you opt into better-sqlite3.
  • Zero cloud dependencies — Waypath never dials home.

## What I already regret in 0.1.1

Three design decisions I'm fixing in 0.2:

  1. The facade is too thin. I wanted "just a router" and ended up duplicating Zod schemas across three kernels. 0.2 lifts the shared schemas one layer up.
  2. Archive IDs are sequential integers. Fine for tests; terrible for merging two Waypath databases across machines. Switching to ULIDs in 0.2.
  3. No WAL checkpoint tuning. Under a write-heavy session the -wal sidecar grows past 200MB before a checkpoint completes and the file can be reset. Explicit PRAGMA wal_autocheckpoint tuning lands in the next patch.
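For context on the second item: a ULID is 26 Crockford base32 characters, 10 encoding a 48-bit millisecond timestamp and 16 encoding 80 random bits. IDs generated on different machines therefore sort by creation time and are very unlikely to collide. The sketch below shows the layout only; it is not the 0.2 implementation, it uses `Math.random` (not cryptographically strong), and it does not guarantee ordering within a single millisecond.

```typescript
const CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ";

// Encodes a millisecond timestamp as 10 Crockford base32 characters (48 bits).
function encodeTime(ms: number): string {
  let out = "";
  let value = ms;
  for (let i = 0; i < 10; i++) {
    out = CROCKFORD.charAt(value % 32) + out;
    value = Math.floor(value / 32);
  }
  return out;
}

// Produces 16 random Crockford base32 characters (80 bits).
function encodeRandom(random: () => number): string {
  let out = "";
  for (let i = 0; i < 16; i++) {
    out += CROCKFORD.charAt(Math.floor(random() * 32));
  }
  return out;
}

// Timestamp prefix first, so plain string comparison orders IDs by creation time.
function ulid(now: number = Date.now(), random: () => number = Math.random): string {
  return encodeTime(now) + encodeRandom(random);
}

console.log(ulid());
```

Sequential integers from two databases both start at 1, which is why merging them is painful; a time-plus-randomness ID avoids that without a coordinating service.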

## Performance and cost

Waypath on my M2 MacBook Air, archive of ~4,000 session entries (~120MB on disk):

| Operation | p50 | p95 |
| --- | --- | --- |
| recall (hybrid retrieval) | ~18ms | ~39ms |
| page (document fetch by ID) | ~2ms | ~6ms |
| graph-query (depth 3) | ~24ms | ~71ms |
| promote (commit + reindex) | ~11ms | ~28ms |
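If you want to reproduce numbers like these on your own archive, p50 and p95 can be computed from raw latency samples with the nearest-rank method. The article does not say which percentile definition was used, so treat this as one reasonable choice; the sample values below are made up, not measurements.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// Hypothetical latencies in milliseconds for ten calls.
const latenciesMs = [12, 18, 15, 39, 20, 17, 22, 16, 19, 25];
console.log(percentile(latenciesMs, 50), percentile(latenciesMs, 95));
// → 18 39
```

With only ten samples, p95 is simply the slowest call, so collect a few hundred runs before reading much into the tail.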

How it compares to what I was using before:

| | Waypath 0.1.1 | Cloud memory services | Vector DB only |
| --- | --- | --- | --- |
| Monthly cost | $0 | $20–$50+ | $0–$20 |
| Data leaves machine | No | Yes | Sometimes |
| Works offline | Yes | No | Yes |
| Hybrid retrieval | Yes (BM25 + recency RRF) | Varies | Semantic only |
| Human review gate | Yes | Rarely | No |
| Install | npm i -g waypath | Signup flow | Docker compose |

FTS5 scales comfortably into multi-gigabyte territory — hybrid retrieval stays in the tens of milliseconds well into large archives, based on my testing.

## FAQ

Q: Does Waypath need an embedding model or GPU?
No. 0.1.1 uses SQLite FTS5 and recency-weighted reciprocal rank fusion for all retrieval. Runs on any machine that runs Node 22. Optional local embeddings may land later; the embedding-free default stays.

Q: Can I use Waypath with agents other than Claude Code and Codex?
Yes. waypath-mcp-server speaks the Model Context Protocol, so any MCP-compatible client — Cursor, Aider with MCP, custom agents — connects by pointing its config at the binary. The Claude Code and Codex shims are conveniences, not requirements.

Q: What happens when my SQLite file gets huge?
FTS5 scales comfortably into multi-gigabyte territory. When chunks from a source become irrelevant, `waypath source-status --archive <source>` soft-deletes everything from that source, so retrieval stays clean without modifying the underlying rows.

Q: Why a review gate instead of auto-promotion with trust scores?
Because agents hallucinate with confidence. A "confidence 0.94" score on a fabricated decision pollutes your memory permanently. An explicit review step costs roughly 30 seconds a day and eliminates an entire class of failure. I'll reconsider if someone ships a convincing auto-trust model; until then, the gate stays.

Q: Is my data encrypted at rest?
Not by default. SQLCipher, an encrypted build of SQLite, works as a near drop-in replacement, and Waypath 0.2 will ship an opt-in encryption flag. For 0.1.1, protect ~/.waypath/waypath.db at the filesystem level (FileVault on macOS, LUKS or dm-crypt on Linux).

## Get started in five minutes

  1. Install: npm install -g waypath
  2. Initialize: waypath init
  3. Wire into Claude Code: paste the MCP server block above into your config and restart.
  4. Code normally. At end of session, run waypath review to promote what mattered.
  5. Next session, ask your agent: recall decisions about <topic>.

Install, init, and MCP wiring — under five minutes, total. If you hit a snag, open an issue and I'll look at it same day.

## One question before you go

What's your current approach to persistent context for coding agents? I'm especially curious about the promote-review governance model — is 30 seconds of daily friction worth the hallucination protection, or would you rather have trust-scored auto-promotion with a kill switch?

Drop your answer in the comments. I'm actively shaping 0.2 based on real usage, and this specific tradeoff is still live. If you want to be notified when 0.2 ships with ULIDs and proper WAL handling, follow me here on dev.to — I post updates as the governance model evolves.

Already know you want this? Star the repo so you don't lose it.

Repo: https://github.com/TheStack-ai/waypath

npm: https://www.npmjs.com/package/waypath

I've been building AI tooling and coding-agent infrastructure for 18 months. In that time I've shipped production MCP servers, broken more CLAUDE.md files than I care to admit, and watched agents hallucinate decisions with alarming confidence. Waypath is the tool I actually needed. Issues and PRs welcome — I read everything.


This article was originally published by DEV Community and written by thestack_ai.
