Service topology

How the Hocuspocus server, file watcher, persistence pipeline, and shadow repo work together per project.

Each Open Knowledge project runs its own pair of sibling processes. There is no global daemon or project registry. Running multiple projects means running multiple process pairs on different ports, each fully independent, with its own persistence pipeline, file watcher, shadow repo, and lockfile pair (.open-knowledge/{server,ui}.lock).

Dual-process lifecycle

Production runs two sibling processes coordinated via lockfiles:

ok start — Hocuspocus collab server (/collab WebSocket + /api/* HTTP). Owns server.lock; advertises its kernel-allocated port for MCP discovery.
ok ui — React editor + GET /api/config. Owns ui.lock; defaults to port 3000 (or PORT env / --port). Serves the static bundle and reads the live collab URL from server.lock on every /api/config request.

ok start auto-spawns ok ui when ui.lock is absent or stale. ok mcp auto-spawns ok start under the same discipline. The spawn is always detached (child_process.spawn with detached: true + child.unref()) so the spawned process runs in its own process group — Claude Code's kill-on-session-end model cannot reach it through the MCP stdio.
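
The detached spawn described above can be sketched with Node's child_process module. This is a minimal illustration, not the project's actual spawn code: the helper names detachedOptions and spawnDetached are hypothetical, and stdio: "ignore" is an assumption (a detached child that inherits the parent's pipes stays tied to the parent).

```typescript
import { spawn } from "node:child_process";
import type { SpawnOptions } from "node:child_process";

// Options for a sibling that must outlive its parent.
// detached: true gives the child its own process group.
// stdio: "ignore" is an assumption: it avoids inherited pipes that would
// otherwise keep the child coupled to the parent's stdio.
function detachedOptions(cwd: string): SpawnOptions {
  return { cwd, detached: true, stdio: "ignore" };
}

// Hypothetical helper: launch a command and let the parent exit freely.
function spawnDetached(command: string, args: string[], cwd: string): number | undefined {
  const child = spawn(command, args, detachedOptions(cwd));
  child.unref(); // the parent's event loop no longer waits for the child
  return child.pid;
}
```

Without unref(), the parent would stay alive for as long as the child runs, even though the child is in a separate process group.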

Teardown: when zero WebSocket clients are connected at /collab for 30 minutes, ok start's idle-shutdown SIGTERMs the ui.lock.pid sibling, drains its own phases, and releases server.lock as the final step. ok ui carries a 12h safety-net timer (D-025) that self-terminates independently if ok start ever crashes without sending SIGTERM.
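
The idle-shutdown condition can be modeled as a small tracker that records when the client count last reached zero. This is a sketch under assumptions: the IdleTracker class is hypothetical, time is injected as a number so the logic stays testable, and treating a freshly started server as idle from its start time is a guess about the real behavior.

```typescript
const IDLE_LIMIT_MS = 30 * 60 * 1000; // 30 minutes, per the text above

// Hypothetical tracker for the "zero clients for 30 minutes" rule.
class IdleTracker {
  private clients = 0;
  private idleSince: number | null;

  constructor(now: number) {
    // Assumption: a server with no clients yet counts as idle from startup.
    this.idleSince = now;
  }

  connect(): void {
    this.clients += 1;
    this.idleSince = null; // any connected client cancels the idle window
  }

  disconnect(now: number): void {
    this.clients = Math.max(0, this.clients - 1);
    if (this.clients === 0) this.idleSince = now;
  }

  shouldShutDown(now: number): boolean {
    return this.idleSince !== null && now - this.idleSince >= IDLE_LIMIT_MS;
  }
}
```

When shouldShutDown() turns true, the sequence from the text applies: signal the sibling, drain, then release server.lock last.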

In dev (bun run dev), a single Vite process serves both surfaces — the production split does not apply.

Per-project architecture

createServer() in packages/server/src/standalone.ts takes a single contentDir and creates all components scoped to that directory.

Every box in this diagram is scoped to the project. The ContentFilter gates which files participate. The Hocuspocus Server manages Y.Doc instances. The File Watcher and Persistence Extension form a bidirectional bridge between disk and CRDT. The Shadow Repo and HEAD Watcher handle git integration.

Persistence pipeline

Three layers, each with its own cadence and responsibility.

Layer 0: Disk to CRDT (file watcher)

@parcel/watcher detects external .md file changes on disk and reconciles them into the CRDT.

ContentFilter drops excluded files (gitignored, config patterns)
Symlinks resolved to canonical paths via realpath; multiple paths to the same file share one Y.Doc
Events classified as create, update, delete, rename, or conflict
Self-writes skipped via content-hash matching
External changes reconciled into Y.Doc via three-way merge (base/ours/theirs)
Event-detection latency: 2--52 ms
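
The symlink rule in the list above (several paths, one document) comes down to keying documents by canonical path. The sketch below illustrates that idea only: DocRegistry is a hypothetical name, and a plain object stands in for a Y.Doc.

```typescript
import { mkdtempSync, realpathSync, symlinkSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical registry: one entry per canonical file, however it was reached.
class DocRegistry<T> {
  private docs = new Map<string, T>();

  getOrCreate(path: string, create: () => T): T {
    const canonical = realpathSync(path); // resolves every symlink in the path
    const existing = this.docs.get(canonical);
    if (existing !== undefined) return existing;
    const doc = create();
    this.docs.set(canonical, doc);
    return doc;
  }

  get size(): number {
    return this.docs.size;
  }
}

// Usage: a file and a symlink to it resolve to the same entry.
const dir = mkdtempSync(join(tmpdir(), "ok-registry-"));
const real = join(dir, "note.md");
const alias = join(dir, "alias.md");
writeFileSync(real, "# note\n");
symlinkSync(real, alias);

const registry = new DocRegistry<{ id: number }>();
const viaReal = registry.getOrCreate(real, () => ({ id: 1 }));
const viaAlias = registry.getOrCreate(alias, () => ({ id: 2 })); // factory is not called
```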

Layer 1: CRDT to Disk (Hocuspocus persistence)

The onStoreDocument hook fires after Y.Doc changes and flushes the document to disk.

Y.Doc serialized to markdown (frontmatter preserved via Y.Map cache)
Atomic write: temp file then rename (symlinks preserved -- writes target the canonical path)
Write hash registered in writeTracker for feedback prevention
Symlink-escape check refuses writes that resolve outside contentDir
Debounce: 2s quiet, 10s max
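
The atomic write and the symlink-escape check from the list above can be sketched together. This is an illustration, not the persistence extension itself: writeAtomic, canonicalTarget, and isInside are hypothetical names, and the temp-file naming is an assumption.

```typescript
import { existsSync, mkdtempSync, readFileSync, realpathSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { basename, dirname, isAbsolute, join, relative, sep } from "node:path";

// Canonical path of the write target. A file that does not exist yet has no
// realpath, so its parent directory is canonicalized instead.
function canonicalTarget(file: string): string {
  if (existsSync(file)) return realpathSync(file);
  return join(realpathSync(dirname(file)), basename(file));
}

// True when target lies strictly inside root (both canonical, absolute).
function isInside(root: string, target: string): boolean {
  const rel = relative(root, target);
  return rel !== "" && rel !== ".." && !rel.startsWith(".." + sep) && !isAbsolute(rel);
}

function writeAtomic(contentDir: string, file: string, markdown: string): void {
  const root = realpathSync(contentDir);
  const target = canonicalTarget(file);
  if (!isInside(root, target)) {
    throw new Error(`refusing to write outside contentDir: ${target}`);
  }
  const tmp = `${target}.${process.pid}.tmp`; // hypothetical temp naming
  writeFileSync(tmp, markdown, "utf8");
  renameSync(tmp, target); // same-directory rename replaces the target in one step
}

// Usage: write into a scratch directory and read the result back.
const scratch = mkdtempSync(join(tmpdir(), "ok-atomic-"));
writeAtomic(scratch, join(scratch, "note.md"), "# hi\n");
const written = readFileSync(join(scratch, "note.md"), "utf8");
```

Because the temp file sits next to the canonical target, the rename stays on one filesystem, and a symlink pointing at the target is left untouched.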

Layer 2: Disk to Git (shadow repo)

Scheduled after Layer 1 writes complete to capture the change in version history.

commitWip() to shadow repo
Debounce: 30s idle
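
Both debounce windows (Layer 1's 2s quiet / 10s max and Layer 2's 30s idle) reduce to one deadline rule: flush after a quiet period, but never later than a cap measured from the first unflushed change. The sketch below uses hypothetical names, and treating Layer 2 as having no cap is an assumption, since the text gives it only an idle window.

```typescript
interface DebounceWindow {
  quietMs: number; // flush after this long without a new change
  maxMs: number; // never wait longer than this after the first change
}

const LAYER_1: DebounceWindow = { quietMs: 2000, maxMs: 10000 };
// Assumption: Layer 2 has no upper cap, only the 30s idle window.
const LAYER_2: DebounceWindow = { quietMs: 30000, maxMs: Number.POSITIVE_INFINITY };

// Time at which a pending flush becomes due.
function flushDeadline(w: DebounceWindow, firstChangeAt: number, lastChangeAt: number): number {
  return Math.min(lastChangeAt + w.quietMs, firstChangeAt + w.maxMs);
}
```

For example, with Layer 1 a burst that starts at t=0 and is still producing edits at t=9500 flushes at t=10000 rather than t=11500, because the 10s cap wins over the quiet window.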

Feedback prevention

Two mechanisms prevent write loops between the file watcher and persistence. Content-hash tracking ensures the watcher skips its own persistence writes. The skipStoreHooks flag ensures external changes loaded from disk do not trigger a re-write.
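
Content-hash tracking can be sketched as a map from path to the hash of the last content the persistence layer wrote. The class below is a hypothetical stand-in for writeTracker, not its real interface, and SHA-256 is an assumption; the text does not name the hash function.

```typescript
import { createHash } from "node:crypto";

// Assumption: SHA-256 over the UTF-8 content.
function contentHash(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex");
}

// Hypothetical stand-in for writeTracker.
class WriteTracker {
  private lastWritten = new Map<string, string>();

  // Called by the persistence layer right after it writes a file.
  recordWrite(path: string, content: string): void {
    this.lastWritten.set(path, contentHash(content));
  }

  // Called by the watcher: true when the file holds exactly what we wrote.
  isSelfWrite(path: string, contentOnDisk: string): boolean {
    return this.lastWritten.get(path) === contentHash(contentOnDisk);
  }
}
```

Matching on content rather than on timestamps means an external edit that lands right after a persistence write is still detected, because its hash differs.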

On graceful shutdown, destroy() drains both layers in order -- Layer 1 first, then Layer 2 -- to guarantee all pending writes reach disk and git before the process exits. See Server Lifecycle for the full shutdown phase sequence.

Shadow repo

Located at <projectRoot>/.git/open-knowledge/. If the project has no .git/ when the server starts, ensureProjectGit runs git init --initial-branch=main automatically before the shadow repo is created, failing fast if git is not installed. The shadow repo provides auto-save history and attribution without touching the project's own git state.

It contains:

WIP refs -- per-writer auto-commit history (refs/wip/<branch>/<writer-id>)
Upstream imports -- records of git pull/merge/rebase changes
Branch parking -- on git checkout, the server parks current Y.Doc state to shadow refs and restores it on return
Rescue buffers -- dirty documents from deleted or branch-switched files

Isolation guarantee

The shadow repo never touches the project's git ref namespace or object store. It is an isolated journal for auto-save and attribution.

On startup, a HEAD-drift check compares the stored last-known-head SHA against the current project HEAD. If they differ (including the fresh-clone case of null to SHA), commitUpstreamImport() records the drift in the shadow repo so offline git operations appear in the timeline.

SyncEngine

When the project has a git remote and sync is enabled, createServer() starts a SyncEngine that keeps the local repository synchronized with origin.

The engine runs two independent timer loops:

Pull loop: fetches from origin on a configurable interval (default 30 seconds, +/-15% jitter). If behind and no conflicts, merges automatically. On merge conflict, pauses and surfaces the conflict through the editor UI.
Push loop: pushes local commits to origin on a separate interval (default 60 seconds, +/-15% jitter). Never force-pushes. On non-fast-forward rejection, triggers a pull-then-retry.

Both timers use chained setTimeout -- the next timer starts only after the current operation completes, preventing overlap.
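
A chained setTimeout loop with +/-15% jitter can be sketched as follows. The names jitteredDelay and startLoop are hypothetical, and swallowing errors inside the loop is a simplification; the real engine classifies them.

```typescript
// Delay within +/-15% of baseMs. rand returns a value in [0, 1).
function jitteredDelay(baseMs: number, rand: () => number = Math.random): number {
  const factor = 1 + (rand() * 2 - 1) * 0.15;
  return Math.round(baseMs * factor);
}

// Chained loop: the next timer is armed only after operation settles,
// so two runs can never overlap. Returns a stop function.
function startLoop(baseMs: number, operation: () => Promise<void>): () => void {
  let stopped = false;
  let timer: ReturnType<typeof setTimeout> | undefined;

  const arm = (): void => {
    timer = setTimeout(async () => {
      try {
        await operation();
      } catch {
        // Simplification: a real engine would classify the error here.
      } finally {
        if (!stopped) arm();
      }
    }, jitteredDelay(baseMs));
  };

  arm();
  return () => {
    stopped = true;
    if (timer !== undefined) clearTimeout(timer);
  };
}
```

Unlike setInterval, a slow fetch here delays the next tick instead of stacking a second fetch on top of the first.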

State persists to <contentDir>/.open-knowledge/sync-state.json so restart recovery computes the remaining wait from the last operation timestamp rather than restarting from zero. Conflict state persists to <contentDir>/.open-knowledge/conflicts.json.
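
The restart-recovery arithmetic is simple enough to state directly. The sketch assumes the persisted state yields a last-operation timestamp in epoch milliseconds; the actual sync-state.json schema is not shown here, and running immediately when no timestamp exists is also an assumption.

```typescript
// Milliseconds to wait before the next operation after a restart.
function remainingWait(intervalMs: number, lastOpAt: number | null, now: number): number {
  if (lastOpAt === null) return 0; // assumption: never ran, so run right away
  const elapsed = Math.max(0, now - lastOpAt); // tolerate a clock that moved backwards
  return Math.max(0, intervalMs - elapsed);
}
```

With a 30-second pull interval, a restart 10 seconds after the last pull waits the remaining 20 seconds; a restart after a long outage pulls immediately.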

Error classification uses a 5-class taxonomy (network, auth, semantic, structural, local). Network errors trigger counted exponential backoff (5 min after 3 failures, 15 min after 5, 60 min after 8). Auth errors pause sync and surface a re-authentication prompt. Protected-branch rejection disables sync for the project.
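
The network backoff thresholds above map to a small lookup. The function name is hypothetical, and returning zero below three failures (meaning the normal interval applies) is an assumption.

```typescript
const MINUTE_MS = 60000;

// Pause to apply after a given number of consecutive network failures.
function networkBackoffMs(failures: number): number {
  if (failures >= 8) return 60 * MINUTE_MS;
  if (failures >= 5) return 15 * MINUTE_MS;
  if (failures >= 3) return 5 * MINUTE_MS;
  return 0; // assumption: below the first threshold, no extra pause
}
```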

The engine emits state transitions on the sync-status CC1 channel (see below), which the editor's SyncStatusBadge subscribes to for real-time status updates.

Derived-view push (CC1)

A dedicated __system__ Y.Doc carries push signals for derived views (file list, backlinks, future graph panels). The server pre-materializes it at startup via openDirectConnection so events that arrive before any browser connects have a broadcast target. When the file watcher emits a create, delete, or rename, the CC1 broadcaster debounces 100ms and broadcasts a pure signal -- {v:1, ch:'files', seq} -- to every connected client via Document#broadcastStateless. Clients respond by re-fetching the channel's REST endpoint rather than decoding a per-event payload.
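
On the client side, handling a pure signal amounts to validating the envelope and triggering a re-fetch for that channel. The sketch below is illustrative: SignalRouter and parseSignal are hypothetical names, JSON encoding of the stateless payload is an assumption, and dropping signals whose seq is not newer than the last one seen is a guess about how seq is meant to be used.

```typescript
// Envelope shape from the text: {v:1, ch:'files', seq}.
interface Cc1Signal {
  v: 1;
  ch: string;
  seq: number;
}

// Assumption: the stateless payload is a JSON string.
function parseSignal(payload: string): Cc1Signal | null {
  let data: unknown;
  try {
    data = JSON.parse(payload);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const { v, ch, seq } = data as Record<string, unknown>;
  if (v !== 1 || typeof ch !== "string" || typeof seq !== "number") return null;
  return { v: 1, ch, seq };
}

// Hypothetical dispatcher: calls refetch(channel) once per new signal.
class SignalRouter {
  private lastSeq = new Map<string, number>();
  private refetch: (channel: string) => void;

  constructor(refetch: (channel: string) => void) {
    this.refetch = refetch;
  }

  handle(payload: string): boolean {
    const signal = parseSignal(payload);
    if (signal === null) return false;
    const seen = this.lastSeq.get(signal.ch);
    if (seen !== undefined && signal.seq <= seen) return false; // stale or duplicate
    this.lastSeq.set(signal.ch, signal.seq);
    this.refetch(signal.ch);
    return true;
  }
}
```

Because the signal carries no data, a dropped or coalesced signal costs nothing: the next re-fetch returns the current state either way.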

__system__ is not a content doc. Every subsystem that keys off documentName short-circuits via the isSystemDoc() helper in packages/server/src/cc1-broadcast.ts, and ContentFilter rejects user-created __system__.md at admit time.

Active channels: files (file list changes), backlinks, graph, and sync-status (SyncEngine state transitions). Each channel is a pure signal -- clients re-fetch the channel's REST endpoint on receipt.

Agent presence

The same __system__ Y.Doc carries a second concern: a map-valued agentPresence: Record<agentId, AgentPresenceEntry> field on its awareness state, published by packages/server/src/agent-presence.ts and read by the browser's presence bar. This is the canonical substrate for "who is writing right now" -- any N concurrent agents (Claude, Cursor, two Claudes in different terminals, etc.) coexist as distinct map entries.

A map-valued field is required because every Hocuspocus Document has exactly one server-side Awareness instance with one clientID; per-content-doc agent state would stomp across N concurrent agents. The three agent write handlers (handleAgentWrite, handleAgentWriteMd, handleAgentPatch in packages/server/src/api-extension.ts) call setPresence(agentId, {..., mode:'editing'}) before the transact and touchMode(agentId, 'idle') in a finally so a thrown write still flips the badge back to idle.
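
The set-then-finally pattern can be sketched with an in-memory map. The functions below only borrow the names setPresence and touchMode from the text; their bodies, the withEditingBadge wrapper, and the updatedAt field are hypothetical stand-ins, and a plain Map replaces the awareness state.

```typescript
type Mode = "editing" | "idle";

interface AgentPresenceEntry {
  mode: Mode;
  updatedAt: number; // hypothetical field
}

// Stand-in for the map-valued agentPresence awareness field.
const agentPresence = new Map<string, AgentPresenceEntry>();

function setPresence(agentId: string, entry: AgentPresenceEntry): void {
  agentPresence.set(agentId, entry);
}

function touchMode(agentId: string, mode: Mode): void {
  const current = agentPresence.get(agentId);
  if (current !== undefined) {
    agentPresence.set(agentId, { ...current, mode, updatedAt: Date.now() });
  }
}

// Hypothetical wrapper around a write handler body.
function withEditingBadge<T>(agentId: string, write: () => T): T {
  setPresence(agentId, { mode: "editing", updatedAt: Date.now() });
  try {
    return write();
  } finally {
    touchMode(agentId, "idle"); // runs even when write() throws
  }
}
```

Each agent owns its own key, so concurrent writers never overwrite each other's entry.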

Cleanup is deterministic via the MCP keepalive WebSocket. The keepalive URL carries agentId=${connectionId}; boot.ts's /collab/keepalive upgrade handler wires ws.on('close') -> clearPresence(agentId) so the badge disappears within milliseconds of the MCP process exiting. A 5-second TTL on the client side (AGENT_PRESENCE_STALE_MS) is a belt-and-suspenders fallback for ungraceful disconnects, clock skew, and proxies that eat the close frame.
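
The client-side TTL fallback can be sketched as a filter over the presence map. The constant name and its 5-second value come from the text; the lastSeenAt field and the liveAgents helper are hypothetical.

```typescript
const AGENT_PRESENCE_STALE_MS = 5000;

interface PresenceStamp {
  lastSeenAt: number; // hypothetical field: epoch ms of the last update
}

// Agent ids whose entries are still fresh enough to show in the presence bar.
function liveAgents(presence: Record<string, PresenceStamp>, now: number): string[] {
  return Object.entries(presence)
    .filter(([, entry]) => now - entry.lastSeenAt < AGENT_PRESENCE_STALE_MS)
    .map(([agentId]) => agentId);
}
```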

GET /api/metrics/agent-presence returns the current map for operator diagnostics only. The client does not poll it -- browser tabs populate the presence bar from the __system__ provider's awareness sync, which typically delivers within the WS handshake window. The endpoint is a stable surface for curl and dashboards; it is NOT a client-side cold-start fallback in this iteration.

Branch awareness

The reconciledBase (three-way merge base) is scoped by branch name: Map<branch, Map<docName, content>>. When the HEAD watcher detects a branch switch:

Current Y.Doc state is parked to shadow refs
All open documents reset from the target branch's disk content
Parked WIP from a prior visit is restored via three-way merge

This means switching branches in git and switching branches in the editor are the same operation -- the server follows HEAD.
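
The branch-scoped reconciledBase structure can be sketched with two small accessors. The helper names setBase and getBase are hypothetical; only the Map<branch, Map<docName, content>> shape comes from the text.

```typescript
// Map<branch, Map<docName, content>>, as described above.
type ReconciledBase = Map<string, Map<string, string>>;

function setBase(store: ReconciledBase, branch: string, docName: string, content: string): void {
  let docs = store.get(branch);
  if (docs === undefined) {
    docs = new Map<string, string>();
    store.set(branch, docs);
  }
  docs.set(docName, content);
}

function getBase(store: ReconciledBase, branch: string, docName: string): string | undefined {
  return store.get(branch)?.get(docName);
}
```

Scoping by branch keeps the merge base of a document on one branch from being used for the same document name on another branch after a switch.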

Dev mode vs production

Aspect	Dev (`bun run dev`)	Production (`ok start` + `ok ui`)
Process model	Single Vite process serves both collab + React bundle	Two sibling processes (`ok start` + `ok ui`) coordinated via `server.lock` + `ui.lock`
Collab port	5173 (Vite default)	Kernel-allocated, advertised in `server.lock`
UI port	5173 (same as collab)	3000 by default on `ok ui` (or `PORT` env / `--port`)
Git integration	No shadow repo, no HEAD watcher	Full shadow repo + HEAD watcher
File watcher	Simplified: create/update only	Full: create/update/delete/rename/conflict
Content filter	Basic .md filtering	Full gitignore + config patterns
Test routes	Enabled (`/api/test-reset`)	Disabled by default
Auto-shutdown	None	30-min idle-shutdown on `ok start`; 12h safety-net on `ok ui`

In dev mode, the Vite plugin (packages/app/src/server/hocuspocus-plugin.ts) embeds Hocuspocus into the Vite dev server process. A single bun run dev starts both the editor frontend and the collab server on port 5173. Git integration features are disabled to keep the dev loop fast and side-effect-free.

In production, ok start runs the collab server only (/collab + /api/*) and auto-spawns ok ui as a detached sibling that serves the React bundle + GET /api/config endpoint. The split lets MCP tools auto-spawn the collab server on demand without resurrecting a browser UI nobody asked for, and lets Claude Code's preview pane launch just the UI via .claude/launch.json (runtimeArgs: ['@inkeep/open-knowledge', 'ui'], autoPort: true).
