Open Knowledge

Architecture

Foundation architecture validated through the init spike -- TipTap, Hocuspocus, Yjs, CodeMirror, and git auto-persistence.

Init spike architecture

This page describes the architecture as validated during the init spike (March 2026). The codebase has evolved since -- shadow repo, ContentFilter, branch parking, and page CRUD APIs have been added. For the current production topology, see Service Topology.

The architecture was validated through a structured spike with seven targeted validations (V1--V7). Six passed, one failed (V7 -- Yjs v14 delta protocol), which confirmed the expected fallback path.

System overview

Browser (Vite)                        Server (embedded in Vite)
+-----------------------+             +-------------------------+
| TipTap v3 Editor      |  WebSocket  | Hocuspocus              |
| + y-prosemirror       | <=========> | + DirectConnection API  |
| + Collaboration ext   |   /collab   | + Persistence extension |
+-----------------------+             +-------------------------+
| CodeMirror 6          |                      |
| (source toggle)       |              onStoreDocument hook
+-----------------------+                      |
                                    +----------v----------+
                                    | Layer 1: CRDT->disk |
                                    | (2-10s debounce)    |
                                    +----------+----------+
                                               |
                                    +----------v----------+
                                    | Layer 2: disk->git  |
                                    | (30s debounce)      |
                                    | WIP refs, plumbing  |
                                    +---------------------+

Editor layer

TipTap v3 with ProseMirror provides the WYSIWYG editing surface. Key extensions:

  • Collaboration (@tiptap/extension-collaboration) -- binds Y.Doc from Hocuspocus provider to the editor via y-prosemirror
  • Frontmatter -- regex strip before parse, re-prepend after serialize (~25 LOC)
  • Image (@tiptap/extension-image) -- built-in markdown support in TipTap v3
  • Task lists (TaskList + TaskItem) -- native markdown round-trip in v3
  • JsxComponent -- custom void node extension for embedding React components as fenced code blocks with a jsx-component info string
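
The frontmatter handling listed above (strip before parse, re-prepend after serialize) can be sketched as a pair of helpers. This is a minimal illustration rather than the project's extension: the names `splitFrontmatter` and `joinFrontmatter` and the exact regex are assumptions.

```typescript
// Hypothetical helpers; the names and the regex are illustrative assumptions.
// Matches a leading block delimited by `---` lines at the very start of the file.
const FRONTMATTER_RE = /^---\r?\n[\s\S]*?\r?\n---\r?\n?/;

interface SplitResult {
  frontmatter: string; // raw block including delimiters, or "" if absent
  body: string; // markdown that is handed to the parser
}

// Strip the frontmatter block before the markdown reaches the parser.
function splitFrontmatter(source: string): SplitResult {
  const match = FRONTMATTER_RE.exec(source);
  if (!match) return { frontmatter: "", body: source };
  return { frontmatter: match[0], body: source.slice(match[0].length) };
}

// Re-prepend the untouched block after serialization.
function joinFrontmatter(frontmatter: string, body: string): string {
  return frontmatter + body;
}
```

Because the block is carried as a raw string and never parsed, its bytes are unchanged by the round-trip in this sketch.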

Markdown round-trip fidelity: zero semantic loss after ~80 LOC of fixes. Convergence confirmed (cycle 2 byte-identical to cycle 1).
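
Convergence in this sense means the first round-trip may normalize the text, but a second round-trip changes nothing further. A minimal sketch of that check, assuming a hypothetical `roundTrip` function that stands in for parse followed by serialize:

```typescript
type RoundTrip = (markdown: string) => string;

// Returns true when the output of cycle 2 is identical to the output of cycle 1.
function convergesAfterOneCycle(roundTrip: RoundTrip, input: string): boolean {
  const cycle1 = roundTrip(input);
  const cycle2 = roundTrip(cycle1);
  return cycle1 === cycle2;
}

// Toy stand-in for parse + serialize (an assumption for illustration):
// rewrites `*` bullets to `-` and trims trailing whitespace on each line.
const toyRoundTrip: RoundTrip = (md) =>
  md
    .split("\n")
    .map((line) => line.replace(/^\* /, "- ").replace(/\s+$/, ""))
    .join("\n");
```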

CRDT layer

Yjs v13 with y-prosemirror provides conflict-free concurrent editing. The Yjs v14 unified delta protocol was tested (V7) but is not yet viable -- the ecosystem pins to v13.

Key constraint: the source toggle uses updateYFragment() (diff-based), never prosemirrorJSONToYDoc(), which would destroy collaboration state.

Collab server

Hocuspocus embeds in Vite through the configureServer() plugin hook, using a standalone ws.WebSocketServer({ noServer: true }). There is no listen() call: the embedding pattern intercepts WebSocket upgrades on /collab.
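
The upgrade-interception idea can be sketched without Vite, `ws`, or Hocuspocus. The block below uses minimal structural types as stand-ins for Node's HTTP server and socket so that it stays self-contained; `attachCollabUpgrade` and the `accept` callback are hypothetical names, and exact-path matching is an assumption. In a real plugin, `accept` would presumably hand the request to the `noServer` WebSocket server and pass the resulting socket on to Hocuspocus.

```typescript
// Minimal structural stand-ins (assumptions) for Node's HTTP server types.
interface UpgradeRequest { url?: string }
interface UpgradeSocket { destroy(): void }
type UpgradeListener = (req: UpgradeRequest, socket: UpgradeSocket, head: Uint8Array) => void;
interface UpgradeSource { on(event: "upgrade", listener: UpgradeListener): unknown }

// Hypothetical helper: claim upgrades on `path` and ignore everything else,
// so other upgrade listeners on the same server keep working.
function attachCollabUpgrade(
  server: UpgradeSource,
  accept: UpgradeListener,
  path: string = "/collab",
): void {
  server.on("upgrade", (req, socket, head) => {
    // Compare the pathname only; a query string is allowed.
    const pathname = (req.url ?? "/").split("?")[0];
    if (pathname !== path) return; // not ours: leave the socket untouched
    accept(req, socket, head);
  });
}
```

Returning early for other paths, rather than destroying the socket, matters when the host server has its own WebSocket traffic on the same port.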

Two agent write endpoints use hocuspocus.openDirectConnection():

  • POST /api/agent-write -- raw Y.XmlElement write (appends a paragraph with applyDelta())
  • POST /api/agent-write-md -- markdown write (unified path). Accepts { markdown, position? } and routes through applyAgentMarkdownWrite (XmlFragment-authoritative composition per AGENTS.md precedent #10), which runs four steps. First, it reads the current Y.XmlFragment, which reflects all CRDT-synced content, including concurrent client WYSIWYG typing. Second, it composes the agent's delta at the markdown level according to position ('append' / 'prepend' / 'replace'). Third, it applies the result to the XmlFragment via updateYFragment(), a structural diff that preserves user-content Items. Fourth, it mirrors the result to Y.Text via applyFastDiff, a character-level DMP write from @inkeep/open-knowledge-core/bridge that mutates minimally and preserves non-agent Y.Text Items and their origins. This path replaces the deleted syncTextToFragment, which used Y.Text as the authoritative input and destroyed concurrent user XmlFragment content (Bug-A in the 2026-04-14-bridge-convergence-under-concurrent-writes spec).
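
The markdown-level composition step can be illustrated in isolation. This sketch covers only how the agent's markdown might be combined with the current document text for each `position`; it does not touch Yjs. The name `composeAgentMarkdown`, the blank-line joining rule, and the default of 'append' are all assumptions, not the behavior of `applyAgentMarkdownWrite`.

```typescript
type Position = "append" | "prepend" | "replace";

// Drop leading and trailing newlines so blocks join with exactly one blank line.
const stripEdgeNewlines = (s: string): string => s.replace(/^\n+|\n+$/g, "");

// Hypothetical helper: compose the agent's markdown with the current text.
function composeAgentMarkdown(
  current: string,
  incoming: string,
  position: Position = "append", // assumed default
): string {
  const cur = stripEdgeNewlines(current);
  const inc = stripEdgeNewlines(incoming);
  if (position === "replace" || cur === "") return inc;
  if (inc === "") return cur;
  return position === "prepend" ? inc + "\n\n" + cur : cur + "\n\n" + inc;
}
```

Composing on text and then diffing the result into the shared type keeps the write path the same for all three positions; only the composed string differs.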

Source toggle

Two-mode toggle between WYSIWYG (TipTap) and source (CodeMirror 6). Both editors mount concurrently per active document via EditorActivityPool (display:none swap), each bound to its Y type for the lifetime of the mount:

  • TipTap → Y.XmlFragment('default') via @tiptap/extension-collaboration
  • CodeMirror → Y.Text('source') via y-codemirror.next

Toggling between modes is a CSS visibility flip — no MarkdownManager.serialize, no client-side three-way merge, no snapshot. The server's bidirectional bridge keeps the two Y types in continuous sync, so each editor view is always current when revealed. (The init-spike V4b implementation used a three-way-merge.ts reconciler on toggle-back; that module was retired alongside the move to a server-authoritative bridge — see Service Topology and the agent-write-path page for the current model.)

Bridge dispatch: Cross-CRDT sync between Y.XmlFragment and Y.Text is server-authoritative (precedent #14) and runs under doc.on('afterAllTransactions', ...) — one settlement fire per outermost doc.transact() drain (precedent #13(b)). Observer A (XmlFragment → Y.Text) runs before Observer B (Y.Text → XmlFragment) within each drain, so any Y.Text write from A is visible to B's read; no wall-clock debounce is involved. The client observer is a baseline-tracking shell that does not write the derived CRDT — keystroke-level "typing-defer" coordination is no longer needed because the only path that previously needed protection was the deleted client-side cross-CRDT write.

Content preservation under concurrent edits: Observer A's Path B (used when local Y.Text has diverged from the last-synced XmlFragment baseline) runs the hybrid diff3+DMP mergeThreeWay algorithm with a content-preservation post-condition. Post-condition violations throw BridgeMergeContentLossError; in production the bridge logs a structured event, queues a silent named checkpoint via saveInMemoryCheckpoint (recoverable via TimelinePanel), and applies the merge as-computed. See specs/2026-04-16-bridge-correctness/SPEC.md §6 R1/R7 for the full contract.
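
The handling policy for post-condition violations can be sketched as follows. The check itself is a stand-in: this sketch assumes "content preservation" means that every non-blank line added by either side relative to the base still appears in the merged text, which may be much simpler than the contract in the spec. `BridgeMergeContentLossError` is named in the paragraph above; its shape here, and the names `assertContentPreserved`, `settleMerge`, and `onLoss`, are hypothetical.

```typescript
// Error name taken from the text; the fields are an assumption.
class BridgeMergeContentLossError extends Error {
  missing: string[];
  constructor(missing: string[]) {
    super("merge dropped " + missing.length + " line(s)");
    this.name = "BridgeMergeContentLossError";
    this.missing = missing;
  }
}

const nonBlankLines = (text: string): string[] =>
  text.split("\n").filter((line) => line.trim() !== "");

// Assumed post-condition: lines added by either side must survive the merge.
function assertContentPreserved(base: string, sides: string[], merged: string): void {
  const baseLines = nonBlankLines(base);
  const mergedLines = nonBlankLines(merged);
  const missing: string[] = [];
  for (const side of sides) {
    for (const line of nonBlankLines(side)) {
      const added = baseLines.indexOf(line) === -1;
      if (added && mergedLines.indexOf(line) === -1) missing.push(line);
    }
  }
  if (missing.length > 0) throw new BridgeMergeContentLossError(missing);
}

// Production-style policy from the text: report the violation, then apply
// the merge as computed instead of failing the write.
function settleMerge(
  base: string,
  sides: string[],
  merged: string,
  onLoss: (err: BridgeMergeContentLossError) => void, // e.g. log and checkpoint
): string {
  try {
    assertContentPreserved(base, sides, merged);
  } catch (err) {
    if ((err as Error).name !== "BridgeMergeContentLossError") throw err;
    onLoss(err as BridgeMergeContentLossError);
  }
  return merged;
}
```

Reporting through a callback keeps the strict behavior (throwing) available for tests while letting production record the event and continue.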

Persistence pipeline

Three-tier auto-persistence with no "save" button:

  1. Crash recovery (CRDT to disk) -- Hocuspocus onStoreDocument hook, 2s quiet / 10s max debounce
  2. Auto-commits -- simple-git plumbing: git add -> write-tree -> commit-tree -> update-ref refs/wip/main, 30s debounce
  3. Named checkpoints -- user-initiated (future)
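
The "2s quiet / 10s max" rule in tier 1 reads as a debounce with a maximum wait: the write fires once edits pause for the quiet period, but no later than the maximum after the first unsaved edit. A minimal sketch of that deadline arithmetic, under that reading, with the hypothetical name `nextFlushAt` and timestamps in milliseconds:

```typescript
interface DebounceWindow {
  quietMs: number; // flush after this much silence
  maxMs: number; // but no later than this after the first unsaved edit
}

// Values from the text: tier 1 uses 2s quiet / 10s max.
const CRDT_TO_DISK: DebounceWindow = { quietMs: 2000, maxMs: 10000 };

// Hypothetical helper: the time at which the pending write should fire.
function nextFlushAt(firstDirtyAt: number, lastEditAt: number, w: DebounceWindow): number {
  return Math.min(lastEditAt + w.quietMs, firstDirtyAt + w.maxMs);
}
```

With edits at t=0 and t=1500, the write is due at t=3500. Under continuous typing that began at t=0, the cap forces a write at t=10000 regardless of the most recent keystroke.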

Server-side serialization calls yXmlFragmentToProsemirrorJSON() (pure Yjs/JSON, no DOM), then MarkdownManager.serialize() to produce the markdown string.

Void nodes (React component preview)

JSX components embedded in markdown as fenced code blocks:

```jsx-component
<Callout type="warning">
  Always run the integration tests before deploying to production.
</Callout>
```

The JsxComponent extension intercepts code tokens with lang === 'jsx-component' at priority 60 (above codeBlock's 50). Known components get a visual preview via ReactNodeViewRenderer. The raw JSX string survives the markdown round-trip unchanged.
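
The priority rule can be illustrated with a toy dispatcher: handlers are tried from highest priority to lowest, and the first one that claims a token wins. This is a simplified model of the behavior described above, not TipTap's internals; `TokenHandler` and `dispatch` are hypothetical names.

```typescript
interface CodeToken { lang: string; text: string }

interface TokenHandler {
  name: string;
  priority: number;
  matches(token: CodeToken): boolean;
}

// Priorities from the text: jsxComponent 60, codeBlock 50.
const handlers: TokenHandler[] = [
  { name: "codeBlock", priority: 50, matches: () => true },
  { name: "jsxComponent", priority: 60, matches: (t) => t.lang === "jsx-component" },
];

// Try handlers from highest to lowest priority; the first match wins.
function dispatch(token: CodeToken, registered: TokenHandler[]): string | undefined {
  const ordered = registered.slice().sort((a, b) => b.priority - a.priority);
  for (const handler of ordered) {
    if (handler.matches(token)) return handler.name;
  }
  return undefined;
}
```

In this model, a `jsx-component` block reaches the custom handler first, while every other fenced block falls through to the generic code block handler.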