2026-05-13

Sprint 14 — promoting convention to code

Sprint 13 ended on “the most useful artifacts are often the ones that crystallize what was already implicit.” The regression-suite catalog from that sprint had 50+ entries and an 8-item backlog ranking the highest-value unauthored automation gaps. Sprint 14’s natural shape, looking at that list, was to ship the top of the backlog instead of just maintaining the catalog.

Four PRs merged: knowledge-graph viz primary (#175), --custom migration guard (#172), Closes #N GitHub Action (#173), final withAudit wiring (#174). The audit story closes as a three-sprint arc (S12 infrastructure → S13 read UI + chat-thread → S14 mechanical tail), and two convention-violation classes that had bitten the pipeline multiple times finally became CI gates.

What shipped

Knowledge-graph viz at /[locale]/graph (PR #175, #168). All 7 spec slices in one PR. The server query helper (@domi/shared/knowledge-graph) reads members + non-member non-archived assets + per-node active-task counts in a single withTenant block — three SELECTs, RLS-isolated, well under the §2.9 perf budget at V1 dogfood scale. The client canvas uses the standard d3-in-React idiom: simulation state in refs, per-tick SVG attribute patching, React owns only initial render + panel-state re-renders. Per spec §2.9, mounting the sim into React state would force a reconciliation each tick — the warning isn’t theoretical; it’s the kind of inefficiency d3-in-React tutorials reliably show in their bad-example-then-good-example pattern. Node kind encoded by both shape and color (circle/rect/diamond) so color isn’t the sole carrier — WCAG 2.2 AA cheap win. Side panel slides in right, NOT modal — the comparison case (open one node, click another to compare) needs the graph to stay visible. Mobile fallback via CSS toggle at <640px renders the same data as a nested list-tree, no d3 dependency loaded. 27 i18n keys per locale, parity verified. 6/6 real-DB integration tests cover node ordering, archived filtering, member-bridge filtering, custodian edge emission, RLS isolation, empty-tenant view, and the V1 documentCount=0 contract that’s V1.5-ready. Route bundle: 22.7 kB. Chat header gets a “graph →” link next to the user pill.
--custom migration placeholder guard (PR #172, #169). Regression-suite backlog item #1 from S13. The 4-occurrence footgun from S6/S8/S9 — drizzle-kit generate --custom writes a placeholder body; if you run db:migrate before filling it, the row gets recorded as “applied” with no DDL run, and re-running migrate after filling it is a no-op because __drizzle_migrations says it’s done. Recovery is manual SQL. CLAUDE.md §6 had promoted “always cat the file” to written convention; this PR makes the convention operationally enforced. New Node script check-custom-migrations.mjs scans drizzle/*.sql for files whose trimmed body equals the placeholder marker; runs as predb:migrate hook so pnpm db:migrate fails before any DDL touches the DB. Sanity-tested both happy (17 migrations, exit 0) and sad (synthesized probe, exit 1 + correct diagnostic) paths. The convention is now structural; the placeholder failure mode is dead.
Closes #N PR-body check (PR #173, #170). Regression-suite backlog item #2. S11’s project-board cleanup found 56 items with no Status and 11 issues stuck open despite their PRs merging. Root cause: PRs referenced issues by (#N) in the title without one of GitHub’s auto-close keywords, so auto-close + board automation never fired. Pure inline-shell GitHub Action: regex \b(close[sd]?|fix(es|ed)?|resolve[sd]?)\b[[:space:]]+([A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+)?#[0-9]+ matches all nine of GitHub’s closing keywords (close/closes/closed, fix/fixes/fixed, resolve/resolves/resolved), including cross-repo refs. Escape hatch: [skip-issue] marker in title or body for the rare meta-PR. Self-validated: the PR adding the workflow had Closes #170 in its body, and the workflow ran on its own PR and passed, proving it operational on the first attempt. Same procedural-promotion pattern as the migration guard: written rule → operational gate.
Final withAudit wiring (PR #174, #171, completing #158). Five mechanical sites wrapped: activateGmailWatch (gmail.watch.activate, system actor), refreshGmailAccessToken (gmail.token.refresh, system — read + Google call split outside the wrap, since the audit row records “we mutated our copy of the token,” not “we asked Google”), extractAndPersist (extracted_fact.create, agent actor, source threaded via a new optional source: AuditSource = "ui" arg), autoWriteExtractedFact (extracted_fact.confirm, both queue-for-review and auto-write branches emit; metadata outcome differentiates), predict/engine.insertIfNotDeduped (task.create, system, dedup probe runs OUTSIDE the wrap so no-op skips don’t emit spurious audit rows). After this lands, every domain mutation in the codebase passes through withAudit. The only un-wrapped writes are connection-state failure marks, classified as observability events by the threat model. The audit infrastructure built in S12 is now actually load-bearing across the whole app.
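
The placeholder guard described above can be sketched roughly as follows. This is not the actual check-custom-migrations.mjs; the function names, the demo directory, and in particular the PLACEHOLDER text are assumptions for illustration, and the marker should be copied from whatever the installed drizzle-kit version really writes.

```typescript
import { mkdtempSync, readdirSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// ASSUMPTION: illustrative marker text. Copy the exact line your drizzle-kit
// version writes into a fresh --custom migration.
const PLACEHOLDER = "-- Custom SQL migration file, put your code below! --";

// Returns the .sql files in `dir` whose trimmed body is exactly the placeholder.
function findUnfilledMigrations(dir: string, placeholder: string = PLACEHOLDER): string[] {
  return readdirSync(dir)
    .filter((name) => name.endsWith(".sql"))
    .sort()
    .filter((name) => readFileSync(join(dir, name), "utf8").trim() === placeholder);
}

// Returns the exit code a predb:migrate hook would hand to process.exit().
function checkMigrations(dir: string): number {
  const unfilled = findUnfilledMigrations(dir);
  for (const name of unfilled) {
    console.error(`${name}: placeholder body, fill it in before migrating`);
  }
  return unfilled.length === 0 ? 0 : 1;
}

// Demo against a throwaway directory: one real migration, one untouched placeholder.
const demoDir = mkdtempSync(join(tmpdir(), "migrations-"));
writeFileSync(join(demoDir, "0001_init.sql"), "CREATE TABLE t (id int);\n");
writeFileSync(join(demoDir, "0002_custom.sql"), `${PLACEHOLDER}\n`);
const unfilledDemo = findUnfilledMigrations(demoDir);
const exitCodeDemo = checkMigrations(demoDir);
console.log(unfilledDemo, exitCodeDemo);
```

Because the check runs before the migrate command, a non-zero exit stops the run before any row is written to __drizzle_migrations.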

What surprised me

Promoting convention to code is becoming the recognizable sprint pattern. S11 wrote down the Closes #N rule after the board-cleanup incident. S12 built the withAudit wrapper that turned the “audit log first, mutation second” written convention into a structural guarantee. S13 wrote the regression-suite catalog formalizing what 13 sprints of test plans had been implicitly asking for. S14 shipped two more procedural-promotion gates from that catalog plus the withAudit tail that finishes a three-sprint conversion. The written rule is the first artifact; the structural enforcement is the durable one. The S6/S8/S9 --custom footgun fired four times before becoming a CI gate. The S11 board drift had 56 items pile up before the keyword check landed. Both could have been gates from sprint 1, if the cost of writing the gate had been visible at the moment of the first violation. The retrospective answer is to measure how often each written rule has cost you, then graduate the top of that list to code.

The audit story closes as a three-sprint arc, not a single sprint. S12 built the infrastructure (audit.events partitioned table + the withAudit wrapper + 6 wired sites + the convention encoded). S13 added the read UI (search page at /settings/audit-log with cursor pagination + filters) and the largest single wiring (chat-thread interface change with 6 call-site userId hoists). S14 wrapped the remaining 5 mechanical sites. None of those sprints alone “delivers audit”; the three together do. The pattern from S12+S13 (“infrastructure sprint + surface sprint”) generalizes to “infrastructure + surface + mechanical tail.” Knowing which sprints are which and not pretending the boundary doesn’t exist is more honest than trying to bundle them.

The knowledge-graph viz in one PR was the right call. The S13 spec doc sized it at ~10.5h across 7 slices. I considered splitting into 2-3 PRs (server query → canvas → polish) but the slices are so tightly coupled that PR boundaries would have meant either each PR shipping a half-working surface or each PR carrying the others’ temporary scaffolding. Single PR, ~1285 LOC across 12 files, end-to-end from typed server view to rendered SVG. The 7-slice sizing was a planning artifact, not a delivery constraint. Worth noting for future “single PR or split?” calls: spec slices size the work; they don’t necessarily slice the deliverable.

d3-transition is a separate npm package and we don’t need it. First typecheck flagged Property 'transition' does not exist on type 'Selection<...>'. The animated zoom/pan calls (.transition().duration(150).call(zb.scaleBy, 1.2)) require importing d3-transition as a peer dep. Resolution: drop the animations. Instant zoom is fine at V1 dogfood scale — the keyboard shortcuts are still snappy, just not eased. The trade-off is documented in the file header as “polish improvement, not a V1 ship blocker.” The cheapest dependency is the one you didn’t add.

The escape hatch is the gate. The Closes #N workflow has to admit meta-PRs that legitimately don’t close an issue (a one-off CI fix, a doc tweak). Without [skip-issue], every closeout PR would need a synthetic issue. The escape hatch is not a cop-out; it’s what makes the convention serviceable. Same shape as predb:migrate accepting both the exact placeholder string and empty files — the gate is sharp where violations matter and forgiving where forcing rigidity would manufacture friction.
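
To make the gate-plus-escape-hatch shape concrete, here is a TypeScript port of the check as a sketch. The real workflow is inline shell; the function name is hypothetical, and the case-insensitive flag is an assumption about how the shell pattern is invoked.

```typescript
// JavaScript port of the workflow's POSIX pattern: [[:space:]] becomes \s.
// ASSUMPTION: the `i` flag mirrors a case-insensitive shell match.
const CLOSING_KEYWORD =
  /\b(close[sd]?|fix(es|ed)?|resolve[sd]?)\b\s+([A-Za-z0-9_.-]+\/[A-Za-z0-9_.-]+)?#[0-9]+/i;
const SKIP_MARKER = "[skip-issue]";

// A PR passes if it opts out explicitly or its body carries a closing keyword.
function passesIssueLinkCheck(title: string, body: string): boolean {
  if (title.includes(SKIP_MARKER) || body.includes(SKIP_MARKER)) return true;
  return CLOSING_KEYWORD.test(body);
}

console.log(passesIssueLinkCheck("Add guard (#169)", "Adds the predb hook."));  // false
console.log(passesIssueLinkCheck("Add guard", "Closes #169"));                  // true
console.log(passesIssueLinkCheck("Port fix", "Fixed octo-org/tools#12"));       // true
console.log(passesIssueLinkCheck("Bump CI image [skip-issue]", ""));            // true
```

The first case is exactly the S11 failure mode: an issue number in the title, no keyword in the body.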

TypeScript narrowing across async closures comes back every time a withAudit body needs to return a value. Same shape as S13 PR #164’s chat-thread userId hoist. Inside withAudit({...}, async () => { factId = await withTenant(...) }), the assignment happens in a closure, so TypeScript’s definite-assignment analysis can’t treat an outer let factId: string as initialized when it is read after the call. Resolution every time: assign inside, return outside. The pattern is general enough that I added it to the test report’s “Issues caught”: when a withAudit body needs a return value, hoist a let outside, assign inside, return after. Future me will hit this; it’s already happened twice and will happen again as withAudit wraps grow.
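
A minimal sketch of the hoist pattern, under stated assumptions: withAudit and insertFact here are stand-ins written for this example, not the real signatures, and the definite-assignment assertion (the `!`) is one way to satisfy the compiler. The write-up does not say which form the codebase uses.

```typescript
// Stand-ins for illustration only; the real withAudit and tenant helpers may differ.
type AuditMeta = { action: string; actor: "system" | "agent" | "user" };
const auditLog: AuditMeta[] = [];

// Hypothetical wrapper whose body returns nothing, which is what forces the hoist.
// Transaction handling is out of scope for this sketch.
async function withAudit(meta: AuditMeta, body: () => Promise<void>): Promise<void> {
  auditLog.push(meta); // audit log first
  await body();        // mutation second
}

// Hypothetical mutation that produces the id the caller needs back.
async function insertFact(text: string): Promise<string> {
  return `fact_${text.length}`;
}

async function extractAndPersist(text: string): Promise<string> {
  // Hoist the variable. Without the `!`, reading it after the call fails with
  // "used before being assigned", because the compiler ignores closure assignments.
  let factId!: string;
  await withAudit({ action: "extracted_fact.create", actor: "agent" }, async () => {
    factId = await insertFact(text); // assign inside
  });
  return factId; // return after
}

const demo = extractAndPersist("hello");
demo.then((id) => console.log(id, auditLog.length));
```

The assertion trades a compile-time guarantee for brevity: if the wrapper ever skipped the body, factId would be undefined at runtime. Declaring it as string | undefined and throwing when it is still unset is the stricter alternative.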

Source propagation is the V1.5 hook. extractAndPersist and autoWriteExtractedFact gained an optional source: AuditSource = "ui" arg. Today every caller is the upload route, so the default keeps the existing call green. But when email-ingestion eventually drives extraction (V1.5), it can pass source: "connector_gmail" and the audit row will be honest about where the document originated — no second pass through these functions, no breaking signature change. The optional-default-with-explicit-call-site-future-readiness pattern is doing more work in this codebase than I realized. Same shape as documentCount being part of KnowledgeGraphNode even though it’s always 0 in V1: the shape is V1.5-ready, the wiring waits.
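
The optional-default shape can be sketched as below. Only "ui" and "connector_gmail" appear in this write-up, so the AuditSource member list, the AuditRow type, and the function name are assumptions for illustration.

```typescript
// ASSUMPTION: illustrative member list; the real AuditSource may have more values.
type AuditSource = "ui" | "connector_gmail";

interface AuditRow {
  action: string;
  source: AuditSource;
}

// Hypothetical helper: existing callers omit `source` and keep today's behavior.
function buildAuditRow(action: string, source: AuditSource = "ui"): AuditRow {
  return { action, source };
}

// Today's upload route: no change at the call site.
const fromUpload = buildAuditRow("extracted_fact.create");

// A future email-ingestion caller states where the document came from.
const fromGmail = buildAuditRow("extracted_fact.create", "connector_gmail");

console.log(fromUpload.source, fromGmail.source); // ui connector_gmail
```

Adding the parameter with a default is a non-breaking change, which is why the V1.5 caller can arrive later without touching the upload route.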

Where Sprint 15 picks up

The natural primaries:

Live perf measurement of the graph canvas at dogfood scale. Spec §2.9 set numeric budgets; the implementation is structurally right but unmeasured. Chrome DevTools Performance trace, record actuals in the spec doc’s measurements column. Cheap, ~30 min once a populated tenant exists. Becomes a higher priority if any §2.9 number misses.
First mobile UX delta — camera capture. The bigger of the two remaining V1 mobile deltas (table-to-card shipped S12, list-tree shipped S14 as part of graph viz). Camera capture is what makes “snap the bill on your fridge and Domi knows about it” feel real.
Regression-suite backlog #3-#8. S14 shipped #1 and #2. The next-cheapest from the ranking are the catalog-no-latest-aliases test (~10 min, one walk of CATALOG asserting every id is a dated string) and the OpenAI known-regressions snapshot (~30 min). Both could ride along as secondaries.
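
For a sense of scale, the catalog-no-latest-aliases test could look something like this sketch. The CATALOG shape, the ids, and the definition of "dated" (an id ending in YYYY-MM-DD or YYYYMMDD) are all hypothetical; the real catalog may use a different convention.

```typescript
// HYPOTHETICAL catalog shape and ids, for illustration only.
type CatalogEntry = { id: string };

const CATALOG: ReadonlyArray<CatalogEntry> = [
  { id: "example-model-2025-01-15" },
  { id: "example-mini-20250301" },
];

// ASSUMPTION: a "dated" id ends in YYYY-MM-DD or YYYYMMDD.
const DATED_ID = /(\d{4}-\d{2}-\d{2}|\d{8})$/;

// One walk of the catalog, returning every id that is not pinned to a date.
function undatedIds(catalog: ReadonlyArray<CatalogEntry>): string[] {
  return catalog.map((entry) => entry.id).filter((id) => !DATED_ID.test(id));
}

const offenders = undatedIds(CATALOG);
if (offenders.length > 0) {
  throw new Error(`catalog ids must be dated, found: ${offenders.join(", ")}`);
}
```

An entry such as "example-model-latest" would land in offenders and fail the check.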

Carry-forwards still alive: WCAG browser-verification run (#150, JF runs); non-functional gates (threat-model sign-off, PIA, DR runbook — all V1-ship-blocking, all paperwork-shaped, none started); Sprint 8 carry-forwards (Gemini + OpenRouter; parent-child cost grouping).

The pattern I’m taking from S14 (a continuation of the S11/S12/S13 through-line, not a new thing) is that the most durable artifacts encode rules that cost something to violate. CLAUDE.md is a graveyard of “we should remember to” lines. The ones that graduate to code (the withAudit wrapper, the prebuild date check, the predb:migrate guard, the Closes #N workflow) don’t need to be remembered, because the code remembers for you. The next sprint’s most useful artifact is probably the rule I’m still relying on memory or convention to enforce.