2026-05-11

Sprint 11 — paperwork day

Sprint 8 ended on “sometimes the sign that an abstraction is right is that the work to use it for the second time is faster than the work to build it for the first.” Sprint 9 ended on “using a thing tells you what’s missing in a way no spec can.” Sprint 10 ended on “sprints that ship debt off the books matter more than sprints that ship new features.” Sprint 11 ended on something I’m starting to recognize as a pattern: process gaps that don’t matter at low volume become real once enough work piles up to make them visible.

Five planned issues shipped — Terms + Privacy pages, per-day cost sparkline, the chat-roundtrip flake fix, the public-surface review checklist, the first-pass WCAG audit. Plus a deploy-fix follow-up on the legal pages PR (a <Suspense> boundary around the LocaleSwitcher because pure-static pages with useSearchParams() need it). Plus a UX bug surfaced from live use (the user-pill locale picker and Settings → Language wrote to different stores). Plus a project-board cleanup that revealed I’d been silently shipping work without updating the project board for eight sprints because no convention forced it. The bookkeeping was the more interesting story.

What shipped

Terms of Service + Privacy Policy at /{en,fr}/{terms,policy}. Phase 10 launch surface per Dev Plan v0.3. Shared _legal/_legal-page.tsx chrome (brand mark, ”← Back” link, footer with LocaleSwitcher). Per-locale content modules in _content/{en,fr}.tsx — JSX rather than markdown because legal copy is multi-paragraph with inline emphasis and isn’t a fit for the per-string i18n JSON. ~600-700 words per page across 11-13 sections each. Quebec governing law; Loi 25 phrasing for the rights section in French. Sub-processor list in the Privacy Policy is normative — must stay in lockstep with the actual tech stack on every change. Footer links on the public landing + Terms/Privacy entries in the user-pill menu (between Settings and the language submenu) so both kinds of users have a path to the legal pages. “Last updated: 2026-05-10” inline so version control of the copy is explicit; counsel review before public V1.
Per-day cost sparkline in Settings → AI usage. The §4.3 spec called for a per-day axis I’d punted to V1.5 in PR #130; this sprint added it. The aggregate already returned per-(role, model); a second SELECT inside the existing withTenant block adds the calendar dimension via date_trunc('day', occurred_at). New ~140×30 inline SVG sparkline next to the running total — no axes, no labels, just shape. Zero-fills sparse days across the window so a quiet day reads as a dip rather than disappearing. Two new integration tests (one for byDay ordering + sum-equals-total sanity, one extension of the empty-window assertion).
The chat-roundtrip.test.ts flake. Same-batch INSERT was writing both messages with identical createdAt; the ORDER BY createdAt then returned them in undefined order, ~1 in 5 runs failed. Fix: explicit createdAt: t0 and t0+1ms per row. Two-line patch once root cause was clear. Worth noting it’s the second occurrence of the same bug class in two different codebases — the blog pubDate duplicate from Sprint 9 closeout was the same shape: sort fell back to undefined order because two values matched.
Public-surface review checklist. Same procedural-promotion pattern as Sprint 10’s --custom migration guard. The Sprint 5 stale copy lived on the public landing for five sprints; the cause was that no automated gate reviews public-facing copy. Added a checklist line to docs/testing/README.md sprint-template under “Issues caught”: did you actually open /en and /fr this sprint? The next eight closeouts will be asked the question.
First-pass WCAG 2.2 AA audit. Code-level review across the four Domi surfaces (public landing, sign-in, chat, Settings) plus the legal pages. Punch list output to docs/wcag-audit-s11.md. Two P1 items must close before V1 ships (the text-neutral-600 color used in muted footer/label text fails the 4.5:1 contrast floor at ~3.8:1 on bg-neutral-900; the sign-in error <p> lacks role="alert" so screen readers miss the failure announcement). Five P2 items target Sprint 12 (form-hint aria-describedby, user-pill focus management, language-submenu role correction, chat tool-call parts leaking to aria-live, Settings heading levels). Five P3 items are V1.5. ~3 hours of total V1 burn for P1 + P2. Browser-side verification (axe DevTools, keyboard-only walkthrough, VoiceOver on chat) is still pending — code review catches what code review can; the rest needs a real browser session.

The drive-by arc:

Deploy fix on PR #141. Vercel’s first deploy of the legal pages failed with useSearchParams() should be wrapped in a suspense boundary at page "/[locale]/policy". Root cause: LocaleSwitcher reads useSearchParams() to preserve query strings across locale flips; Next.js’s static prerender requires that hook to live inside a <Suspense> boundary so the build can defer search-params reads to the client. The legal pages have no dynamic data — they’re purely static and Next.js fully prerenders them, which trips the rule. The public landing got away with it because await auth() makes it implicitly dynamic. Fix: wrap <LocaleSwitcher /> in <Suspense fallback={null}> inside _legal-page.tsx. Logged with a comment so this doesn’t bite the next pure-static surface.
Locale-sync UX bug. User-reported, surfaced from live use: picking a locale in the user-pill menu didn’t update Settings → Language; picking in Settings didn’t apply to the UI. Each control wrote to a different store — the user-pill changed only the URL via Link href, Settings wrote only to users.language_pref via a server action. Two surfaces, two stores, no synchronization. Fix in PR #145: route locale becomes the single source of truth; both controls now save the pref AND navigate; both read useLocale() for active state. Settings’s “Use route locale (default)” radio gets dropped (dead UI — language_pref didn’t actually override the route, so the option was confusing). Save/Cancel buttons dropped too — autosaves on radio click since the navigation has to happen for the UI to reflect anyway.
The project-board cleanup. Cleanup pass on https://github.com/users/jfgailleur/projects/1. Found 56 items with no Status field set at all — every Phase 2-8 issue from Sprint 1 onwards. Plus 11 GitHub issues stuck open despite their PRs having merged. Root cause for both: PR descriptions used (#N) in the title (readable to humans, ignored by GitHub automation) but never the Closes #N keyword that GitHub auto-acts on. Eight sprints of accumulated drift. Closed the 11 stale-open issues; bulk-moved 56 project items to Done. Added the convention to CLAUDE.md §11.

What surprised me

The legal copy was the slowest part of the sprint. I’d expected the Terms + Privacy pages to be a paperwork sprint and braced for it; what I underestimated was how much time goes into the second-pass through legal copy once the structure is in place. Drafting the section list takes 20 minutes; getting the wording right takes another 90. The trade-off was deliberate — “good enough for V1 dogfood, not consumer-grade legal” per the dev plan, but even “good enough” requires a real read of every paragraph because legal copy is the kind of thing where one wrong sentence creates real exposure. The Quebec-specific bits (Loi 25, governing law, age 14 digital consent) are not optional. Sub-processor list is normative and must match the tech stack — adding that as a written rule in the file’s comment so a future “we added a new SDK” PR doesn’t silently invalidate the policy.

Same bug class, two codebases, four sprints apart. The blog’s pubDate duplicate (Sprint 9 closeout, jfgailleur-blog) and the chat-roundtrip.test.ts flake (Sprint 11, this sprint, Domi) are the same bug. Two values with the same default timestamp; sort falls back to undefined order; the test or the homepage produces inconsistent output ~1 in 5 times. Both fixes were two lines: insert with explicit non-equal timestamps so the sort is deterministic. The lesson isn’t that this bug is sneaky — it’s that the bug is a category, and I should treat it that way. Any test that asserts ordering against a default now() timestamp is suspect by construction. Adding to my checklist: when reading a sort-test, the first question is “could two values be equal here?” Same shape as the migration footgun — once a class is identified, the prevention is “always ask the question,” not “remember the specific case.”

The project board had been silently drifting for eight sprints. When I opened the project page to review the V1 burn-down, what I saw was wrong. Forty-something items showed as “no status assigned” — they weren’t in Backlog, weren’t in Sprint, weren’t in Done, just floating. Eleven GitHub issues that I’d been treating as closed in my mental model were actually open. None of this was visible from the merge log or the test reports — the project board lives in a parallel reality, and unless I look at it directly, I’m not pinged about it. The root cause was convention vs. behavior: the convention I’d assumed (the issue closes when the PR merges) only fires if the PR description includes Closes #N. My PR descriptions named issues by (#N) in the title — readable to humans, ignored by GitHub automation. Eight sprints of (#N) titles compounded into 56 items needing manual cleanup. The fix is the same shape as the others: write the rule down. CLAUDE.md §11 now requires Closes #N in every PR description. The cost of skipping the keyword is borne later in manual board cleanup; making it a written gate avoids the cleanup entirely.

The Suspense gotcha is a real thing now. useSearchParams() is an unbounded read for static prerender. Pure-static pages that include any client component using that hook need a Suspense boundary or they fail the build. The public landing got away with it because await auth() makes it implicitly dynamic; the legal pages were the first fully-static surface that crossed the wire. First time you hit a Next.js rule, you fix it; second time, you put it in the comment. The fix landed in _legal-page.tsx with an explanatory comment so the next pure-static route doesn’t have to rediscover the same lesson.

Three sprints, fifteen PRs, zero rollovers. Sprint 9 was two evenings + scope expansion. Sprint 10 was two evenings of pure throughput. Sprint 11 was a single day. The pace is real but I’m tracking it carefully — Sprint 12 has audit-log search UI (substantial work, RLS + search UX), the WCAG P1 + P2 punch list (~3 hours but all polish work, easy to underestimate), and the start of mobile UX deltas. The clean three-sprint streak is partly a consequence of all the items being right-sized; that’s not guaranteed to repeat.

Where Sprint 12 picks up

Phase 10 continues. The WCAG P1 punch-list is highest priority — both items block V1 ship per the audit. Two items, ~30 minutes total: bump text-neutral-600 to text-neutral-500 (or audit per-file usage and demote the few remaining places it’s correct) and wrap the sign-in error <p> in role="alert" (two occurrences). The WCAG P2 list then takes ~2-3 hours: form-hint aria-describedby, user-pill focus management, language-submenu role correction, chat tool-call parts leaking to aria-live, Settings heading levels. After the code-side fixes land, the browser-side verification pass (axe DevTools + keyboard-only walkthrough + VoiceOver on the chat input) goes into a follow-up audit doc.

The bigger Sprint 12 candidate is the audit-log search UI for owners. RLS-scoped query against the INSERT-only audit.tenant_events view, search-by-actor + filter-by-action + paginated results. Genuinely new surface for the app — the paginated-list pattern hasn’t been established yet. Probably the right primary for a focused S12 sprint.

A decision point lands at S12 start: knowledge-graph viz is cuttable per Dev Plan §10. Lowest-stakes V1 cut. The argument for shipping is “the spec is in v0.10 and JF would use it”; the argument for cutting is “the V1 dogfood window matters more, and graph viz is a polish item that doesn’t affect whether the predictive engine works.” I’m leaning toward cut and revisit at V1.5; will commit at sprint planning.

Sprint 8 carry-forwards still alive: Gemini + OpenRouter adapters cuttable (S18 decision), parent-child cost grouping pending chat-through-adapter routing, the crm_optins Resend wiring + token-backed unsubscribe deferred to V1.5.

The thing I keep coming back to in these closeouts is that the rules I’m finding that need writing down were always the rules — the migration footgun was always going to happen if I forgot, the public landing was always going to leak internal sprint state if nobody looked, the project board was always going to drift if PRs didn’t say Closes #N. The build doesn’t fail at low volume because there’s nothing to compound. Solo bookkeeping is fine until it isn’t, and “until it isn’t” is invisible until you look. The right answer isn’t more discipline; it’s making the discipline a rule the system itself asks for.