max / makenotwork

chore: move todos + audit_review to private layer (gitignored)

Author: Max Johnson <me@maxj.phd> · 2026-06-05 00:20 UTC

Commit: 20d1f70a152429b997531ecb5b5cf8cfbdad7220

Parent: 6a7f0dc

5 files changed, +4 insertions, -1146 deletions

M .gitignore +4

			@@ -46,3 +46,7 @@ mutants.out*
46	46
47	47		# Claude Code instructions (project-local; not for the public repo)
48	48		CLAUDE.md
	49	+
	50	+	# Private working files — live in _private/, synced via Syncthing
	51	+	todo.md
	52	+	audit_review.md

D sando/todo.md -242

		@@ -1,242 +0,0 @@
1	-	# Sando TODO
2	-
3	-	Open work only. Completed items move to `todo_done.md` (sibling file) when one exists. Design notes go in `plans/<name>.md`, not folded into checkboxes.
4	-
5	-	Format rule: every actionable line is a `- [ ]` checkbox. Headings group phases and themes; do not put status updates in them.
6	-
7	-	## Resume here (next session)
8	-
9	-	User-blocking before anything else:
10	-
11	-	- [ ] Apply updated Tailscale ACL (`_private/infra/tailscale-acl-policy.json`) at https://login.tailscale.com/admin/acls — adds the `tag:server → tag:server as user max` SSH rule needed for offsite backup sync. Once live, Claude can finish: verify `makenotwork@alpha-west-1 → max@astra` ssh works, scp `MNW/server/deploy/sync-backup-offsite.sh` to `/opt/makenotwork/sync-backup-offsite.sh`, chmod +x, then trigger `sudo -u makenotwork /opt/makenotwork/backup-db.sh` and confirm a file lands in `max@astra:/opt/backups/mnw/`. Closes the "offsite broken" adjacent fire.
12	-
13	-	Claude-only follow-ups (no user input needed; pick the next slice):
14	-
15	-	- error-pages bake-into-binary via `include_dir!` (separate MNW PR) — closes Phase 3 §2 long-term
16	-	- `cargo_test` gate red on MNW (Phase 0 follow-up) — diagnose, likely needs DB/env setup hook per test or `--test-threads=1`
17	-	- Sandod build/test output streaming (Phase 0 follow-up) — pipe stdout to per-run log files instead of `Output` buffer; surface in WS `/events`
18	-	- Phase 6 monitoring + alerting — Prometheus counters + alert rules
19	-	- Phase 4 prep — first Sando-only deploy to testnot (needs Track B — see below)
20	-	- Sando test suite — see "Testing" section below; sandod and TUI have zero unit/integration tests today
21	-
22	-	Session 5 — 0.9.7 launched 2026-06-03 via Sando through host → A → B (hotfix=true, skip-burn-in). Soak cleanup closed (launchplan_final §1). Remaining:
23	-
24	-	- [x] Soak cleanup eligible 2026-06-10 — shortened and shipped 2026-06-03. Gate verified clean since the 06-03 02:53 migration boot. Removed `/opt/git` (99M, stale duplicate of `/var/lib/mnw/git`), `/opt/makenotwork` (177M, post-yara-relocation), `/opt/backups` (277M, root pg_backup output). 553M reclaimed. yara-rules relocated from `/opt/makenotwork/yara-rules` → real `/opt/mnw/yara-rules` (733 rules compiled fine from new path).
25	-	- [x] Backups rebuilt under `/var/lib/mnw/backups/<db>/` (makenotwork + multithreaded, per-DB subdirs), per-user crons (03:00 + 03:05), offsite to astra `/opt/backups/mnw/<db>/` via Tailscale SSH `tag:prod → max@tag:testing` rule. `backup-puller` rrsync re-rooted at `/var/lib/mnw/backups`; sando `backup.source` updated to `ssh://backup-puller@alpha-west-1:2200/makenotwork/latest.sql.gz`; `/backup/fetch` verified 38MB matched prod size.
26	-	- [x] Pre-existing meta.git ownership drift fixed inline — `mnw-cli:git` → `git:git` (tightened `safe.directory` was rejecting it). Surfaced by post-rm ls-remote regression test.
27	-	- [ ] Remove live drop-in `/etc/systemd/system/mnw-cli.service.d/fhs-git-path.conf` on prod. The unit file in `mnw-cli/deploy/mnw-cli.service` is patched to include `ReadWritePaths=/var/lib/mnw`, so the drop-in becomes redundant next time `./mnw-cli/deploy/deploy.sh --config` runs. Until then both apply (harmless dupe).
28	-
29	-	Decision-gated (needs user input first):
30	-
31	-	- Track B testnot live-app: postgres role+db (Claude), `.env` secrets (which Stripe/SMTP/S3 creds to use for staging — needs user), Caddyfile + Cloudflare Origin CA cert for testnot.work (user issues cert in CF dashboard; Claude installs)
32	-	- Restart-warning hook for prod tier (Phase 5) — needs `CLI_SERVICE_TOKEN` accessible to sandod
33	-
34	-
35	-
36	-	## Testing
37	-
38	-	Sando has zero automated tests today — daemon + TUI have been validated by running real scenarios end-to-end. Worth a pass before relying on it for prod cutover.
39	-
40	-	### TUI hands-on (Phase 5 acceptance — run interactively)
41	-
42	-	- [ ] Launches against `SANDO_DAEMON=http://100.103.89.95:7766` without crashing; header shows daemon URL.
43	-	- [ ] WS status: `ws ok` appears in the header within ~1s of launch (sandod is reachable).
44	-	- [ ] WS reconnects: `sudo systemctl restart sandod` on fw13; header flips `ws ok → ws ... → ws ok` within ~5s. Events resume.
45	-	- [ ] `↑/↓` and `j/k` move the row highlight through all 4 tiers; selection persists across the 2s state refresh.
46	-	- [ ] `b` triggers backup fetch: status bar shows `[ok] backup/fetch: ...`, events log gets a `backup_fetched` line a moment later.
47	-	- [ ] `c` on tier `a` (which has `current_version=0.8.12`) records a manual_confirm; event appears.
48	-	- [ ] `c` on tier `mm` (no current_version) returns an HTTP error; status bar shows `[err]`.
49	-	- [ ] `p` on tier `a` (assuming gates pass) issues a real deploy; sequence of `deploy_start → deploy_ok → promote_complete` events appears.
50	-	- [ ] `R` on tier `a` rolls back to `previous_version`; `rollback` event appears. Reverse with `p` again.
51	-	- [ ] `q`, `Esc`, `Ctrl-C` all quit cleanly; terminal restores correctly (no leftover raw mode).
52	-	- [ ] Events ring buffer trims to 200: trigger ≥200 events (loop /backup/fetch), confirm the oldest scroll out, no panic.
53	-	- [ ] Action while disconnected: kill sandod, hit `b`. Status shows error, TUI stays responsive.
54	-
55	-	### Sandod unit + integration tests (Claude-only)
56	-
57	-	55 tests passing as of 2026-05-31 (14 TUI + 41 daemon). Remaining gaps:
58	-
59	-	- [x] `gates::reset_scratch` — verifies dropping every non-system schema (planted `foo` + `tower_sessions`, ran reset, asserted only `public` remains). Gated by `SANDO_TEST_PG_URL` env var so it skips on hosts without postgres. Run on fw13 with `SANDO_TEST_PG_URL=postgres:///sando_scratch?host=/var/run/postgresql cargo test`.
60	-	- [x] `deploy::deploy_local` — copies multiple binaries (`PRIMARY`/`ADMIN`), swaps symlink atomically across two consecutive deploys, gc_local_releases keeps last N by mtime + handles missing dir + noop under threshold. `sh_quote` round-trip.
61	-	- [x] `deploy::deploy_remote` failure path — against unroutable `192.0.2.1`, verifies clean ssh-attributed error (no panic / hang); ConnectTimeout bounds the test wallclock to ~10s. Plus `deploy_node` with `ssh_target="local"` short-circuits to symlink swap.
62	-	- [x] `backup::fetch` URL parsing — extracted `parse_source` → `BackupSource` enum. 10 tests: file://, rsync://, ssh:// with/without port, multi-segment ssh path, non-numeric `:foo` colon treated as part of host (not port), and all malformed-input rejections (empty, scheme-only, ftp, no path on ssh, empty user@host).
63	-	- [x] `events::emit` no-subscribers no-op; `emit_reaches_a_subscriber`; envelope serializes with flat `kind` field (locks the WS/TUI contract); `lagged_subscriber_observes_recv_error_lagged` exercises broadcast capacity.
64	-	- [ ] `events_ws` handler end-to-end — drive WS through a slow client, assert `{"kind":"lagged",...}` frame arrives. Possible (bind axum to ephemeral port + tungstenite client) but the bus-level lag detection is already locked in by `lagged_subscriber_observes_recv_error_lagged`. Diminishing returns vs effort. Deferred.
65	-	- [ ] `build` mutex behavior — requires real cargo or a slow stub. Treated as a manual checklist item under "TUI hands-on" instead. (Already validated by hand 2026-05-31.)
66	-	- [x] `routes::confirm` — rejects when tier has no `current_version` (409 Conflict — surfaced that GateBlocked maps to 409 not 400, locked in), accepts + inserts a passing gate_runs row when set, 404 on unknown tier.
67	-	- [x] `routes::promote` — refuses promote-to-first-tier (409), errors when neither body nor predecessor has a version, 404 when explicit version's `versions` row is missing.
68	-	- [x] `unsatisfied_gates` — 6 tests: empty, failed-kind flagging, latest-row-wins (red→green flap clears), hotfix skips burn_in only, ignores other tiers/versions, null `passed` treated as failing (locks the in-flight-race safety property).
69	-	- [x] `run_migrator` errors on missing migrations dir.
70	-	- [x] sqlx migrations exercised via existing `sync` tests.
71	-
72	-	### End-to-end harness
73	-
74	-	- [ ] Single-binary smoke: spin up sandod against tmpdir config + a tmp postgres; push a fixture commit; assert the full pipeline (build → gates → MM tier_state advance) completes in under 30s. Run on CI for every sando PR.
75	-	- [ ] Pre-cutover dry run: stand up a throwaway tier-B node, point production-shape config at it, run `cargo_test → migration_dry_run → boot_smoke → promote` end to end. Use existing testnot for this once Track B is done.
76	-
77	-	### TUI unit tests
78	-
79	-	- [x] `format_event` — golden tests for build_ok, gate_done (pass+fail), backup_fetched, deploy_failed, unknown kind, malformed JSON.
80	-	- [x] `ws_url_from`: `http://` → `ws://`, `https://` → `wss://`, only replaces scheme once, unknown scheme passes through.
81	-	- [x] `Action::Display` impl produces `backup/fetch`, `promote/<tier>`, etc.
82	-	- [x] `Shared::push_event` ring-buffer cap at 200; oldest entries drop in FIFO order.
83	-	- [x] `truncate` short-string passthrough vs long-string ellipsis.
84	-
85	-	---
86	-
87	-	Roadmap target: replace `server/deploy/deploy.sh` and astra-hosted `server/deploy/run-ci.sh` with Sando running on fw13, gating Hetzner prod through testnot.work.
88	-
89	-	Host decision: Sando runs on fw13 (x86_64 Ubuntu-derived, systemd). Architecturally closest to Hetzner prod, no cross-compile, no init-system split. MakeMachine and EveryCycle are now a separate project — not Sando's concern.
90	-
91	-	Phases are ordered for execution. Phase 0 must finish before Phase 1 is meaningful. Phases 5+ are post-cutover hardening.
92	-
93	-	## Key Paths
94	-
95	-	Read these to orient before working on Sando:
96	-
97	-	- `README.md` — quickstart, API surface, v0 limitations
98	-	- `sando.toml` — current topology (host → A → B; C declared, not provisioned)
99	-	- `daemon/src/main.rs` — startup sequence (config → topology → migrate → sync → bare-repo bootstrap → serve)
100	-	- `daemon/src/routes.rs` — `/state`, `/promote`, `/rollback`, `/rebuild`, `/backup/fetch`, `/events`
101	-	- `daemon/src/gates.rs` — gate runners; the load-bearing logic
102	-	- `daemon/src/build.rs` — host-tier build pipeline
103	-	- `daemon/src/deploy.rs` — `deploy_local`; remote SSH stub
104	-	- `daemon/migrations/001_init.sql` — schema (tiers/nodes as rows)
105	-	- `server/deploy/deploy.sh` — current cross-compile + push-to-Hetzner script (what we are replacing)
106	-	- `server/deploy/run-ci.sh` — current astra CI script (what we are replacing)
107	-	- `_meta/docs/operations.md` — burn-in rule and hotfix policy that gates encode
108	-
109	-	---
110	-
111	-	## Phase 0 — fw13 bootstrap
112	-
113	-	- [x] Provision `sando` system user on fw13; lock down home dir; generate SSH keypair at `/srv/sando/.ssh/id_ed25519` for outbound deploys.
114	-	- [x] Install scratch Postgres locally on fw13; create `sando_scratch` role + DB used by `migration_dry_run`. (Owner of own DB; non-superuser.)
115	-	- [x] Write systemd unit for `sandod` (long-run service, restart on failure, env from `/etc/sando/sando.env`). Installed at `/etc/systemd/system/sandod.service`.
116	-	- [x] Write the production `sando.toml`; bare repo path under `/srv/sando/mnw.git`. Installed at `/etc/sando/sando.toml`; daemon config at `/etc/sando/sando-daemon.toml`.
117	-	- [x] Install `sandod` binary at `/usr/local/bin/sandod`; enable + start the service. Live on `100.103.89.95:7766`; bare repo auto-bootstrapped at `/srv/sando/mnw.git`.
118	-	- [x] Verify MNW server builds reproducibly on fw13. `makenotwork` 0.8.12 built in 132s; sqlx online mode against `sando_scratch` postgres (sandod prep-resets all non-system schemas + applies all 133 MNW migrations before invoking cargo).
119	-	- [ ] Register sando pubkey with Hetzner prod (`deploy@alpha-west-1`) and testnot.work once that node exists. Pubkey: `ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIEK+vhpr1V8VnsEemN9x6tAA2S05kmv/mQ3eVgSXSkJ8 sando@fw13`. (Moved to Phase 1 — not blocking Phase 0 exit.)
120	-
121	-	### Phase 0 follow-ups (not blocking, but visible)
122	-
123	-	- [ ] `cargo_test` gate fails on MNW today — beyond the sqlx-online fix (already in), tests likely need a separate prepared DB (or per-test isolation). Investigate when wiring up Phase 1 gates.
124	-	- [ ] Sandod observability: add `WS /events` (Phase 5) and consider streaming build/test stdout to a per-run log file rather than buffering in `Output`.
125	-	- [ ] sqlx-cli (`v0.9.0`) at `/srv/sando/.cargo/bin/sqlx` is installed for the sando user but unused — sandod uses `sqlx::migrate::Migrator` programmatically (v0.8.6). Decide later whether to drop sqlx-cli or use it for diagnostics.
126	-	- [ ] fw13 WoL: `ethtool` shows no wake-on capability on the USB ethernet — WoL likely won't work; rely on manual wake or BIOS settings. Record in `_meta/` if a solution surfaces.
127	-
128	-	## Phase 1 — Remote deploy
129	-
130	-	The MVP only deploys to `ssh_target=local`. Production needs real SSH/rsync.
131	-
132	-	- [x] Implement `deploy::deploy_node` remote path: rsync staged binary to `<ssh_target>:<release_root>/releases/<version>/<bin_name>`, then `ssh <ssh_target>` does `mv -Tf` symlink swap + `sudo systemctl reload-or-restart <service>`. First real promote landed 2026-05-31: fw13 → testnot, version 0.8.12.
133	-	- [x] Add `node.service_name` to `sando.toml` (default `makenotwork.service`).
134	-	- [x] Bootstrap script for adding a fresh node: `MNW/sando/deploy/bootstrap-node.sh`. (See Phase 3 — node-bootstrap script for full details.)
135	-	- [x] Garbage-collect old releases on the remote: keep last N=5 per node, sorted by mtime. Runs at end of each successful deploy (local + remote variants). Tied via `RELEASES_TO_KEEP` const.
136	-	- [x] Handle `rsync` failure mid-deploy: leave the previous `current` symlink intact; mark `deploys.outcome = 'failed'`; do not advance `tier_state`. (Verified the routes.rs path; rsync runs before symlink swap so failure naturally leaves `current` untouched.)
137	-
138	-	### Phase 1 — Track B: testnot live-app setup (NOT blocking Phase 2)
139	-
140	-	Sando's deploy machinery is done, but testnot's MNW runtime needs the rest before its `makenotwork.service` can stay up:
141	-
142	-	- [ ] Provision `makenotwork` postgres role + db on testnot (postgres-18 already installed).
143	-	- [ ] `/opt/mnw/.env` with staging Stripe keys, SMTP, S3, DATABASE_URL, all other MNW env. Decide which subset of integrations get test/sandbox credentials vs are stubbed.
144	-	- [ ] Caddyfile for testnot.work — strip prod's blocks down to just the main reverse_proxy (and forums/cdn if needed). Cloudflare Origin CA cert for testnot.work issued + placed at `/etc/caddy/`. AOP CA already universal.
145	-	- [ ] `error-pages/` for testnot (copy or symlink from a release dir).
146	-	- [ ] Wire post-deploy smoke check (`curl https://testnot.work/health` after the symlink swap, before declaring deploy ok). Sando-side, gate-like; spec in Phase 2 boot_smoke wording.
147	-
148	-	## Phase 2 — Backup pipeline + migration dry-run
149	-
150	-	`migration_dry_run` is the load-bearing gate. It needs a real backup source, not a fixture.
151	-
152	-	- [x] ~~Confirm astra's offsite replica writes a deterministic latest-link path.~~ Pivoted: pull direct from prod (`backup-puller@alpha-west-1:2200`, rrsync-locked to `/opt/makenotwork/backups/`). Astra offsite is separately broken — see carryover below.
153	-	- [x] Wire the production `sando.toml` `backup.source` — `ssh://backup-puller@alpha-west-1:2200/latest.sql.gz` with `latest.sql.gz` as a hard link on prod.
154	-	- [x] Schedule a daily `POST /backup/fetch` (systemd timer on fw13). `sandod-backup-fetch.{service,timer}` in `MNW/sando/deploy/`. Runs daily at 04:00 UTC (one hour after prod's 03:00 UTC backup-db.sh). Service uses `EnvironmentFile=/etc/sando/sando.env` for `$SANDO_DAEMON`. Verified 2026-05-31: one-shot test pulled 36MB backup, recorded in `backups` table.
155	-	- [x] First end-to-end `migration_dry_run` against a real prod backup. Passed 2026-05-31 for sha 4541ebc in 1.2s: restored 36MB dump + applied all 133 migrations cleanly. Sha eee96a7 correctly failed `migration_dry_run` because it lacked migrations 123-132 that prod has applied — exactly the prod-vs-repo drift the gate is designed to catch.
156	-	- [x] Document the failure modes: `plans/migration-dryrun-failures.md`. Covers all 7 fail modes (no backup, scratch_url unset, scratch reset, restore, drift, checksum mismatch, content broken against prod data) with operator playbook.
157	-	- [x] Decide retention on `backups` table. 30 days; pruned at end of `backup::fetch`. `DELETE FROM backups WHERE fetched_at < datetime('now', '-30 days')`.
158	-
159	-	### Phase 2 carryovers / adjacent fires
160	-
161	-	- [ ] Offsite backup sync from prod → astra still broken. Diagnosed 2026-05-31: `sync-backup-offsite.sh` was never deployed to prod (`deploy.sh` gap when it was added). `makenotwork@prod` had no SSH key. Generated key + installed pubkey on `max@astra:~/.ssh/authorized_keys`, created `/opt/backups/mnw` on astra. Blocked on Tailscale ACL: astra runs only Tailscale SSH (no regular sshd on a bypass port), and the ACL denies `tag:tagged-devices` (alpha-west-1) → astra as user `max`. Needs ACL update in the Tailscale admin console, then deploy `sync-backup-offsite.sh` to `/opt/makenotwork/` and test. Makenotwork@prod pubkey: `ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAILzyQQ7pmBIZat8fABlpG/opwh4w5GhLIfkX2qxKxuT0 makenotwork@alpha-west-1`.
162	-	- [x] Prod backup `latest.sql.gz` hard link. `backup-db.sh` now maintains `latest.sql.gz` atomically (`ln -f $LATEST.new && mv -Tf .new latest.sql.gz`). Deployed 2026-05-31; manual run verified (nlinks=2).
163	-
164	-	## Phase 3 — Parity with current `deploy.sh`
165	-
166	-	Decisions captured in `plans/config-artifacts.md`. Summary: Caddyfile / systemd unit / backup script / security configs all move to one-time node-bootstrap, not per-deploy. error-pages bake into binary (MNW PR) with sibling fallback. mnw-admin ships alongside server via `bin_names: Vec<String>`. Restart warning is Phase 5, prod-tier-only. Prod migrations: server self-applies on startup (`main.rs:73`), sando does not.
167	-
168	-	- [x] Caddyfile — decided: bootstrap-only. Not per-deploy. (`plans/config-artifacts.md` §1.)
169	-	- [x] systemd unit — decided: bootstrap-only. (§4.)
170	-	- [x] Backup script — decided: bootstrap-only. (§6.)
171	-	- [x] Error pages — short-term done: ship as release-dir sibling. `build_and_run_mm` `cp -a` from `worktree/server/deploy/error-pages/` into the staged release dir; deploy_node's rsync of the whole dir picks it up. Verified on testnot 2026-05-31. Long-term `include_dir!` bake-in still a separate MNW PR.
172	-	- [x] mnw-admin binary — `cfg.bin_names: Vec<String>` (default `["server"]`, MNW uses `["makenotwork","mnw-admin"]`). `deploy_local` copies each from worktree's `target/release/<bin>`; `deploy_node` rsyncs the whole staged dir. `Config::primary_bin()` returns first entry for systemd reference. `versions.artifact_path` stores the primary; release dir is derived as `.parent()`. Verified on testnot 2026-05-31.
173	-	- [x] Security configs — decided: bootstrap-only. (§5.)
174	-	- [ ] Restart warning — Phase 5, prod-tier only via `tier.restart_warning_seconds` in `sando.toml`; needs `CLI_SERVICE_TOKEN` in `/etc/sando/sando.env`. (§7.)
175	-	- [x] Cross-compile from macOS — decided: retire after one sprint of testnot parity verification. fw13 builds natively. (§8.)
176	-	- [x] Prod migrations — decided: server self-applies on startup. Sando does NOT run them. `migration_dry_run` gate is the prod safety net. (§9.)
177	-	- [x] Node-bootstrap script — `MNW/sando/deploy/bootstrap-node.sh`. Idempotent. Takes `SANDO_PUBKEY` (required), `BIN_NAME`, `SERVICE_NAME`, `SERVICE_USER`, `DEPLOY_ROOT` env. Installs base packages (rsync/ufw/fail2ban), optionally postgres/tailscale/caddy, creates deploy user + dirs + sudoers entry + systemd unit, sets up UFW. Deliberately does NOT touch Caddyfile content, certs, postgres role/db, or secrets — those are operator-decisions per-node. testnot was done by hand and matches roughly what the script produces. Test by re-running on the next node added (tier B Hetzner prod move or tier C).
178	-
179	-	## Phase 4 — Cutover
180	-
181	-	Run Sando in parallel with `deploy.sh` until trust is built, then retire the old path.
182	-
183	-	- [ ] First successful Sando-only deploy to testnot.work (tier A). Old `deploy.sh` still primary for prod.
184	-	- [ ] One sprint (two months) of Sando-shadow runs: every `deploy.sh` deploy is also driven through Sando in dry-run mode (gates run, deploys go to a parallel `releases/` dir on prod but don't swap `current`). Compare outcomes.
185	-	- [ ] First Sando-only deploy to Hetzner prod (tier B). `deploy.sh` retained but unused.
186	-	- [ ] Move `server/deploy/deploy.sh` to `server/deploy/archive/deploy.sh.legacy` with a header explaining the cutover; do not delete (reference for the next year).
187	-	- [ ] Decommission astra CI runner (`server/deploy/run-ci.sh`). Sando's `cargo_test` gate replaces it; if any astra-specific checks are still needed (e.g., `cargo audit`), add them as additional gate kinds in `daemon/src/gates.rs`.
188	-	- [ ] Update `CLAUDE.md` and `_meta/docs/operations.md` to point at Sando, not `deploy.sh`.
189	-
190	-	## Phase 5 — Operator UX
191	-
192	-	The TUI polls. The MVP requires you to hand-insert a row for `manual_confirm`. Both are fine for one operator but rough.
193	-
194	-	- [x] Build mutex: single-slot `AppState.active_build: Mutex<Option<AbortHandle>>`; newer `/rebuild` aborts any in-flight build. Cargo commands set `.kill_on_drop(true)` so abort propagates SIGKILL to cargo + rustc children. (Landed 2026-05-31 after observing two concurrent builds racing the scratch DB.)
195	-	- [x] Implement `WS /events`: tail of gate starts/finishes, deploy events, build logs. Event enum in `daemon/src/events.rs`; `broadcast::channel(256)` in `AppState`; emit sites in build.rs, gates.rs, routes.rs (rebuild, promote, rollback, confirm, backup_fetch). Verified 2026-05-31: live JSON envelopes stream to a python `websockets` client.
196	-	- [x] TUI: actions pane. `↑↓`/`jk` select tier; `p` promote (no body — defaults version); `R` rollback; `b` backup fetch; `c` manual_confirm. Action results land in the events log. Daemon URL via `$SANDO_DAEMON`. Built in `tui/src/main.rs` 2026-05-31.
197	-	- [x] `POST /confirm/{tier}` endpoint — inserts `gate_runs` row with `passed=1, gate_kind='manual_confirm'` for the tier's `current_version`. Replaces hand-SQL workaround. Verified 2026-05-31 against tier `a`.
198	-	- [x] TUI live log pane that follows the most recent build / gate run; backed by `WS /events`. 200-event ring buffer, human-formatted per kind. WS auto-reconnects every 3s. Header shows ws connection state.
199	-	- [x] `POST /promote` body — `version` now optional; defaults to predecessor tier's `current_version`. (Unblocks the "promote what just baked" flow.)
200	-
201	-	## Phase 6 — Monitoring + alerting
202	-
203	-	- [ ] Wire fw13 `/metrics` endpoint into the existing MNW Prometheus scrape config; record where the scrape config lives in `_meta/` or wherever monitoring already runs.
204	-	- [ ] Add counters: `sando_builds_total{outcome}`, `sando_gates_total{tier,kind,outcome}`, `sando_deploys_total{tier,outcome}`, `sando_burn_in_remaining_hours{tier}`.
205	-	- [ ] Alert: build failed. Page on first failure (not flap-protected — builds are infrequent).
206	-	- [ ] Alert: migration_dry_run failed. Page immediately. This is the 2026-05-22-class signal.
207	-	- [ ] Alert: a tier has had `current_version` unchanged for > N days while host is green. (Operator forgot to promote.)
208	-
209	-	## Phase 7 — Multi-node B+C
210	-
211	-	Today B is the only prod node. Adding C is the second prod node + CF Load Balancing.
212	-
213	-	- [ ] Provision tier C node (Hetzner or alternate provider — capture rationale).
214	-	- [ ] Update `sando.toml`: set `c.provisioned = true`, add `[[tier.node]]`.
215	-	- [ ] Set up Cloudflare Load Balancing with B + C as origin pool, health-checked.
216	-	- [ ] Verify sequential canary in Sando: deploy to B, wait for CF health-check to mark healthy (probably 30-60s probe interval), then deploy to C. Add a `node.health_url` field and a gate-style wait between nodes.
217	-	- [ ] Document in README that `canary = "parallel"` exists but should never be used for B+C unless you understand the failure modes.
218	-
219	-	## Phase 8 — Postgres-on-D
220	-
221	-	Move Postgres off the prod app node so B+C become truly interchangeable.
222	-
223	-	- [ ] Provision Postgres-only machine D (modest spec; reliability over performance).
224	-	- [ ] Migrate the prod DB from Hetzner app node to D. Capture procedure in `plans/postgres-d-migration.md`.
225	-	- [ ] Update `server` `DATABASE_URL` everywhere (env files on B+C, scratch URL on fw13 stays local).
226	-	- [ ] Replica/HA story stays deferred; D is SPOF for now (per `_meta/preclear/.../decisions.md`).
227	-
228	-	## Phase 9 — Hardening
229	-
230	-	Pick up after cutover is stable.
231	-
232	-	- [ ] Tailnet ACL audit: confirm only the laptop can reach `sandod:7766`. Document the ACL.
233	-	- [ ] Decide if v0.2 needs token auth on `sandod` endpoints (revisit assumption from `decisions.md` once there's a real second operator).
234	-	- [ ] Sando self-deploy: Sando builds and deploys itself through its own pipeline. Bootstraps the bootstrap. Closes the chicken-and-egg loop and is satisfying.
235	-	- [ ] Backup-of-Sando-state: nightly SQLite snapshot to astra. The state DB tracks 6 months of deploys; losing it on a fw13 disk failure would be annoying.
236	-
237	-	## Notes / non-checkbox
238	-
239	-	- WS `/events` and the operator-UX work in Phase 5 can run in parallel with Phase 1-3 once Phase 0 is done. They are sequenced after for review clarity, not because they block anything.
240	-	- "Hotfix override" and `reset_burn_in` flag are already implemented end-to-end (see `decisions.md`); not on this list because there's nothing left to do until prod uses them.
241	-	- C tier exists in the schema as a `provisioned=false` row from day one — adding C in Phase 7 is a TOML edit, not a migration.
242	-	- MakeMachine + EveryCycle are a separate project. The hardware BOM moved to `~/Code/everycycle/docs/hardware/mm-v1-bom.md` on 2026-06-01.
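
Reviewer note on the removed "TUI unit tests" list: the two behaviours it pins down (`Shared::push_event` capping at 200 with FIFO drop, and `ws_url_from` rewriting only the leading scheme) can be sketched with the standard library alone. This is a minimal illustration, not the code in `tui/src/main.rs`: the `Shared` struct layout, the `MAX_EVENTS` constant name, and the exact `ws_url_from` signature are assumptions, and the real function may also append a path such as `/events`.

```rust
use std::collections::VecDeque;

/// Cap taken from the todo text ("ring-buffer cap at 200"); the constant name is hypothetical.
const MAX_EVENTS: usize = 200;

/// Hypothetical stand-in for the TUI's shared state; the real `Shared` is not shown in this diff.
struct Shared {
    events: VecDeque<String>,
}

impl Shared {
    fn new() -> Self {
        Shared { events: VecDeque::with_capacity(MAX_EVENTS) }
    }

    /// Append an event, dropping the oldest entries first once the cap is reached.
    fn push_event(&mut self, line: String) {
        while self.events.len() >= MAX_EVENTS {
            self.events.pop_front();
        }
        self.events.push_back(line);
    }
}

/// Map an HTTP daemon URL to its WebSocket equivalent.
/// Only a leading scheme is rewritten; any other input passes through unchanged.
fn ws_url_from(daemon: &str) -> String {
    if let Some(rest) = daemon.strip_prefix("https://") {
        format!("wss://{}", rest)
    } else if let Some(rest) = daemon.strip_prefix("http://") {
        format!("ws://{}", rest)
    } else {
        daemon.to_string()
    }
}

fn main() {
    let mut shared = Shared::new();
    for i in 0..250 {
        shared.push_event(format!("event {}", i));
    }
    // 250 pushes against a cap of 200 leave events 50..=249 in the buffer.
    println!("{} events, oldest = {:?}", shared.events.len(), shared.events.front());
    println!("{}", ws_url_from("http://100.103.89.95:7766"));
}
```

Using `strip_prefix` rather than `str::replace` is what gives the "only replaces scheme once" property: a URL that happens to contain `http://` later in its path is left alone.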

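Reviewer note on the removed `unsatisfied_gates` test item: the properties it lists (latest row wins, hotfix skips `burn_in` only, other tiers and versions are ignored, a null `passed` counts as failing) amount to a small filter over gate-run rows. The sketch below models that in memory under stated assumptions: the `GateRun` struct, the `id` ordering column, the `required` parameter, and the rule that a gate with no rows at all is unsatisfied are all hypothetical, since the real query in `daemon/src/gates.rs` is not part of this diff.

```rust
/// Hypothetical in-memory shape of a `gate_runs` row; the real schema is not shown here.
#[derive(Clone, Debug)]
struct GateRun {
    id: i64,              // assumed monotonically increasing: higher means later
    tier: &'static str,
    version: &'static str,
    kind: &'static str,
    passed: Option<bool>, // None models a NULL `passed`, i.e. a run still in flight
}

/// Return the required gate kinds that are not currently green for (tier, version).
fn unsatisfied_gates(
    runs: &[GateRun],
    tier: &str,
    version: &str,
    required: &[&'static str],
    hotfix: bool,
) -> Vec<&'static str> {
    required
        .iter()
        .copied()
        // A hotfix skips burn_in and nothing else.
        .filter(|kind| !(hotfix && *kind == "burn_in"))
        .filter(|kind| {
            // Latest row wins, so a red run followed by a green run clears the gate.
            let latest = runs
                .iter()
                .filter(|r| r.tier == tier && r.version == version && r.kind == *kind)
                .max_by_key(|r| r.id);
            // Missing row (assumption), failed row, or NULL `passed` all count as unsatisfied.
            !matches!(latest, Some(r) if r.passed == Some(true))
        })
        .collect()
}

fn main() {
    let runs = vec![
        GateRun { id: 1, tier: "a", version: "0.9.7", kind: "cargo_test", passed: Some(false) },
        GateRun { id: 2, tier: "a", version: "0.9.7", kind: "cargo_test", passed: Some(true) },
        GateRun { id: 3, tier: "a", version: "0.9.7", kind: "migration_dry_run", passed: None },
    ];
    let required = ["cargo_test", "migration_dry_run", "burn_in"];
    // With hotfix = true this prints ["migration_dry_run"].
    println!("{:?}", unsatisfied_gates(&runs, "a", "0.9.7", &required, true));
}
```

Treating `None` as failing is the conservative choice for the in-flight race the todo mentions: a promote issued while a gate is still running is blocked rather than waved through.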
D server/docs/audit_review.md -500

		@@ -1,1192 +0,0 @@
1	-	# Ultra Fuzz Report — MNW Server (Run #9 — launch eve)
2	-
3	-	Run date: 2026-05-31 (evening)
4	-	Run number: 9 (launchplan_final.md §1.5 referred to it as "Run #5" — stale; this is the 9th)
5	-	Trigger: launchplan §1.5 pre-launch pass
6	-
7	-	## Run #9 headline
8	-
9	-	Run #8 closed with "BAR MET — ALL FIVE AXES A-". Run #9 went deeper and surfaced 1 CRITICAL + 4 SERIOUS + several MED/HIGH items the prior 8 runs missed. All four launch-critical items fixed in-session; remaining items deferred with rationale below.
10	-
11	-	| Axis | Run #8 | Run #9 | Direction |
12	-	|------|--------|--------|-----------|
13	-	| Payments | A- | A- | flat — 2 new SERIOUS surfaced; 1 fixed (webhook unmark on dual-failure 503), 1 deferred (subscription out-of-order webhook) |
14	-	| Storage | A- | A- | flat — 1 new HIGH (migration 129 dead-letter table unused) + 2 MEDs (is_s3_key_live unindexed full-scan, LIKE-suffix false-positive); deferred |
15	-	| UX Wiring | A- → B- → A- | A- | dipped on grade-cap for signup TOCTOU CRITICAL, restored after fix |
16	-	| Security | A- | A- | flat — 2 new SERIOUS, both fixed (JWT-bump non-atomic, 2FA email IP spoofable) |
17	-	| Performance | A- | A- | flat — 2 new HIGH (per-request reqwest::Client::new in 5 hot paths, unbounded spawn in expired-account cleanup); deferred to post-launch |
18	-
19	-	Net Run #9 (post-fix): 0 CRITICAL · 1 SERIOUS open (Payments subscription ordering — documented deferral) · 3 HIGH open (deferred) · 7 MED open (deferred). Launchplan §1.5 A- bar holds.
20	-
21	-	## Run #9 — CRITICAL fixed in-session
22	-
23	-	### UX-CRITICAL — Signup TOCTOU: race → 500 + form loss → FIXED 2026-05-31
24	-
25	-	`src/routes/pages/public/join_wizard.rs:99-139`. The wizard ran separate `get_user_by_username` / `get_user_by_email` checks before `create_user`. A concurrent signup with the same username or email slipping between SELECT and INSERT raised a 23505 unique violation that bubbled to `AppError::Database` → 500 "Something went wrong" — and the user's entire typed-in form was lost. On a public alpha-launch surge this is the highest-traffic public endpoint; the wrong page to be returning 500s on.
26	-
27	-	Fix landed: `create_user` call site now matches `AppError::Database(sqlx::Error::Database(_))` with code 23505, inspects the constraint name (`users_username_key` / `users_email_key`), and routes through `return_error(..)` with a friendly message — same flow as the explicit pre-check branches. Same shape as the existing 23505 handling in `db/license_keys.rs`, `db/builds.rs`, `routes/api/guest_checkout.rs`.
28	-
29	-	Known follow-up (not blocking): the form-reload still loses typed values on the error swap; `return_error` renders `LoginErrorTemplate` (message-only). Preserving field values would require threading them through the template — file a separate Phase 4 polish item.
30	-
31	-	## Run #9 — SERIOUS fixed in-session
32	-
33	-	### Sec-SERIOUS — `delete_all_sessions_for_user` non-atomic JWT bump → FIXED 2026-05-31
34	-
35	-	`src/db/sessions.rs:247-263`. The function ran `DELETE FROM user_sessions` then a separate `UPDATE users SET jwt_invalidated_at = NOW()` on independent connections. If the UPDATE dropped (pool timeout, conn drop, postgres restart), session cookies were dead but every outstanding SyncKit JWT survived until natural expiry — exactly the leak this function exists to prevent. The in-code comment ("a session row deleted without a JWT bump is harmless, the converse would leak access") inverted reality.
36	-
37	-	Fix landed: both writes wrapped in `pool.begin()` / `tx.commit()`. Comment updated.
38	-
39	-	### Sec-SERIOUS — 2FA login-notification email uses spoofable IP → FIXED 2026-05-31
40	-
41	-	`src/routes/pages/public/two_factor.rs:308-312`. The 2FA-completion path read `x-forwarded-for` raw (first-comma-split) for the new-login email's IP field. Every other login surface (`routes/auth.rs:242`, `auth.rs:486`, `auth.rs:528`) routes through `crate::helpers::extract_client_ip` which prioritizes `CF-Connecting-IP`. An attacker who already captured a password could pre-set `X-Forwarded-For: 1.2.3.4` on the verify-2fa POST so the "new login from <city>" email lied about origin — the exact email users are told to trust for compromise detection.
42	-
43	-	Fix landed: swapped to `crate::helpers::extract_client_ip(&headers)`. One-line change, parity restored.
44	-
45	-	### Pay-SERIOUS — Webhook dual-failure dropped events silently → FIXED 2026-05-31
46	-
47	-	`src/routes/stripe/webhook/mod.rs:73-89`. Dedup row was marked processed before handler dispatch (correct for at-least-once). On `(handler_err, insert_failed_event_err)` dual failure, code returned 503 to trigger Stripe redelivery — but Stripe's redelivery would short-circuit at the dedup check (line 50) and 200 the event without ever processing it. The code's own comment acknowledged the bug; the right tool (`unmark_event_processed`, defined 30 lines away in `db/webhook_events.rs:40`) was never called.
48	-
49	-	Fix landed: call `db::webhook_events::unmark_event_processed(&state.db, &event_id)` before returning 503, with logged-error best-effort if even that fails (same scenario where 503 was already wrong).
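
The dedup flow described above can be sketched with an in-memory set standing in for the dedup table. This is a minimal illustration, not the project's handler: `Dedup`, `Outcome`, and the `Parked` branch (handler failed but the failure row was stored) are hypothetical names and assumptions. The point it shows is that the event id has to be unmarked before asking for redelivery, otherwise the retry is swallowed by the dedup check.

```rust
use std::collections::HashSet;

/// What the HTTP layer would translate into a status code (hypothetical).
#[derive(Debug, PartialEq)]
enum Outcome {
    Processed,      // handler ran successfully
    Duplicate,      // dedup hit, event skipped
    Parked,         // handler failed, failure recorded for later (assumed behaviour)
    RetryRequested, // dual failure: ask the sender to redeliver (503)
}

/// In-memory stand-in for the dedup table.
struct Dedup {
    processed: HashSet<String>,
}

impl Dedup {
    fn new() -> Self {
        Self { processed: HashSet::new() }
    }

    fn handle(
        &mut self,
        event_id: &str,
        handler: impl FnOnce() -> Result<(), String>,
        record_failure: impl FnOnce() -> Result<(), String>,
    ) -> Outcome {
        // Mark before dispatch; `insert` returns false if the id was already present.
        if !self.processed.insert(event_id.to_string()) {
            return Outcome::Duplicate;
        }
        match handler() {
            Ok(()) => Outcome::Processed,
            Err(_) => match record_failure() {
                Ok(()) => Outcome::Parked,
                Err(_) => {
                    // Dual failure: unmark so the redelivery is not short-circuited.
                    self.processed.remove(event_id);
                    Outcome::RetryRequested
                }
            },
        }
    }
}
```

With the `remove` call in place, a redelivery of the same event id reaches the handler again instead of returning early as a duplicate.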
50	-
51	-	## Run #9 — DEFERRED with rationale (above A- bar)
52	-
53	-	### Pay-SERIOUS — Subscription webhook out-of-order events resurrect `active`
54	-
55	-	`src/routes/stripe/webhook/subscriptions.rs:90, 116, 140`. Handlers blindly overwrite `subscriptions.status` and `period_end` from the webhook payload. Stripe does NOT guarantee delivery order. Sequence `past_due → active` reordered as `active → past_due → active(stale)` overwrites a legitimate `past_due` with stale `active` — restoring access for a user who hasn't paid.
56	-
57	-	Deferral rationale: worst case is restored access for a few minutes until the next webhook arrives. Fix requires re-extracting Stripe's top-level `created` from `UntypedEvent` (currently dropped) and adding `WHERE last_event_at IS NULL OR last_event_at <= $created` guards on every status/period write across Fan+, creator-tier, and synckit code paths — non-trivial cross-cutting change. Post-launch fix in Phase 4; tracked in todo.md.
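
The `last_event_at` guard proposed in the rationale can be sketched in memory as follows. This is an illustration under assumptions: the `Subscription` struct, the `Status` enum, and the `apply` method are hypothetical, and the real fix would live in the SQL `WHERE` clause rather than in application code.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Status {
    Active,
    PastDue,
}

#[derive(Debug)]
struct Subscription {
    status: Status,
    /// `created` timestamp (Unix seconds) of the last event applied, if any.
    last_event_at: Option<i64>,
}

impl Subscription {
    /// Applies an event only if it is not older than the last one applied.
    /// Mirrors `WHERE last_event_at IS NULL OR last_event_at <= $created`.
    /// Returns true if the write happened.
    fn apply(&mut self, status: Status, created: i64) -> bool {
        let fresh = self.last_event_at.map_or(true, |seen| seen <= created);
        if fresh {
            self.status = status;
            self.last_event_at = Some(created);
        }
        fresh
    }
}
```

A stale `Active` event that arrives after a newer `PastDue` event is ignored, so the status stays `PastDue`.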
58	-
59	-	### Sto-HIGH — Migration 129 dead-letter table never written
60	-
61	-	`migrations/129_pending_s3_deletions_dead_letter.sql` creates `pending_s3_deletions_dead_letter` and documents it as "operator-visible parking lot... require manual triage." `src/scheduler/cleanup.rs:453-457` on `attempts >= 10` only logs `tracing::error!` then removes the row — never inserts into the dead-letter table. Permanently-failing keys have zero operator visibility.
62	-
63	-	Deferral rationale: operational, not runtime. No user impact; only operators lose triage signal. One-INSERT fix; bundle into Phase 4.
64	-
65	-	### Perf-HIGH — Per-request `reqwest::Client::new()` in 5 hot paths
66	-
67	-	`routes/pages/dashboard/main.rs:118`, `routes/pages/public/landing.rs:284`, `routes/api/internal/cli_features.rs:440`, `routes/api/domains.rs:319`, `auth.rs:559`. Each call builds a fresh TCP pool, TLS context, DNS resolver — no keep-alive across requests. `MtClient` in `AppState` already keeps a pooled client; the dashboard bypasses it.
68	-
69	-	Deferral rationale: real but matters at scale. Private alpha launch traffic well below where this becomes a tail-latency contributor. 30-min refactor; bundle into Phase 4 once launch traffic settles.
70	-
71	-	### Perf-HIGH — Unbounded `tokio::spawn` in expired-account cleanup
72	-
73	-	`src/scheduler/cleanup.rs:215-220` (`spawn_expired_account_cleanups`). Daily tick spawns one task per expired account, no governor. `cleanup_sandbox_accounts` (same file, ~100 lines above) correctly caps at `CLEANUP_PARALLELISM=4` via `JoinSet`; the terminated/content-removal variants don't. A backlog of 200 expired accounts fans out into 200 concurrent S3 prefix listings racing for the 25-conn pool at midnight.
74	-
75	-	Deferral rationale: runs once daily; current expired-account count is small (private alpha). Trivial fix (lift the existing JoinSet pattern); not launch-blocking. Bundle with Phase 4.
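
The idea of capping fan-out can be illustrated without an async runtime. The sketch below swaps the tokio `JoinSet` pattern for plain `std::thread::scope` workers pulling from a shared index, so it is an analogue of the technique and not the code in `cleanup.rs`. `run_bounded` is a hypothetical name. Only the constant `CLEANUP_PARALLELISM = 4` comes from the report.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

const CLEANUP_PARALLELISM: usize = 4;

/// Runs `job` once per account id using at most `CLEANUP_PARALLELISM` threads.
/// Returns the peak number of jobs that were in flight at the same time.
fn run_bounded<F>(account_ids: &[u64], job: F) -> usize
where
    F: Fn(u64) + Sync,
{
    let next = AtomicUsize::new(0);
    let in_flight = AtomicUsize::new(0);
    let peak = AtomicUsize::new(0);
    let workers = CLEANUP_PARALLELISM.min(account_ids.len());

    thread::scope(|s| {
        for _ in 0..workers {
            s.spawn(|| loop {
                // Claim the next unprocessed index; stop when the list is exhausted.
                let i = next.fetch_add(1, Ordering::SeqCst);
                let Some(&id) = account_ids.get(i) else { break };

                let now = in_flight.fetch_add(1, Ordering::SeqCst) + 1;
                peak.fetch_max(now, Ordering::SeqCst);
                job(id);
                in_flight.fetch_sub(1, Ordering::SeqCst);
            });
        }
    });

    peak.load(Ordering::SeqCst)
}
```

With 200 ids, every id is still processed, but no more than four jobs run concurrently, which keeps the work from competing for the whole connection pool.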
76	-
77	-	## Run #9 — MED/LOW deferred (read-only carry-forward, in todo.md)
78	-
79	-	- Pay-MED: `pricing.rs::parse_dollars_to_cents` misinterprets European decimal comma (`1,23` → 12300¢). User-controlled input; fixable in a single regex.
80	-	- Pay-MED: SyncKit app-sub checkout silently defaults `storage_limit_bytes` to 0 if metadata missing.
81	-	- Pay-MED: Guest checkout email falls back to `"unknown@guest"` sentinel; collisions possible.
82	-	- Sto-MED: `is_s3_key_live` runs 7 EXISTS subqueries on unindexed `items.audio_s3_key` / `cover_s3_key` / `video_s3_key` / `versions.s3_key` etc — sequential scans per retry.
83	-	- Sto-MED: `is_s3_key_live` LIKE-suffix pattern `'%' || s3_key` false-positives on neighboring keys (key `abc/file.png` matches `xabc/file.png`) — skips a legitimate delete → S3 object leaks.
84	-	- UX-MED: "Log in" return_to query param in `purchase.html:145` is dead-wired — login handler always redirects `/dashboard`. Lost purchase intent.
85	-	- UX-MED: Admin user filter buttons (`admin-users.html:35-44`) use `class="primary"` / `class="secondary"` instead of `btn-primary` / `btn-secondary` — renders unstyled.
86	-	- UX-LOW: Pagination links in `git/issues.html:72,76` don't URL-encode `search`; `&page=99` in search query corrupts pagination.
87	-	- UX-LOW: 5 sites do `.render().unwrap_or_default()` on Askama templates (blank UI on render failure, no log).
88	-	- UX-LOW: `slugify` in `formatting.rs` produces `"post"` for any non-ASCII title; international creators get opaque URLs.
89	-	- Sec-MINOR: `csrf.rs:176-185` `validate_token_consuming` doesn't consume — name promises stronger property than implementation.
90	-	- Sec-MINOR: `routes/oauth.rs:101-111` `is_localhost_redirect` allows any port on localhost regardless of registered URI.
91	-	- Sec-MINOR: `routes/pages/public/two_factor.rs::pending_2fa_started_at` reads `i64` via session.get; type mismatch silently → None → instantly-expired.
92	-	- Sec-MINOR: `scanning/archive.rs:124` path-traversal check misses lone `..` segment (no trailing separator).
93	-	- Perf-LOW: `scheduler/announcements.rs` linear walk through subscriber list in a single spawned task; no checkpointing.
94	-	- Perf-LOW: `db/page_views.rs` `pending` HashMap has no max-cardinality cap (crawler hitting 100k unique target_ids before tick).
95	-	- Perf-LOW: `build_runner.rs:441` local artifact tmpfile leaks if process crashes between SCP and `remove_file`.
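
For the `scanning/archive.rs:124` item in the list above (a lone `..` segment slipping past the check), one way to avoid string matching on separators is to inspect path components. This is a sketch under assumptions: `is_safe_entry_path` is a hypothetical helper, not the function in the repo, and it relies on the platform's separator rules (on Unix a backslash is not treated as a separator, so Windows-style entry names would need separate handling).

```rust
use std::path::{Component, Path};

/// Returns true only if every component of the archive entry name is a plain
/// name or `.`; rejects `..`, absolute paths, and Windows prefixes.
fn is_safe_entry_path(name: &str) -> bool {
    if name.is_empty() {
        return false;
    }
    Path::new(name)
        .components()
        .all(|c| matches!(c, Component::Normal(_) | Component::CurDir))
}
```

Because the check works per component, `..` is rejected whether or not a separator follows it, while a file that merely starts with two dots (for example `..hidden`) is still accepted.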
96	-
97	-	## Run #9 — mandatory surprises
98	-
99	-	- Payments: `routes/stripe/webhook/mod.rs:82-89` literally documents the bug it ships ("the dedup row was already marked processed... Stripe won't retry") and then chooses 503 anyway. The fix (`unmark_event_processed`) sat 30 lines away in the same crate, never called. Scar-tissue-comment-without-the-fix is a recognizable pattern across the codebase.
100	-	- Storage: `routes/storage/mod.rs::commit_upload` sealed-helper pattern (Run #7 fix for the chronic disease) is the strongest piece of structural engineering in the repo — turned an enum into a witness type. But the neighbor file `migrations/129_pending_s3_deletions_dead_letter.sql` shows the opposite: migration written with detailed prose explaining the operator's parking lot, and the actual INSERT never wired up. Two adjacent fixes from the same audit-cycle, one structural and load-bearing, one ceremonial and silently broken.
101	-	- UX: `csrf.rs` `PostureMethodRouter` + sealed `CsrfManuallyValidated` witness make registering a mutation route without an explicit posture declaration uncompilable. A+ engineering. The contrast with the signup wizard's TOCTOU-and-500-with-lost-form is jarring — defensive depth on CSRF, none on the front door.
102	-	- Security: `routes/auth.rs:128-130` malformed-email branch skips the DUMMY_HASH timing equalizer that was added explicitly to prevent timing-side-channel user enumeration. ~2 orders of magnitude faster than every other failure path. The equalizer exists; this one path bypasses it.
103	-	- Performance: `db/projects.rs::get_project_ids_for_user` is the only `fetch_all` in `projects.rs` without a `LIMIT`. Its neighbor `get_projects_by_user` caps at 500 with a documented safety comment. Cyber-squatter with 10k projects + account expiry → 10k S3 prefix-deletes in one spawned task. Asymmetric defense within the same module.
104	-
105	-	## Run #9 — stress-tested OK
106	-
107	-	Verified attacks the code survived (high-confidence positives):
108	-
109	-	- Stripe webhook signature replay (HMAC constant-time, multi-secret rotation, timestamp tolerance both directions)
110	-	- Promo code concurrent over-use (single atomic UPDATE with max_uses + expires_at + starts_at)
111	-	- Cart race past pre-check (23505 fallback aborts cleanly without charging)
112	-	- License key prediction (6 wordlist × CSPRNG ≈ 66 bits)
113	-	- Pre-signed URL Content-Length binding (S3 rejects mismatch at protocol level)
114	-	- Storage cap atomicity (`try_replace_storage` single UPDATE)
115	-	- Build claim race (partial unique index + 23505 backstop)
116	-	- Idempotent re-confirms in all 4 upload confirm handlers (reaper-deletes-live-object closed)
117	-	- Session row + JWT atomicity (post-fix verified above)
118	-	- TOTP replay across skew window (matched-step tracked + strict `>` gate)
119	-	- OAuth PKCE downgrade (S256 pinned at authorize + token-exchange)
120	-	- CSRF body bypass via textarea-smuggled token (proper form parser)
121	-	- Git diff/blame XSS (HTML-escaped in attacker-controlled spots)
122	-	- Internal error leakage (tests assert no PG host, no S3 bucket, no sqlx variant leaks)
123	-
124	-	## Run #9 confidence per axis
125	-
126	-	- Payments HIGH (~70% LoC read this pass; Phase 4 backlog visible)
127	-	- Storage HIGH (full module read; cleanup.rs upper half only — MEDIUM there)
128	-	- UX Wiring HIGH for CSRF/error/validation; MEDIUM for wizard step partials, embed routes, dashboard CSV import
129	-	- Security HIGH for auth/CSRF/session; MEDIUM for scanning (YARA rule content unread), API key scoping
130	-	- Performance HIGH for scan worker, scheduler, storage, build_runner; MEDIUM for SyncKit, postmark, import pipeline
131	-
132	-	## Run #9 bug counts
133	-
134	-	| Severity | Payments | Storage | UX | Security | Perf | Total |
135	-	|---|---|---|---|---|---|---|
136	-	| CRITICAL | — | — | 1 (FIXED) | — | — | 1 |
137	-	| SERIOUS | 2 (1 FIXED, 1 deferred) | — | — | 2 (FIXED) | — | 4 |
138	-	| HIGH | — | 1 (deferred) | — | — | 2 (deferred) | 3 |
139	-	| MED | 3 (deferred) | 2 (deferred) | 2 (deferred) | — | — | 7 |
140	-	| LOW/NOTE | 2 | — | 3 | 4 | 3 | 12 |
141	-
142	-	## Run #9 delta vs Run #8
143	-
144	-	- 1 CRITICAL surfaced + fixed (signup TOCTOU); class missed by prior 8 runs because no agent explicitly probed the public-signup race window
145	-	- 4 SERIOUS surfaced; 3 fixed in-session, 1 deferred with rationale
146	-	- Run #8 "BAR MET" claim was correct for the surfaces it audited but understated: this pass added explicit attack-vector probing for cross-conn atomicity, IP spoof parity across auth surfaces, and webhook dedup edge paths — none of which were in prior runs' scope
147	-	- All previously closed Run #8 fixes verified intact (commit_upload seal, S1 tx atomicity, background.rs queue, cart MEDs)
148	-
149	-	---
150	-
151	-	# Ultra Fuzz Report — MNW Server (Run #8 — historical)
152	-
153	-	Run date: 2026-05-31
154	-	Run number: 8
155	-
156	-	## Run #8 Headline
157	-
158	-	| Axis | Run #5 | Run #6 | Run #7 | Run #8 | Direction |
159	-	|------|--------|--------|--------|--------|-----------|
160	-	| Payments | B | B+ | A- | A- | flat — H2 still deferred; 2 new MEDs surfaced (cart `min_price_cents` bypass, cart-all chain-break on all-free first seller) |
161	-	| Storage | B- | A- | B+ | A- | ↑ H1 + S1 fixes verified closed; commit_upload seal intact across all 7 confirm handlers; genericization clean at every caller including synckit/blobs.rs |
162	-	| UX Wiring | B | A- | A- | A- | flat — 1 new MED (item-wizard `pricing_model` silent fallback to "free" — same disease class fixed in project wizard at Run #6, not propagated) |
163	-	| Security | A- | A- | A- | A- | flat — only diff in scope (username availability fail-closed) is a net improvement; MED backlog identical to Run #5/#6/#7 |
164	-	| Performance | B- | A- | A- | A- | flat with 1 new SERIOUS — webhook `checkout_helpers.rs` unbounded `tokio::spawn` (send_purchase_emails / mailing_list / tip_email) competes with request handlers for the 25-slot pool under burst |
165	-
166	-	Net Run #8: 0 CRITICAL · 1 SERIOUS new (Perf webhook spawn) — FIXED 2026-05-31 · 5 new MED — ALL FIXED 2026-05-31 · 1 SERIOUS previously-deferred (Payments H2 `claim_free_project` soft race) — FIXED 2026-05-31.
167	-
168	-	Post-Run #8 status (2026-05-31 end-of-day): 0 CRITICAL · 0 SERIOUS · 0 MED open from any prior run. All five axes A-, all above-MED items closed, all Run #8 MEDs closed, prior-deferred SERIOUS closed. Launchplan §1.5 bar fully cleared.
169	-
170	-	2026-05-31 post-Run-#8 backlog sweep (7 waves): 24 of 26 carried MED/LOW/NOTE items closed across Storage (5), Security (8), Performance (3), UX (2), Payments (2), Auth (4). Two deferred with rationale: `build_runner.rs` serial targets (LOW, builds run rarely, refactor touches denominator) and `scheduler/mod.rs` advisory-lock granularity (multi-replica concern, single-process today). New schema migration `133_items_duration_seconds_nonnegative.sql` pins the negative-duration invariant in the DB. New `commit_rescan` helper extends the chronic-disease commit_upload seal to admin paths. Tests: 1655 / 0.
171	-
172	-	Launchplan §1.5 bar: ALL 5 AXES AT A- — BAR MET. The new Perf SERIOUS is axis-internal and the agent kept Perf at A- (machinery wins outweigh; same shape as previously-closed `record_view` per-request spawn — apply mpsc + drainer pattern). New Payments MEDs and UX MED are launch-quality items worth addressing or documenting before ship; none are A- blockers.
173	-
174	-	## Run #8 — new findings above MED
175	-
176	-	### P-SERIOUS — Webhook hot-path unbounded `tokio::spawn` (Performance) — FIXED 2026-05-31
177	-	`src/routes/stripe/webhook/checkout_helpers.rs:58, 96, 124, 290` + `src/routes/stripe/webhook/checkout.rs:618`. `send_purchase_emails`, `subscribe_buyer_to_mailing_list`, `send_tip_email`, `send_guest_sale_notification`, guest-purchase-confirmation each `tokio::spawn` from the webhook handler. Multi-item cart fires N spawns per webhook; each task acquires 1-2 pool conns + a Postmark call. No JoinSet, no cap. Under burst, hundreds of detached tasks competed with request handlers for the 25-slot pool. Same shape as the Run #4 `record_view` per-request spawn (fixed via mpsc + drainer).
178	-
179	-	Fix landed: new generic `src/background.rs` module — `BackgroundTx` + `spawn_pool()` with bounded mpsc (capacity 1024) + semaphore-bounded concurrent execution (8 workers, well below `DB_POOL_MAX_CONNECTIONS=25`). `state.bg.spawn(name, fut)` is non-blocking; queue overflow logs a warning and drops the task. The `spawn_email!` macro was refactored to use the bg queue (covers 17 callers across auth/admin/follows/library/two_factor/stripe webhook/login flows). The 5 manual webhook `tokio::spawn` sites were also migrated. Per-request email sends from postmark issue replies (×2), guest-claim email, and join-wizard signup (×2) were migrated in the same pass — same disease, same fix.
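
The "bounded queue, non-blocking enqueue, drop on overflow" behaviour can be sketched with the standard library alone. This swaps tokio's mpsc and semaphore for `std::sync::mpsc::sync_channel`, so it only illustrates the enqueue side. `BackgroundQueue` and its methods are hypothetical stand-ins, not the `BackgroundTx` API from `src/background.rs`.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

type Task = Box<dyn FnOnce() + Send + 'static>;

/// Bounded task queue: enqueue never blocks, overflow drops the task.
struct BackgroundQueue {
    tx: SyncSender<(&'static str, Task)>,
}

impl BackgroundQueue {
    fn new(capacity: usize) -> (Self, Receiver<(&'static str, Task)>) {
        let (tx, rx) = sync_channel(capacity);
        (Self { tx }, rx)
    }

    /// Returns true if the task was queued, false if it was dropped.
    fn spawn(&self, name: &'static str, task: impl FnOnce() + Send + 'static) -> bool {
        let task: Task = Box::new(task);
        match self.tx.try_send((name, task)) {
            Ok(()) => true,
            Err(TrySendError::Full(_)) => {
                // Queue is at capacity: log and drop rather than block the caller.
                eprintln!("background queue full, dropping task {name}");
                false
            }
            Err(TrySendError::Disconnected(_)) => false,
        }
    }
}
```

A drainer would own the `Receiver` and run tasks with its own concurrency limit. The trade-off is the one the fix accepts: under sustained overload some emails are dropped with a warning instead of piling up as detached tasks.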
180	-
181	-	Out of scope for this fix (different bug shapes; defer to Phase 4 polish or own remediation): import pipeline (long-running, needs own bound), MT community creation (single outbound HTTP, minor pool pressure), creator departure notification + status broadcast (broadcast-class — use `broadcast.rs` JoinSet pattern), idempotency-store post-response (trivial DB write), build_runner (already gated by claim flow), scheduler/monitor/scanning/page_views (background workers, not per-request).
182	-
183	-	### Payments MED — Cart `min_price_cents` bypass — FIXED 2026-05-31
184	-	Both cart paths (`process_seller_checkout` and `create_cart_checkout`) now check `pc.min_price_cents` for non-platform Discount codes before applying the discount. Cart skips the ineligible item (others may still qualify) rather than rejecting the whole cart — matches the existing scope-skip pattern.
185	-
186	-	### Payments MED — Cart-all chain-break on all-free first seller — FIXED 2026-05-31
187	-	`process_seller_checkout` signature changed `Result<String>` → `Result<Option<String>>`; all-free path now returns `Ok(None)` instead of `Err(BadRequest)`. New `drain_to_paid` helper loops through the queued sellers until a paid one is reached (returns URL) or queue exhausted (returns `Ok(None)` → library redirect). Both callers (`create_cart_checkout_all` and `checkout_success`) updated to use it.
188	-
189	-	### UX MED — Item wizard `pricing_model` silent fallback — FIXED 2026-05-31
190	-	`save_pricing` now rejects missing pricing_model with `AppError::validation("Select a pricing model")` and rejects unknown values with `format!("Unknown pricing model: {other}")`. Same shape as the project wizard Run #6 fix.
191	-
192	-	### UX MED — Inline-JS template duplication — FIXED 2026-05-31
193	-	Added delegated `data-copy-link` click handler to `static/mnw.js` with proper `.catch()` (falls back to `window.prompt` in non-secure contexts — better than the silent-no-op the inline snippets shipped with). 8 templates migrated from `onclick="navigator.clipboard.writeText(...).then(...)"` to `<a href="..." data-copy-link>Copy link</a>` (audio_player, blog_post, collection, item, project, text_reader, user, video_player). `href` is the real URL so middle-click / no-JS / share menus still work. Cache-bust query bumped to `v=0531`.
194	-
195	-	### Perf MED — Cart free-claim N+1 — FIXED 2026-05-31
196	-	Extended `CartItem` with `enable_license_keys` + `default_max_activations` (both cart queries pull them through). Three free-claim loops (single-seller paid path, discount-zeroed promo path, chain-flow path) drop the per-item `get_item_by_id` and replace per-item `remove_from_cart` DELETE with a single bulk `remove_from_cart_bulk(..., ANY($2))` at the end of each loop. Per-item tx for `claim_free_item` stays (the per-item claim-vs-already-purchased return value is load-bearing for sales-count increment). Roundtrips per free item dropped from ~5-7 to ~3-4; per-loop DELETEs from N to 1.
197	-
198	-	## Run #8 — verified standing (storage fixes from session)
199	-
200	-	- H1 (`uploads.rs::confirm_upload` L295-337) — three-arm match correct. Zero-rows arm rolls back (replace path = `try_replace_storage` swap-back with `i64::MAX` cap; fresh-upload path = `decrement_storage_used`), then `enqueue_s3_orphan(new_key)`, returns BadRequest "Item was modified concurrently." Returns BEFORE `commit_upload` and BEFORE `remove_pending_upload` — pending_uploads row left as reaper second-line defense.
201	-	- S1 (`media.rs::media_confirm` L241-293) — single `state.db.begin()` wraps storage credit + pending_uploads clear + media_files INSERT. S3 IO entirely outside tx. tx drop → Postgres ROLLBACK → all three writes reverted atomically. 23505 detection via typed `AppError::Database(sqlx::Error::Database(...))` pattern works post-rollback. S3 cleanup fires on every tx-failure branch.
202	-	- Genericization — `pending_uploads::remove_pending_upload` and `media_files::create` now `impl PgExecutor<'e>`. All 12 callers (including `synckit/blobs.rs:157`) still compile and execute correctly.
203	-	- Pool pressure delta from S1 tx — neutral-to-better. Prior code grabbed 3 separate conns serially; new code grabs 1 conn for ~3× the duration. Users-row write lock held ~ms. Per-user serialization for sub-second uploads acceptable.
204	-
205	-	## Run #8 — mandatory surprises
206	-
207	-	- Payments: `compute_splits` more careful than its comment promises — remainder-distribution loop constrained by `expected_total = amount * raw_total_pct.min(100) / 100`, so under-100% splits keep the owner's share AND distribute floor-rounding remainders up to bound. Proptest-style invariant tests fully fence it.
208	-	- Storage: `try_increment_storage_on` inside the tx holds a row-level lock on `users` for the duration of the tx. Not a bug (sub-ms hold; cap can't be over-shot via WHERE re-evaluation under READ COMMITTED). But every media confirm now serializes per-user against every other storage write.
209	-	- UX: Copy-link button is a chimera. Nine templates copy the same inline `onclick` that calls `navigator.clipboard.writeText`, mutates `this.textContent` to `"Copied!"` — silently broken in any tab loaded over plain HTTP, in iframes, or with restrictive CSP. No `.catch()` → no fallback, no error.
210	-	- Security: `routes/auth.rs:128-130` malformed-email branch skips DUMMY_HASH timing equalizer. ~2 orders of magnitude faster than every other failure path — distinguishes "you submitted an invalid-email-shaped string" from "valid email, unknown account." Real timing oracle a few lines above the equalizer that was deliberately added to prevent exactly this.
211	-	- Performance: `metrics::idempotency_middleware` does a DB SELECT on EVERY POST/PUT with an `Idempotency-Key` header BEFORE the handler runs. No bloom filter, no negative cache. ~1 extra ms per POST already doing 2-5 DB queries — free 20%+ on POST p50 available by adding an in-memory `seen` set.
212	-
213	-	## Run #8 bug counts
214	-
215	-	| Severity | Payments | Storage | UX | Security | Perf | Total |
216	-	|---|---|---|---|---|---|---|
217	-	| CRITICAL | — | — | — | — | — | 0 |
218	-	| SERIOUS | 1 (deferred) | — | — | — | 1 (new) | 2 |
219	-	| MED | 2 (new) | 7 | 5 | 8 | 5 | 27 |
220	-	| LOW/NOTE | 5 | 3 | 4 | 3 | 2 | 17 |
221	-
222	-	## Run #8 confidence per axis
223	-
224	-	- Payments HIGH (~70% LoC read)
225	-	- Storage HIGH (full)
226	-	- UX HIGH
227	-	- Security HIGH (scoped); MEDIUM for storage-route auth side-effects
228	-	- Performance HIGH
229	-
230	-	## Run #8 delta vs Run #7
231	-
232	-	- Storage B+ → A-. H1 + S1 fixes verified closed. Genericization clean.
233	-	- Payments A- flat. 2 new MEDs (cart `min_price_cents` bypass, cart-all chain-break) surfaced via expanded coverage; H2 deferred unchanged.
234	-	- UX A- flat. 1 new MED (item-wizard `pricing_model` silent fallback) — same disease class as project wizard fix from Run #6, not propagated.
235	-	- Security A- flat. Net improvement (username fail-closed). MED backlog identical.
236	-	- Performance A- flat. 1 new SERIOUS (webhook unbounded spawn) — same shape as Run #4 `record_view` fix. Cart free-flow N+1 (MED) — Run #5 fix covered paid only.
237	-
238	-	---
239	-
240	-	# Ultra Fuzz Report — MNW Server (Run #7 — historical)
241	-
242	-	Run date: 2026-05-31
243	-	Run number: 7 (+ S1 + Storage code-fuzz fixes confirmed in Run #8)
244	-
245	-	## Headline
246	-
247	-	| Axis | Run #5 | Run #6 | Run #7 | Direction |
248	-	|------|--------|--------|--------|-----------|
249	-	| Payments | B | B+ | A- | ↑↑ Phase 2 + Run #6 + Run #7 fixes all landed; S1 cart 23505 swallow fixed post-Run #7; H2 claim_free_project soft race deferred |
250	-	| Storage | B- | A- | B+ → A- pending Run #8 | ↑/↓ commit_upload structural fix is excellent; Run #6 idempotency fix introduced HIGH-1 (pending_uploads leak in 4 sites) + HIGH-2 (missing rollback on update_*_url) — both fixed post-Run #7. Storage code-fuzz 2026-05-31 surfaced H1 (confirm_upload silent zero-rows + side-effects-already-fired) and reopened S1 media_confirm tx atomicity — both fixed in same session |
251	-	| UX Wiring | B | A- | A- | ↑ field-aware deletion + parse_dollars_to_cents shared; pricing_model silent fallback HIGH found and fixed post-Run #7 |
252	-	| Security | A- | A- | (unchanged) | flat — no security-touching changes in Runs #6/#7 |
253	-	| Performance | B- | A- | (unchanged) | flat — no perf-touching changes in Runs #6/#7 |
254	-
255	-	## Post-Run #7 Storage code-fuzz (2026-05-31)
256	-
257	-	Targeted code-fuzz scoped to the Storage axis to verify A- before triggering full Run #8. Two findings above MED, both fixed in-session:
258	-
259	-	- H1 (HIGH) — `routes/storage/uploads.rs::confirm_upload` silent `rows_affected = 0`. Same shape as the just-closed HIGH-2 (`update_*_url`), one step further along the same handler family. UPDATE at L295 uses ownership-filter `WHERE id = $1 AND project_id IN (SELECT id FROM projects WHERE user_id = $4)`; `rows_affected()` was never checked. If the item was deleted between `get_item_owner` (L156) and the UPDATE, storage credit stayed incremented, `pending_uploads` got cleared a few lines down, and `commit_upload` enqueued a scan job against a ghost target — permanent S3 leak + over-charged counter. Fix: three-arm match on the UPDATE result; zero-rows case rolls back storage and routes the new S3 key through `enqueue_s3_orphan` so the reaper still cleans it, then returns BadRequest "Item was modified concurrently."
260	-	- S1 (SERIOUS, Run #5 plan #12 reopened) — `routes/storage/media.rs::media_confirm` three-write atomicity. Run #5 called for wrapping `try_increment_storage` → `remove_pending_upload` → `media_files::create` in a transaction; Run #7's in-process compensation only covered in-process errors. Process interruption (panic, OOM kill, container restart) between any two writes still leaked. Fix: all three writes now in a single tx; tx drop rolls back storage + pending_uploads + media_files atomically. Only the S3 object needs explicit cleanup (single `delete_object` after rollback). Supporting DB-layer changes: `creator_tiers::try_increment_storage_on(&mut PgConnection)` tx-friendly variant; `pending_uploads::remove_pending_upload` and `media_files::create` signatures genericized to `impl PgExecutor<'e>` (backwards compatible).
261	-
262	-	Remaining storage MED/LOW (below launchplan §1.5 A- bar; ride into Phase 4 polish or document deferral):
263	-	- MED — `update_project_image_url` / `update_item_cover` ignore `rows_affected()` (same shape as H1; mitigated for current callers because the only follow-on side-effect is `bump_cache_generation`).
264	-	- MED — `downloads.rs:120` `((duration as u64) * 2).max(3600)` with no DB `CHECK (duration_seconds >= 0)`. Negative duration → multi-decade presigned URL. Exploitability requires creator-controlled negative duration; ffprobe doesn't produce them. Cap in code + add CHECK migration.
265	-	- MED — Admin rescan paths (`routes/admin/uploads.rs:347, 390`) call `db::scan_jobs::enqueue` directly, bypassing the `commit_upload` structural seal. Ordering is correct so no live bug; demote `db::scan_jobs::enqueue` to `pub(crate)` and expose `commit_rescan(target, ...)` to close the chronic-disease finding for real.
266	-	- MED — `enqueue_s3_orphan` single-policy doc in `routes/storage/mod.rs:24-30` overstates the discipline; many `s3.delete_object(...).await.ok()` direct calls remain at pre-storage-credit rejection paths. Tighten the doc or migrate the post-storage-credit sites.
267	-	- MED — `is_s3_key_live` doesn't enumerate project image URLs (project cover keys live in a distinct prefix so no current bug; surface is fragile if future code paths queue project image keys).
268	-	- LOW — `scanning/worker.rs:251` inline `UPDATE media_files SET scan_status` instead of `db::scanning::update_media_file_scan_status` helper.
269	-	- LOW — `routes/pages/dashboard/wizards/item/save.rs:95` `update_item_cover_image_url` updates only `cover_image_url` (not s3_key/size); client-side hidden-field abuse can desync.
270	-	- LOW — `db/pending_uploads.rs::remove_pending_upload` deletes by s3_key alone (per-handler prefix validation makes cross-user collision unreachable, but the function signature is broader than it needs to be).
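
For the `downloads.rs:120` item in the list above, the "cap in code" half of the remediation could look roughly like this. It is a sketch under assumptions: `presign_expiry_secs` is a hypothetical name and the 24-hour ceiling is an illustrative value, not one taken from the codebase. The one-hour floor and the doubling come from the quoted expression.

```rust
const MIN_EXPIRY_SECS: u64 = 3600;
/// Hypothetical upper bound; pick whatever ceiling the product actually needs.
const MAX_EXPIRY_SECS: u64 = 24 * 3600;

/// Computes a presigned-URL lifetime from a stored duration in seconds.
/// Negative durations are treated as zero instead of being reinterpreted
/// as a huge unsigned value by an `as u64` cast.
fn presign_expiry_secs(duration_seconds: i64) -> u64 {
    let duration = duration_seconds.max(0) as u64;
    duration
        .saturating_mul(2)
        .clamp(MIN_EXPIRY_SECS, MAX_EXPIRY_SECS)
}
```

The underlying hazard is that `(-1i64) as u64` equals `u64::MAX`, so a bare cast turns a small negative value into an enormous lifetime. Clamping on both sides keeps the result bounded even if a bad row reaches this code before the `CHECK` constraint exists.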
271	-
272	-	Chronic disease status (5th run): The invariant-in-prose / sibling-not-swept pattern that recurred across Runs #2–#6 was structurally addressed in Run #7 via two helpers:
273	-	- `routes/storage/mod.rs::commit_upload(target: CommitTarget, ...)` — sealed `enqueue_scan_for` to module-private; the helper is now the only handler-reachable path for scan enqueue + scan_status flip after a DB write. Bug shapes 1–3 from prior runs are now structurally impossible to introduce in a new sibling.
274	-	- `crate::pricing::parse_dollars_to_cents` + `validate_dollars_f64` — canonical dollar-to-cents conversion; bypassing has historically introduced NaN→$0 and saturating-overflow silent bugs.
275	-
276	-	Net after Run #7 + S1 fix: 0 CRITICAL · 0 HIGH/SERIOUS · 1 SERIOUS deferred (Payments H2 soft race on `claim_free_project`) · a handful of MED/LOW polish items.
277	-
278	-	---
279	-
280	-	# Ultra Fuzz Report — MNW Server (Run #5 — historical)
281	-
282	-	Run date: 2026-05-30
283	-	Run number: 5
284	-
285	-	## Headline
286	-
287	-	| Axis | Run #4 | Run #5 | Direction |
288	-	|------|--------|--------|-----------|
289	-	| Payments | A- | B | ↓ (Run #4 plan items closed; 4 new SERIOUS surfaces previously unaudited: NULL item_id refund, splits >100% overflow, tip project authorization, cart unlisted bypass) |
290	-	| Storage | A- | B- | ↓ (Run #4 `images.rs` ordering bug closed; same disease reappeared in `uploads.rs` route gate ordering — file-type rejection runs AFTER scan enqueue) |
291	-	| UX Wiring | C+ | B | ↑ (Run #4 CSRF patchwork + creator-tier token fixed and structurally enforced; new CRIT: field-aware validation API is dead code at template boundary) |
292	-	| Security | B+ | A- | ↑ (Run #4 git-shell validation, lockout email flood, CSRF policy all verified; no new CRIT/HIGH; remaining gaps are operational/MED) |
293	-	| Performance | B | B- | ↓ (Run #4 scan_jobs retention + pool permit + broadcast bounding verified; new HIGHs in previously unaudited cart checkout + page-view paths + scheduler integrity scan) |
294	-
295	-	Net: 3 CRITICAL (vs Run #4: 4), 13 HIGH/SERIOUS (vs Run #4: 10), 11 MED, 9 MINOR/LOW. Two axes regressed because Run #5 reached previously-unaudited territory (Payments tip/cart/refund edges; Performance hot-path request loops) while Run #4 plan items themselves were correctly closed. The Storage regression is a recurrence of the same shape in a sibling handler — the chronic invariant-in-prose disease, fourth consecutive run.
296	-
297	-	## Critical / High Findings (fix before launch)
298	-
299	-	1. [Storage — CRITICAL] `routes/storage/uploads.rs:204-237` — `confirm_upload` calls `enqueue_scan_for(...)` and `update_item_scan_status(... Pending)` BEFORE the match arm rejects `Download`/`Insertion`/`MediaImage`/`MediaVideo` with `BadRequest`. A misrouted-but-valid `item_id` confirm flips that item's scan status to Pending, blocks `stream_url` for every fan, and leaks a scan-job row for an S3 key that's then deleted.
300	-	2. [UX — CRITICAL] `error.rs:216-264` + `templates/error.html` — `AppError::validation_fields(summary, [(field, msg), ...])` is consumed only by unit tests. `ErrorTemplate` has no `fields:` member; no template renders per-field highlights. Every non-HTMX validation failure degrades to the global "Go Home / Go Back" page and wipes submitted form input. Handler authors are misled into thinking their carefully-tagged field errors reach the UI.
301	-	3. [Perf — CRITICAL] `build_runner.rs:175-180` — Partial-failure error message reports `("{}/{} succeeded", artifact_keys.len(), artifact_keys.len() + 1)`. Denominator is always `succeeded + 1`, regardless of how many targets actually ran. Three targets, one succeeded, two failed → reports "1/2" (should be 1/3). Failed-target count is never tracked.
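
The denominator bug in item 3 goes away once failures are counted alongside successes. A minimal sketch, assuming per-target results are available as a slice (`summarize_targets` and the `Result<String, String>` shape are hypothetical, not the types in `build_runner.rs`):

```rust
/// Summarises per-target build results as "<succeeded>/<total> succeeded".
/// `Ok` carries an artifact key, `Err` carries a failure message.
fn summarize_targets(results: &[Result<String, String>]) -> String {
    let succeeded = results.iter().filter(|r| r.is_ok()).count();
    let failed = results.len() - succeeded;
    // Denominator is every target that ran, not `succeeded + 1`.
    format!("{}/{} succeeded", succeeded, succeeded + failed)
}
```

With three targets of which one succeeds, this yields `1/3 succeeded` rather than the `1/2` the report describes.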
302	-
303	-	### HIGH / SERIOUS
304	-
305	-	4. [Payments — SERIOUS] `db/transactions.rs:699-716` — `refund_transaction_by_payment_intent` returns `Vec<(TransactionId, ItemId)>` (non-Optional). Project-level transactions store `item_id IS NULL` (`routes/stripe/checkout/project.rs:135`). On `charge.refunded` for a project-level purchase, sqlx fails to decode NULL → `ItemId`; webhook handler 5xx's; Stripe retries forever.
306	-	5. [Payments — SERIOUS] `routes/stripe/webhook/checkout_helpers.rs:240-269` — `compute_splits` comment says "Defensive clamp: a misconfigured project_members row could sum past 100%" but the loop only adds remainder pennies and never subtracts. Two members at 60% each on a $10 sale are credited $6 each: $12 paid out against $10 of revenue. Clamp only affects `expected_total`, never the already-computed per-member amounts. Tests cover ≤100% only.
307	-	6. [Payments — SERIOUS] `routes/stripe/checkout/tips.rs:104-106` — `TipForm.project_id` is taken verbatim from the form. The webhook later calls `record_tip_splits(tip.id, tip.project_id, ...)` and credits THAT project's members. An attacker tipping creator A can pass project B's UUID; B's members get split obligations credited against A's tip. Stripe money flows correctly; on-platform `tip_splits` records and any downstream reporting are corrupted.
308	-	7. [Payments — SERIOUS] `db/cart.rs:94-123` + `routes/stripe/checkout/cart.rs` — `item.rs:47-49` enforces "Unlisted items can only be obtained through their bundle" via `if !item.listed`. `toggle_cart_preflight` and `get_cart_items` check `is_public` but NOT `listed`. An attacker who knows an unlisted item's UUID can POST to `/api/cart/{id}/toggle` and check out via the cart flow, fully bypassing the bundle-only gate.
309	-	8. [Payments — SERIOUS] `routes/stripe/webhook/subscriptions.rs:117-121, 67-69, 95-96` — `status_str.parse::<SubscriptionStatus>()` returns BadRequest for any status not in `enums.rs:183-198` (Stripe's `paused` is new). Webhook handler returns Err; Stripe retries until the status changes.
310	-	9. [Payments — SERIOUS] `payments/webhooks.rs:294-308` — `is_full_refund` returns true whenever `amount_refunded >= amount`, including when both are zero (Stripe sometimes emits these for $0 verification charges). Triggers `refund_transaction_by_payment_intent` with default `unknown` intent ID. Test at lines 517-525 pins the behavior.
311	-	10. [Storage — HIGH] `routes/storage/versions.rs:159-174` — `version_confirm_upload` enqueues scan and flips `scan_status` to Pending BEFORE the `version.s3_key == req.s3_key` idempotency check at line 172. Duplicate retry of an already-confirmed upload knocks a Clean version back to Pending, breaking downloads.
312	-	11. [Storage — HIGH] `routes/storage/images.rs:179-208` — `project_image_confirm` replace branch is gated on `Ok(Some(old_size))` from `s3.object_size(&old_key)`. On `Err` (S3 hiccup) or `Ok(None)` (URL with no object behind it) it falls into the "no old image" branch, `try_increment_storage` without decrementing. Permanent storage over-count. Also: `update_project_image_url` runs AFTER `enqueue_deletions` of the old key, with no rollback path.
313	-	12. [Storage — HIGH] `routes/storage/media.rs:236-293` — `media_confirm` does three separate writes (`try_increment_storage`, `remove_pending_upload`, `media_files::create`) outside a transaction. Interruption between steps leaves S3 object orphaned with storage credit consumed and no DB row.
314	-	13. [UX — HIGH] `routes/pages/dashboard/wizards/item/save.rs:183-185, 214-227` — `let price_cents = (price_dollars * 100.0).round() as i32; if price_cents > 0 { validate_price_cents(price_cents)?; }`. Guard skips validation for 0 and negative values; value goes through `PriceCents::from_db` (no validation) into `update_item`. Submitting `price=-5` writes `-500` cents. Same pattern on PWYW: no `min <= suggested` check.
315	-	14. [UX — HIGH] `routes/pages/dashboard/wizards/item/save.rs:179-183` + `routes/api/items/bulk.rs:136-139` + `routes/pages/dashboard/wizards/project.rs:264-298` — `price_dollars: f64 = …parse()…unwrap_or(0.0)`. `"NaN".parse::<f64>()` succeeds; `NaN as i32 == 0` (silent Free). `1e20` saturates to `i32::MAX`. Bulk path catches via `PriceCents::new` cap; `save.rs` does not — persists raw.
316	-	15. [UX — HIGH] `routes/auth.rs:356-361` — `let is_taken = db::users::get_user_by_username(...).await.map(|u| u.is_some()).unwrap_or(false);`. Transient DB error during signup live-check returns "available", misleading the user; subsequent signup races whatever real state the DB is in.
317	-	16. [Perf — HIGH] `routes/stripe/checkout/cart.rs:68-248` — Per cart item: sequential `has_purchased_item`, optional `remove_from_cart`, per-free-item `begin tx → claim_free_item → increment_sales_count → commit`, `get_item_by_id`, second `remove_from_cart`. 20-item cart ≈ 80 sequential roundtrips, ~20 separate transactions, 20 distinct pool acquisitions in series.
318	-	17. [Perf — HIGH] `db/page_views.rs:18-32` — `record_view` spawned per public request, takes a pool connection to UPSERT. With `DB_POOL_MAX_CONNECTIONS = 25`, a viral item link spawns unbounded tasks, eats the pool, times out real request handlers at acquire. No batching, no per-(target,session) debounce.
319	-	18. [Perf — HIGH] `scheduler/integrity.rs:53-73` — `check_sales_count_drift`: `SELECT i.id, i.sales_count, COUNT(t.id) FROM items LEFT JOIN transactions ... GROUP BY i.id HAVING i.sales_count != COUNT(t.id) LIMIT 50`. `HAVING` post-aggregation; Postgres scans every row in `items` and joins every completed transaction in history before filtering. `LIMIT 50` doesn't cap the work. Weekly multi-minute query holding a pool connection.
320	-
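Findings 13 and 14 both hinge on Rust's float-to-int `as` cast. The sketch below reproduces the lossy conversion and contrasts it with a string-based parser that never touches a float. `parse_price_cents` is a hypothetical helper using only the standard library; it is not the `rust_decimal` route and it does not apply the `PriceCents::new` cap.

```rust
/// The lossy conversion described in findings 13 and 14: float-to-int `as`
/// casts map NaN to 0 and saturate out-of-range values.
pub fn lossy_price_cents(price_dollars: f64) -> i32 {
    (price_dollars * 100.0).round() as i32
}

/// Hypothetical strict parser: accepts "12", "12.3", or "12.34" and returns
/// whole cents without going through a float.
pub fn parse_price_cents(input: &str) -> Result<i32, String> {
    let s = input.trim();
    let (whole, frac) = match s.split_once('.') {
        Some((w, f)) => (w, f),
        None => (s, ""),
    };
    let all_digits = |t: &str| t.bytes().all(|b| b.is_ascii_digit());
    if whole.is_empty() || !all_digits(whole) || frac.len() > 2 || !all_digits(frac) {
        return Err(format!("invalid price: {s:?}"));
    }
    // "12.3" means 30 cents, so a one-digit fraction is scaled by ten.
    let frac_cents: i32 = match frac.len() {
        0 => 0,
        1 => frac.parse::<i32>().map_err(|e| e.to_string())? * 10,
        _ => frac.parse::<i32>().map_err(|e| e.to_string())?,
    };
    let dollars: i32 = whole.parse().map_err(|_| "price too large".to_string())?;
    dollars
        .checked_mul(100)
        .and_then(|c| c.checked_add(frac_cents))
        .ok_or_else(|| "price too large".to_string())
}
```

Because the parser only accepts ASCII digits and at most one dot, `NaN`, `-5`, and `1e20` are rejected at the boundary instead of being silently turned into 0, -500, or `i32::MAX` cents.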
321	-	## Scorecard
322	-
323	-	### Axis Summary Grades
324	-
325	-	| Axis | Overall | Cold Spots | Mandatory Surprise |
326	-	|------|---------|------------|--------------------|
327	-	| Payments | B | `routes/stripe/checkout/cart.rs` (B-), `routes/stripe/checkout/tips.rs` (B-), `db/transactions.rs` (B-), `routes/stripe/webhook/checkout_helpers.rs` (B-), `routes/stripe/webhook/subscriptions.rs` (B) | `compute_splits` carries a "Defensive clamp" comment that explicitly anticipates the >100% case and then fails to defend against it — only `expected_total` is clamped, the already-computed per-member splits go unchanged. Treat as evidence the defensive-comment culture is itself unreliable; comments and code drift independently. |
328	-	| Storage | B- | `routes/storage/uploads.rs` (C+), `routes/storage/images.rs` (C+), `routes/storage/versions.rs` (C+), `routes/storage/media.rs` (B-), `db/mod.rs::check_sandbox_cap` (C+) | `stream_url` (`downloads.rs:119-122`) computes presigned expiry as `((duration as u64) * 2).max(3600)` where `duration: i32` and no DB CHECK ≥ 0 exists on `duration_seconds`. A negative value becomes near-`u64::MAX` expiry — a centuries-long presigned URL. The cast width and missing CHECK are independent latent bugs that compose into a multi-decade credential leak. |
329	-	| UX Wiring | B | `routes/pages/dashboard/wizards/item/save.rs` (B-), `error.rs` (B-), `routes/pages/public/discover.rs` (B) | `update_item` takes ~13 positional `Option`s; call sites are unreadable and error-prone. The negative-price bug (HIGH #13) is born from this signature: anyone calling it has no compiler help distinguishing `Some(-500)` (bug) from `Some(500)` (intent). |
330	-	| Security | A- | `helpers.rs` (B+), `scanning/clamav.rs` (B), `scanning/yara.rs` (B), `rate_limit.rs` (B+) | The "11 layer" scan pipeline test gives a false sense of coverage. ClamAV is `FailOpen` by explicit policy (`scanning/clamav.rs:19`), YARA silently skips rule files that fail to compile (`scanning/yara.rs:54-67`), and there is no startup assertion that any real AV layer is live. A misconfigured deploy can pass EICAR as Clean while the test suite is green. |
331	-	| Performance | B- | `routes/stripe/checkout/cart.rs` (C), `scheduler/announcements.rs` (C+), `scheduler/integrity.rs` (C+), `scheduler/cleanup.rs` (B-), `build_runner.rs` (B-), `db/page_views.rs` (C+), `db/pending_s3_deletions.rs` (B) | The biggest scaling cliff is a 1-line `tokio::spawn` on the page-view path, not anything that "looks expensive". Hot-path response shipped its tail-latency problem to the same pool that serves it. |
332	-
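The Storage surprise in the table above comes down to how `i32 as u64` treats negative values. A small sketch, assuming the expression shape quoted from `downloads.rs`; `expiry_checked` and its 7-day ceiling are hypothetical, not the project's policy.

```rust
/// Mirrors the shape of the quoted expression, using `wrapping_mul` so the
/// sketch behaves the same in debug and release builds. With a plain `*`,
/// a build with overflow checks enabled would panic instead of wrapping.
pub fn expiry_as_written(duration: i32) -> u64 {
    (duration as u64).wrapping_mul(2).max(3600)
}

/// Hypothetical hardened version: negative durations fall back to the floor
/// and the result is capped. The ceiling is an assumed policy value.
pub fn expiry_checked(duration: i32) -> u64 {
    const FLOOR_SECS: u64 = 3600;
    const CEILING_SECS: u64 = 7 * 24 * 3600;
    // `try_from` fails for negative input, unlike the sign-extending `as` cast.
    let secs = u64::try_from(duration).unwrap_or(0);
    secs.saturating_mul(2).clamp(FLOOR_SECS, CEILING_SECS)
}
```

A `duration` of -1 sign-extends to `u64::MAX` before doubling, which is why a DB CHECK ≥ 0 and a checked conversion each close the hole independently.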
333	-	## Bug Counts by Severity
334	-
335	-	| Severity | Payments | Storage | UX | Security | Perf | Total |
336	-	|---|---|---|---|---|---|---|
337	-	| CRITICAL | — | 1 | 1 | — | 1 | 3 |
338	-	| HIGH/SERIOUS | 5 | 3 | 3 | — | 3 | 14 |
339	-	| MED | 2 | 3 | 2 | 4 | 2 | 13 |
340	-	| MINOR/LOW | 2 | 2 | 2 | 3 | 1 | 10 |
341	-
342	-	## Cross-Cutting Concerns
343	-
344	-	1. Side-effects-before-validation pattern. Storage (uploads/versions/images route gates run after scan enqueue), Payments (tip `project_id` accepted before authorization, cart `listed` not checked before checkout), UX (price `from_db` after a guard that skips zero/negative). Four files, three axes, same shape: persist first, validate later.
345	-	2. Invariant-in-prose, fourth consecutive run. Run #2→#3 was MaybeUser; Run #3→#4 was scan_status ordering comments-vs-code; Run #4 partial fix landed (`images.rs`) but the same disease moved up a layer to `uploads.rs` (the route-level file-type gate now runs after scan enqueue). The Payments "defensive clamp" comment in `compute_splits` is the same shape on a different organ. No type-level constructive impossibility has yet been applied to any of these.
346	-	3. Optional positional args as bug carriers. `update_item`'s ~13 positional `Option`s let the wizard pass a negative-price `Option<PriceCents::from_db>` past the validator. Same pattern is implicated in the UX field-error finding — `ErrorTemplate`'s struct literal is missing a `fields:` field at every callsite and the compiler doesn't care.
347	-	4. Hot-path pool pressure from fire-and-forget writes. `record_view` per pageview, `tokio::spawn` per cart line, scheduler advisory-lock conn pinned across S3. The 25-connection pool is sized for a quiet box; three independent fan-out patterns can each saturate it.
348	-	5. FailOpen with no liveness assertion. ClamAV FailOpen + YARA optional + no startup gate = a green test suite can coexist with zero real AV coverage. Same shape as the Performance "spawned task accumulates without bound" pattern — both are silent degradations the operator never sees.
349	-
350	-	## Components Successfully Stress-Tested
351	-
352	-	- All Run #4 Phase 1 closures verified standing (CSRF creator-tier token, `images.rs` scan_status ordering structural fix, git-shell validation, lockout `=` predicate, promo dedupe, scanner streaming + pool permit, broadcast bounded fan-out, scan_jobs retention).
353	-	- Stripe HMAC: multi-secret `v1=` rotation now accepts on any match (Run #4 polish landed).
354	-	- Promo `try_increment_use_count` race-free via atomic single-row UPDATE; release path uses detach for no-double-decrement; proptest-covered.
355	-	- License keys: 66-bit entropy, DB UNIQUE, `FOR UPDATE` activation, full recount on revoke (display lag only — finding #M).
356	-	- CSRF posture: `CsrfRouter<S>` newtype prevents a bare `Router::route(path, post(...))` from compiling in mutation-bearing files. Verified.
357	-	- Argon2id parameters + `DUMMY_HASH` timing equalization on user-not-found (login, OAuth, SyncKit).
358	-	- PKCE-S256 pinned at both authorize and token endpoints; OAuth code atomic single-use consume.
359	-	- JWT future-iat rejection + `jwt_invalidated_at` second-equal `<=` semantics; password change bumps `jwt_invalidated_at` via `update_user_password`.
360	-	- SSE shard-guard drop-before-remove; cross-process advisory locks for scheduler ticks.
361	-	- ZIP bomb: decompressed-bytes counted (not claimed); ratio + depth caps; nested magic-byte detection.
362	-	- `try_increment_storage` cap-predicate UPDATE; concurrent uploads cannot both squeeze past cap.
363	-
364	-	## Confidence Per Axis
365	-
366	-	- Payments HIGH — read 22 of 23 listed files end-to-end with targeted attacks per surface; all four SERIOUS reproducible by line-tracing.
367	-	- Storage HIGH — CRITICAL and all three HIGHs mechanically reproducible; mandatory surprise composes two latent bugs via line-by-line read.
368	-	- UX Wiring HIGH — full read of `csrf.rs`, `error.rs`, `markdown.rs`, `formatting.rs`, `validation/mod.rs`; spot-checked 20+ templates for CSRF pattern; CRITICAL field-aware-validation finding cross-checked by grepping `validation_fields_ref` callers.
369	-	- Security MEDIUM — auth/CSRF/OAuth/scanning surfaces walked thoroughly; admin/moderation/reports/ssh_keys API/totp routes only sampled. ClamAV FailOpen is policy not bug; flagged as architectural risk.
370	-	- Performance MEDIUM-HIGH — spot-checked DB call patterns across 15+ files; exhaustive route-level N+1 sweep deferred; stripe/webhook code shows similar `for x in &xs` loops at `checkout.rs:149,167,198,452` that were not deep-audited.
371	-
372	-	## Metrics
373	-
374	-	- Modules audited: ~80
375	-	- Cold spots (≤ B): 18
376	-	- Bugs: 3 CRITICAL, 14 HIGH/SERIOUS, 13 MED, 10 MINOR/LOW
377	-	- Axes at A- or above: 1/5 (Security)
378	-
379	-	## Delta Since Run #4
380	-
381	-	FIXED (Run #4 items not surfaced this run):
382	-	- All 10 Run #4 Phase 1 items verified closed (CSRF creator-tier, `images.rs` ordering, git-shell validation, lockout email flood, cancel_pending CSRF, promo dedupe, scanner streaming + pool permit, scan_jobs retention, broadcast bounding).
383	-	- All 7 Run #4 Phase 2 items verified closed (cart template price math, media reupload race, pending_uploads reaper bump, TOTP step-replay, delete_other_sessions cache eviction, `/login` CSRF, OAuth fetch_optional).
384	-	- All 5 Run #4 Phase 3 items verified closed (claim_pending_build partial index, build status reaper race, `extract_s3_key_from_url` host pinning, TOTP `pending_2fa` tracking row, KNOWN_SYNC_APPS removed entirely).
385	-	- All Phase 4 polish items verified closed.
386	-
387	-	NEW CRITICAL/HIGH in Run #5 (previously unaudited or regressed):
388	-	- Storage: `uploads.rs` route-level file-type gate runs after scan enqueue (CRIT).
389	-	- UX: `validation_fields` plumbing is dead code at template boundary (CRIT).
390	-	- Perf: `build_runner.rs` partial-failure denominator nonsense (CRIT).
391	-	- Payments: NULL `item_id` decode bomb on project-level refunds (SERIOUS).
392	-	- Payments: `compute_splits` over-credits when project_members sum >100% (SERIOUS).
393	-	- Payments: tip `project_id` not validated vs recipient (SERIOUS).
394	-	- Payments: cart bypasses item `listed` gate (SERIOUS).
395	-	- Payments: unknown subscription status retry storm (SERIOUS).
396	-	- Storage: `version_confirm_upload` scan enqueue before idempotency check (HIGH).
397	-	- Storage: `project_image_confirm` mis-accounts on S3 probe failure + no rollback (HIGH).
398	-	- Storage: `media_confirm` non-atomic three-write sequence (HIGH).
399	-	- UX: negative/NaN price acceptance via `PriceCents::from_db` after permissive guard (HIGH).
400	-	- UX: username availability check fails open on DB error (HIGH).
401	-	- Perf: cart checkout 80 sequential roundtrips (HIGH).
402	-	- Perf: `record_view` unbounded spawn per public request (HIGH).
403	-	- Perf: `check_sales_count_drift` full-table aggregate (HIGH).
404	-
405	-	CHRONIC (across Run #3 → Run #4 → Run #5):
406	-	- Invariant-in-prose / policy-not-in-types — FOURTH consecutive run. Run #4 partially fixed the scan_status ordering inside `images.rs` (and the CSRF policy via `CsrfRouter` structurally), but the same disease moved up a layer: in `uploads.rs` the route-level file-type gate now runs after scan enqueue. The constructive-impossibility shape needed: extract a `commit_upload(file_type, ...)` higher-level operation that validates the file_type before doing any scan/credit side effects, then make `enqueue_scan_for` + `update_*_scan_status` `pub(crate)` so handlers cannot call them directly. The Payments `compute_splits` "Defensive clamp" comment + the UX `validation_fields_ref` orphan plumbing are the same disease in different organs.
407	-
408	-	REGRESSED:
409	-	- Payments (A- → B) — four new SERIOUS bugs surfaced in previously-unaudited tip/cart/refund/subscription-status corners. Not a regression in fixed code; a regression in audit coverage.
410	-	- Storage (A- → B-) — invariant-in-prose recurrence (chronic above).
411	-	- Performance (B → B-) — hot-path request loops audited for the first time.
412	-
413	-	---
414	-
415	-	# Plan: Restore Every Axis to A- or Higher (Run #5)
416	-
417	-	Target grades: Payments A · Storage A · UX A- · Security A- · Performance A-.
418	-
419	-	User priority for the launch window: resolve every CRITICAL/SERIOUS/HIGH before re-running. Iterate until audits surface only small new errors.
420	-
421	-	## Phase 1 — CRITICAL (fix today)
422	-
423	-	1. Storage CRIT — `uploads.rs` file-type gate ordering. `routes/storage/uploads.rs:204-237`. Move the match arm that rejects `Download`/`Insertion`/`MediaImage`/`MediaVideo` BEFORE `enqueue_scan_for` and `update_item_scan_status`. Then make `enqueue_scan_for` + `update_*_scan_status` `pub(crate)` and expose a `commit_upload(file_type, item_id, s3_key)` higher-level op that performs validation → credit → row insert → status flip in the correct order. The same constructor must serve `versions.rs` and `images.rs`. This closes the chronic invariant-in-prose finding.
424	-	2. UX CRIT — Field-aware validation reaches the UI. `error.rs:216-264` + `templates/error.html` + `templates/partials/form_errors.html` (new). Either (a) add `fields: Vec<(String, String)>` to `ErrorTemplate` and a `{% for f in fields %}` block in `error.html` + per-input markup; or (b) delete `validation_fields*` API entirely and replace handler callsites with `validation(summary)`. Choose (a) for non-HTMX forms that need to preserve user input; choose (b) only if every existing callsite is HTMX-only and uses OOB swaps for inline errors. Audit all `validation_fields` callers and pick a path.
425	-	3. Perf CRIT — `build_runner.rs` partial-failure denominator. `build_runner.rs:175-180`. Track `failed_count` alongside `artifact_keys`; report `succeeded/(succeeded+failed)`. Add a test that runs 3 targets with 2 failures and asserts "1/3" in the error string.
426	-
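One way to read the `commit_upload` proposal in Phase 1 #1 is as a module that exposes a single entry point and keeps the lower-level operations private. The sketch below is illustrative only: `FileType::Audio`, `Effects`, and `ValidatedUpload` are hypothetical, the side effects are recorded as strings rather than real DB or queue writes, and module privacy stands in for the `pub(crate)` visibility named in the plan.

```rust
pub mod upload {
    /// Hypothetical file types. `Audio` stands in for whatever this route accepts.
    #[derive(Debug, Clone, Copy, PartialEq)]
    pub enum FileType {
        Audio,
        Download,
        Insertion,
        MediaImage,
        MediaVideo,
    }

    #[derive(Debug, PartialEq)]
    pub enum UploadError {
        UnsupportedFileType(FileType),
    }

    /// Records side effects in order; stands in for DB and queue writes.
    #[derive(Debug, Default)]
    pub struct Effects {
        pub log: Vec<String>,
    }

    /// Proof that validation ran. The type is private, so code outside this
    /// module cannot build one or pass one to the sealed operations.
    struct ValidatedUpload {
        s3_key: String,
    }

    fn validate(file_type: FileType, s3_key: &str) -> Result<ValidatedUpload, UploadError> {
        match file_type {
            FileType::Audio => Ok(ValidatedUpload { s3_key: s3_key.to_string() }),
            other => Err(UploadError::UnsupportedFileType(other)),
        }
    }

    // Private on purpose: handlers cannot enqueue a scan without validating first.
    fn enqueue_scan_for(fx: &mut Effects, upload: &ValidatedUpload) {
        fx.log.push(format!("scan:{}", upload.s3_key));
    }

    /// The only public entry point: validation, then credit, row insert,
    /// scan enqueue, and status flip, in that order.
    pub fn commit_upload(
        fx: &mut Effects,
        file_type: FileType,
        s3_key: &str,
    ) -> Result<(), UploadError> {
        let upload = validate(file_type, s3_key)?;
        fx.log.push(format!("credit:{}", upload.s3_key));
        fx.log.push(format!("row:{}", upload.s3_key));
        enqueue_scan_for(fx, &upload);
        fx.log.push(format!("status:pending:{}", upload.s3_key));
        Ok(())
    }
}
```

A rejected file type returns before any entry is pushed to `Effects::log`, so the "persist first, validate later" ordering cannot be written by a caller outside the module.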
427	-	## Phase 2 — SERIOUS / HIGH (fix this weekend)
428	-
429	-	4. Payments SERIOUS — NULL item_id refund decode. `db/transactions.rs:699-716`. Change return to `Vec<(TransactionId, Option<ItemId>)>`; `refund_transaction_by_payment_intent` caller skips `decrement_sales_count`/`revoke_keys_by_transaction` when `item_id is None`. Add a fixture-based test against a project-level transaction.
430	-	5. Payments SERIOUS — `compute_splits` over-credit. `routes/stripe/webhook/checkout_helpers.rs:240-269`. Reject `total_split_pct > 100` at the project_members write site (DB CHECK or validation). Defensively, scale each split proportionally when sum > 100, OR clamp each split against remaining `expected_total` budget in the loop. Add a test at 60%+60%.
431	-	6. Payments SERIOUS — Tip project authorization. `routes/stripe/checkout/tips.rs:104-106`. After accepting `TipForm`, fetch the project and assert `project.user_id == recipient_id`; return 400 otherwise.
432	-	7. Payments SERIOUS — Cart bypasses `listed` gate. `db/cart.rs:94-123` and `get_cart_items`/`get_cart_items_for_seller`. Add `AND i.listed = true` to all three queries. Add a check in the per-seller checkout path. Add a regression test that toggles an unlisted item into the cart and asserts rejection.
433	-	8. Payments SERIOUS — Unknown subscription status. `routes/stripe/webhook/subscriptions.rs:117-121`. Replace `?` with a match: known statuses dispatch; unknown statuses `tracing::warn!` and return `StatusCode::OK` so Stripe stops retrying.
434	-	9. Payments SERIOUS — `is_full_refund` zero-amount. `payments/webhooks.rs:294-308`. Predicate becomes `amount > 0 && amount_refunded >= amount`. Update the test at lines 517-525 to invert (zero-amount must NOT be treated as full refund).
435	-	10. Storage HIGH — `versions.rs` enqueue-before-idempotency. `routes/storage/versions.rs:159-174`. Move idempotency `version.s3_key == req.s3_key` check BEFORE `enqueue_scan_for`. Apply the Phase 1 `commit_upload` helper here.
436	-	11. Storage HIGH — `project_image_confirm` probe-failure + no rollback. `routes/storage/images.rs:179-208`. (a) On `Err` or `Ok(None)` from `s3.object_size`, fall back to the row's recorded size (add a `project_image_bytes` column if not present) rather than the "no old image" branch. (b) Move `enqueue_deletions` to AFTER `update_project_image_url` success, or wrap both in a tx with the enqueue inside.
437	-	12. Storage HIGH — `media_confirm` non-atomic three-write. `routes/storage/media.rs:236-293`. Wrap `try_increment_storage` → `remove_pending_upload` → `media_files::create` in a transaction. The storage credit refund must fire on any failure path.
438	-	13. UX HIGH — Negative/NaN prices via `from_db`. `routes/pages/dashboard/wizards/item/save.rs:183-185, 214-227`. Use `PriceCents::new(price_cents)?` unconditionally; drop the `> 0` guard. Add `min <= suggested` check on PWYW.
439	-	14. UX HIGH — f64 price parsing accepts NaN. Same file + `routes/api/items/bulk.rs:136-139` + `routes/pages/dashboard/wizards/project.rs:264-298`. Parse as decimal cents directly (or `Decimal::from_str_exact` from the `rust_decimal` crate already in `Cargo.lock`); reject NaN/Inf; reject negative/saturating values before cast.
440	-	15. UX HIGH — Username live-check fails open. `routes/auth.rs:356-361`. Propagate the DB error or treat it as "unavailable, try again" — never "available" by default.
441	-	16. Perf HIGH — Cart checkout sequential roundtrips. `routes/stripe/checkout/cart.rs:68-248`. Bulk-load `has_purchased_item` once with `WHERE item_id = ANY($1)`. Batch `get_item_by_id` lookups. Claim free items in a single transaction with batched inserts. Aim for ≤ 5 roundtrips for any cart size.
442	-	17. Perf HIGH — `record_view` unbounded spawn. `db/page_views.rs:18-32`. Replace per-request spawn with an `mpsc` channel; one background task drains every 250ms and flushes one bulk `INSERT … ON CONFLICT … DO UPDATE SET view_count = page_view_daily.view_count + EXCLUDED.view_count`.
443	-	18. Perf HIGH — Sales drift full-table aggregate. `scheduler/integrity.rs:53-73`. Maintain trigger-updated `transactions_completed_count` per item, or run the check off-pool against a snapshot. Short term: add `WHERE i.sales_count > 0 OR EXISTS (SELECT 1 FROM transactions WHERE item_id = i.id LIMIT 1)` to drop the LEFT JOIN's all-zero rows from the aggregate.
444	-
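For Phase 2 #5, the proportional-scaling option can be sketched as below. `compute_splits_clamped` is a hypothetical signature, not the real `compute_splits`; it takes whole-number percentages and omits the remainder-penny distribution the real loop performs.

```rust
/// Splits `total_cents` among members by percentage. If the percentages sum
/// past 100, every share is scaled down proportionally so the payout never
/// exceeds the total. Remainder pennies from flooring are left unassigned.
pub fn compute_splits_clamped(total_cents: u64, percents: &[u32]) -> Vec<u64> {
    let pct_sum: u64 = percents.iter().map(|&p| u64::from(p)).sum();
    // Dividing by max(sum, 100) leaves well-formed configs untouched and
    // rescales over-allocated ones.
    let denom = u128::from(pct_sum.max(100));
    percents
        .iter()
        .map(|&p| {
            // u128 intermediate avoids overflow in the multiplication; the
            // quotient is at most `total_cents`, so the cast back is lossless.
            (u128::from(total_cents) * u128::from(p) / denom) as u64
        })
        .collect()
}
```

At 60% + 60% on 1000 cents this credits 500 cents each instead of 600, so the regression test at 60%+60% can assert that the sum of splits never exceeds the total.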
445	-	## Phase 3 — MED (fix before re-run if cheap)
446	-
447	-	- Storage: advisory-lock leak in `check_sandbox_cap` (`db/mod.rs:92-128`) → `pg_advisory_xact_lock` or RAII guard.
448	-	- Storage: `is_s3_key_live` missing tables (`db/pending_s3_deletions.rs:67-82`) → audit all s3_key-bearing columns; consider normalized `s3_objects` table.
449	-	- Storage: `delete_version` owner SELECT outside tx + post-commit S3 enqueue (`db/versions.rs:267-315`) → owner SELECT inside tx; enqueue inside tx.
450	-	- Security: ClamAV `FailOpen` startup assertion (`scanning/clamav.rs:19` + `scanning/mod.rs:151-164`) → refuse boot if scan configured but no AV layer live; emit `tracing::error!` after N consecutive ClamAV errors.
451	-	- Security: `helpers.rs:44-50` `DefaultHasher` for advisory lock keys → stable hasher (`sha2` first 8 bytes, or `xxh3` with constant seed).
452	-	- Security: OAuth `state` size cap (`routes/oauth.rs:379-386`) → reject `form.state.len() > 1024`; cap `code_challenge` at 44 base64url chars.
453	-	- Security: `extract_client_ip` non-Cloudflare fallback warning (`helpers.rs:33-40`) → emit one-shot `tracing::warn!` at startup if no `CF-Connecting-IP` seen after N requests.
454	-	- UX: pagination offset overflow (`routes/pages/public/discover.rs:85-87`, `routes/admin/users.rs:37-39`) → clamp `page` to `total_pages.max(1)` before arithmetic.
455	-	- UX: forms render without `_csrf` when handler forgets to populate `csrf_token` → make `csrf_token` non-optional in form-bearing templates (compile-time error) or render an inline "refresh and try again" notice.
456	-	- UX: `validate_username` byte-length check (`routes/auth.rs:322`) → `chars().count()`, or reorder ASCII filter before length.
457	-	- Perf: scheduler advisory-lock connection pinned across S3 (`scheduler/mod.rs:92-279`) → dedicated `PgPoolOptions::new().max_connections(1)` outside the main pool.
458	-	- Perf: cleanup S3 deletes serialized inside scheduler tick (`scheduler/cleanup.rs:77-100`) → `for_each_concurrent(8, ...)`; better, move user-deletion off the scheduler tick.
459	-
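The pagination item above (clamp `page` to `total_pages.max(1)` before arithmetic) can be sketched as a pure function. `page_offset` is a hypothetical helper, and 1-based page numbering is an assumption about the handlers.

```rust
/// Computes a row offset for a 1-based `page`, clamping the page into
/// `1..=total_pages` before any multiplication so the offset cannot overflow
/// or run past the end of the result set.
pub fn page_offset(page: u64, per_page: u64, total_items: u64) -> u64 {
    let per_page = per_page.max(1);
    // At least one page exists even when there are no items.
    let total_pages = total_items.div_ceil(per_page).max(1);
    let page = page.clamp(1, total_pages);
    (page - 1).saturating_mul(per_page)
}
```

A request for page `u64::MAX` then resolves to the last real page rather than an overflowing offset.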
460	-	## Phase 4 — Polish (after re-run shows axes ≥ A-)
461	-
462	-	- Payments: `has_active_subscription_to_item` period-end clause mirroring (`db/subscriptions.rs:464-470`).
463	-	- Payments: `get_active_creator_tier` + `sync_user_creator_tier` period-end defense (`db/creator_tiers.rs:91-103, 181-194`).
464	-	- Payments: `release_use_count` race messaging (`db/promo_codes.rs:184-200`).
465	-	- Payments: License key `activation_count` recount on revoke (`db/license_keys.rs:343-382`).
466	-	- Payments: Subscription minimum-charge check (`payments/checkout.rs:283-317`).
467	-	- Payments: Webhook v1/v2 unmark-on-failure parity (`routes/stripe/webhook/mod.rs:48-86`).
468	-	- Storage: `media_files.list_folders` scan filter (`db/media_files.rs:73-82`).
469	-	- Storage: `pending_uploads.record_pending_upload` silent user-mismatch (`db/pending_uploads.rs:23-33`).
470	-	- Storage: `append_log_bounded` non-atomic size cap (`build_runner.rs:516-534`).
471	-	- Storage: `downloads.rs:119-122` presigned-URL expiry: cap `duration_seconds` at i64 + add DB CHECK ≥ 0.
472	-	- Security: `validate_token_consuming` for OAuth POST (`routes/oauth.rs:206`).
473	-	- Security: `parse_repo_path` rejects lone-dot entries (`git_ssh.rs:162`).
474	-	- Security: ClamAV INSTREAM 16K cap → treat truncation as fail-closed (`scanning/clamav.rs:101-108`).
475	-	- UX: validation error messages stop reflecting user input (`wizards/item/mod.rs:176-179`).
476	-	- UX: CSRF body extraction stops using `from_utf8_lossy` (`csrf.rs:528-543`).
477	-	- Perf: scan-pipeline 400 MiB worst-case capacity-plan note (`constants.rs:156-157`).
478	-	- Perf: announcement fan-out persistence + resume (`scheduler/announcements.rs:59-89, 147-177`).
479	-	- Perf: build log per-line DB roundtrip (`build_runner.rs:516-534`) → in-process running total.
480	-
481	-	## Phase 5 — Chronic (must land in Run #6 or this audit cycle has failed)
482	-
483	-	Invariant-in-prose / policy-not-in-types, fourth consecutive run. The Phase 1 #1 fix (constructive `commit_upload` helper sealing the lower-level ops) is the only acceptable resolution. Memory notes, comments warning future authors, and renamed-helper approaches have been tried in three prior runs and recurred each time. After Phase 1 lands, audit `compute_splits` and `ErrorTemplate` for the same shape and apply the same treatment.
484	-
485	-	---
486	-
487	-
488	-
489	-	## Headline
490	-
491	-	| Axis | Run #3 | Run #4 | Direction |
492	-	|------|--------|--------|-----------|
493	-	| Payments | A- | A- | flat (1 new SERIOUS: promo over-release on cart cleanup) |
494	-	| Storage | B+ | A- | ↑ (Run #3 image-confirm rollback/race-guard fixes verified; one residual CRIT in same file) |
495	-	| UX Wiring | B+ | C+ | ↓ (CSRF policy patchwork: missing tokens + undocumented mutation in exempt prefix) |
496	-	| Security | B+ | B+ | flat (different HIGHs: git-shell repo-name validation + lockout DoS) |
497	-	| Performance | B- | B | ↑ (Run #3 sync-FS-in-async + DashMap shard-lock + monitor split all verified; new unbounded scan_jobs/broadcast/pool-permit findings) |
498	-
499	-	Net: 4 CRITICALs (vs Run #3: 2), 10 HIGH/SERIOUS (vs Run #3: 10), 22 MED, 23 MINOR/LOW. Ship-blockers are concentrated in two structural rots — CSRF policy and scan_jobs growth — not in net-new logic mistakes.
500	-

Lines truncated

D server/todo.md -373

		@@ -1,373 +0,0 @@
1	-	# MNW Server — Todo
2	-
3	-	Last updated: 2026-05-31 late evening (post Run #9 — launch-eve pass).
4	-
5	-	## Status
6	-
7	-	All 5 axes at A- after Run #9 fixes. 0 CRITICAL open · 1 SERIOUS open (deferred) · 3 HIGH open (deferred) · 7 MED open (deferred). Launchplan §1.5 A- bar holds. See `docs/audit_review.md` Run #9 section for full triage.
8	-
9	-	## Run #9 — fixed this session (2026-05-31)
10	-
11	-	- UX-CRITICAL Signup TOCTOU 23505 → 500 + form loss. `join_wizard.rs`: catch 23505 with constraint-name routing, surface as `return_error`. Follow-up: preserve typed form fields on error swap (Phase 4).
12	-	- Sec-SERIOUS `delete_all_sessions_for_user` non-atomic JWT bump → wrapped in `pool.begin()` / `tx.commit()` (`db/sessions.rs:247`).
13	-	- Sec-SERIOUS 2FA login-email IP spoofable via bare `x-forwarded-for` → swapped to `crate::helpers::extract_client_ip` (`routes/pages/public/two_factor.rs:308`).
14	-	- Pay-SERIOUS Webhook dual-failure 503 short-circuited on Stripe retry → call `unmark_event_processed` before returning 503 (`routes/stripe/webhook/mod.rs:81`).
15	-
16	-	`cargo check --tests` clean; targeted unit tests (sessions/webhook/two_factor/join_wizard) 33/33 green. Full DB-integration suite needs astra postgres.
17	-
18	-	## Run #9 — deferred with rationale (Phase 4)
19	-
20	-	- [ ] Pay-SERIOUS Subscription webhook out-of-order events resurrect `active`. Needs `created`-timestamp re-extraction from `UntypedEvent` + `WHERE last_event_at <= $created` guards across Fan+/creator-tier/synckit subscription writes. Cross-cutting; worst case is minutes-window of restored access until next webhook.
21	-	- [ ] Sto-HIGH Migration 129 dead-letter table never written (`cleanup.rs:453`). Operational visibility, not runtime; one-INSERT fix.
22	-	- [ ] Perf-HIGH Per-request `reqwest::Client::new()` in 5 hot paths (dashboard/main, public/landing, api/internal/cli_features, api/domains, auth.rs). Hoist to OnceLock or AppState pooled client.
23	-	- [ ] Perf-HIGH Unbounded `tokio::spawn` in `cleanup.rs:215-220` `spawn_expired_account_cleanups`. Lift existing `CLEANUP_PARALLELISM=4` JoinSet pattern from `cleanup_sandbox_accounts` 100 lines above.
24	-	- [ ] Pay-MED `pricing.rs::parse_dollars_to_cents` strips European decimal comma; `1,23` → 12300¢.
25	-	- [ ] Pay-MED SyncKit app-sub checkout silently defaults `storage_limit_bytes` to 0 if metadata missing.
26	-	- [ ] Pay-MED Guest checkout email sentinel `"unknown@guest"` collision risk.
27	-	- [ ] Sto-MED `is_s3_key_live` 7 EXISTS subqueries on unindexed s3_key columns — sequential scans per retry. Add partial indexes WHERE NOT NULL.
28	-	- [ ] Sto-MED `is_s3_key_live` LIKE suffix `'%' || s3_key` false-positives on neighboring keys → S3 object leaks. Anchor with `/`.
29	-	- [ ] UX-MED `purchase.html:145` `?return_to=` dead-wired; login handler always redirects `/dashboard`.
30	-	- [ ] UX-MED Admin user filter buttons (`admin-users.html:35-44`) use `class="primary"` instead of `btn-primary` — renders unstyled.
31	-
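The `is_s3_key_live` suffix false positive in the deferred list above can be modeled outside SQL. The sketch treats `url LIKE '%' || s3_key` as a plain suffix test; the URLs are made-up examples, and the model ignores that `LIKE` also interprets `_` and `%` inside the key as wildcards.

```rust
/// Models the unanchored pattern: any stored URL that merely ends with the
/// key counts as a live reference.
pub fn suffix_match(stored_url: &str, s3_key: &str) -> bool {
    stored_url.ends_with(s3_key)
}

/// Models the anchored variant: the key must be the whole value or be
/// preceded by a `/` path separator.
pub fn anchored_match(stored_url: &str, s3_key: &str) -> bool {
    stored_url == s3_key
        || stored_url
            .strip_suffix(s3_key)
            .is_some_and(|prefix| prefix.ends_with('/'))
}
```

With a stored URL ending in `u/10/a.png`, the key `0/a.png` matches the unanchored test but not the anchored one, which is the neighboring-key case that keeps a dead object marked live.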
32	-	## Run #9 — LOW/NOTE (carry forward)
33	-
34	-	- [ ] UX-LOW Pagination links in `git/issues.html:72,76` don't URL-encode `search` param.
35	-	- [ ] UX-LOW 5 sites use `.render().unwrap_or_default()` on Askama templates — blank UI on render failure, no log line.
36	-	- [ ] UX-LOW `slugify` (`formatting.rs:85`) produces `"post"` for any non-ASCII title.
37	-	- [ ] Sec-MINOR `csrf.rs:176-185` `validate_token_consuming` doesn't actually consume — rename or rotate.
38	-	- [ ] Sec-MINOR `routes/oauth.rs:101-111` `is_localhost_redirect` allows any port regardless of registered URI.
39	-	- [ ] Sec-MINOR `scanning/archive.rs:124` path-traversal check misses lone `..` segment (no trailing `/`).
40	-	- [ ] Perf-LOW `db/page_views.rs` `pending` HashMap has no max-cardinality cap.
41	-	- [ ] Perf-LOW `build_runner.rs:441` artifact tmpfile leaks if process crashes between SCP and `remove_file`.
42	-
43	-	Live state: working tree has 104+ Run #8 files plus 4 Run #9 source files (`join_wizard.rs`, `sessions.rs`, `two_factor.rs`, `webhook/mod.rs`) and 2 docs (`docs/audit_review.md`, `todo.md`).
44	-
45	-	## Open before launch (Monday 2026-06-01)
46	-
47	-	### Platform-as-product audits (skill-driven, code-review scope; fresh context recommended)
48	-	- [ ] `/creator-fuzz` — would a working creator trust this with their livelihood?
49	-	- [ ] `/use-fuzz` — discoverability, learnability, first-five-minutes
50	-	- [ ] `/business-fuzz` — pricing copy, fee surfacing, refund-policy wording vs actual platform behaviour
51	-
52	-	### Per-project hygiene (manual, my call when ready)
53	-	- [ ] README first-screen audit — what is this / who is it for / where to get it / what does it cost. No headliner paragraphs.
54	-	- [ ] `Cargo.toml` version bump for the launch deploy (pick the number; I do the edit if needed)
55	-	- [ ] CHANGELOG entry for the launch version
56	-
57	-	### Monday browser/prod testing (saved for Monday per current direction)
58	-	- [ ] §1.1 Walk every public page: footer present, OG/Twitter meta render correctly in Facebook + Twitter debuggers, error pages render via forced 404/403/500
59	-	- [ ] §1.2 First-run creator flow end-to-end in production: signup → Stripe Connect → first item upload
60	-	- [ ] §1.3 Each seeded creator's `/{handle}` page renders without empty sections; sample item per medium (audio/video/text/download) reachable from `/discover`
61	-	- [ ] §1.4 Production deploy of post-fuzz build + version recorded via `record_deploy`; scheduled jobs running on prod (cleanup, scan jobs retention, build reaper, broadcast fan-out); Stripe webhook reachable from dashboard ping; backup snapshot taken pre-launch + restoration path documented in `_private/docs/mnw/server-docs/`; `/health` green
62	-	- [ ] §5 launch-day sequence: final deploy, smoke-test logged-out from non-dev machine, update bios/link-in-bio/handles, confirm `maxj.phd` resolves, tag launch commit (`git tag launch-2026-06-01`)
63	-
64	-	## Open question for the user (action before Monday)
65	-
66	-	- [ ] Confirm all role-based email addresses route to real mailboxes: `info@`, `security@`, `dmca@`, `privacy@`, `dpo@`, `legal@`, `billing@`, `policy@`, `reports@`, `community@`, `appeals@`, `press@`, `noreply@`. Legal pages (terms, privacy, copyright, appeals) and several role-routed flows reference them. If any are aspirational, that's a launch risk for the legal pages and an inbound-mail blackhole. Verify with Postmark/forwarding setup.
67	-
68	-	## Deferred with rationale (no action; documented)
69	-
70	-	- [ ] `build_runner.rs:151` serial-target loop. LOW; builds run rarely; refactor touches denominator + error aggregation + log order. Post-launch.
71	-	- [ ] `scheduler/mod.rs:92-279` advisory-lock per-tier granularity. Multi-replica concern; defer until multi-replica is real.
72	-	- [ ] Drop unused `completion_effects` table (migration cleanup, schema-only).
73	-	- [ ] Templatize founder + standard annual prices in `tiers.md` and `pricing.md` (e.g. `$86/yr`, `$130/yr`, `$194/yr`, `$324/yr`; standard `$173/$259/$389/$648`). docengine substitutions don't support arithmetic; would require adding derived `tiers.founding.basic_annual` etc keys in `shared/docengine/src/assumptions.rs`. Not blocking.
74	-	- [ ] `_head_assets.html` apple-touch-icon + manifest link wiring. `static/manifest.json` exists but the `<link rel="manifest">` was reverted; bring back if/when desired.
75	-	- [ ] Migrate footer's `What's new` and `Shortcuts` `<a href="#" onclick="...">` to `data-*` attributes following the `data-copy-link` pattern. UX MED, not blocking.
76	-
77	-	## What's done this session (compact summary, full details below)
78	-
79	-	- Ultra Fuzz Run #8 — all 5 axes A-. SERIOUS webhook unbounded spawn closed via new `src/background.rs` (bounded mpsc + semaphore-bounded concurrent execution). `spawn_email!` macro migrated; 17 callers + 5 manual webhook spawns + 5 same-disease per-request email spawns now route through bg queue. Run #8 5 new MEDs all closed (cart `min_price_cents`, cart-all chain-break, item-wizard `pricing_model`, inline-JS templates, cart free-claim N+1). Previously-deferred Payments H2 `claim_free_project` race closed.
80	-	- 7-wave backlog sweep — 24 of 26 carried items across auth/security/scanning/db/storage/UX/perf/payments. New schema migration `133_items_duration_seconds_nonnegative.sql`. New `commit_rescan` helper extends chronic-disease seal to admin paths. Two LOW items deferred above.
81	-	- 4 cross-cutting sweeps — `info@makenot.work` email pin (8 files), localhost/TODO/emoji/secret scans all clean.
82	-	- §1.1 public-surface code work — OG + Twitter card meta in `base.html` (per-page overridable blocks), `static/manifest.json` created with brand colours, `error.html` drops broken back button + adds contact link, `Contact` link added to footer (mailto:info@), new `routes/pages/public/sitemap.rs` (with in-memory 10-min cache + LIKE-wildcard escape from the security review).
83	-	- Doc-fuzz — `content-scanning.md` restructured (Malware checks + Authenticity checks sections, added URLhaus/MetaDefender/signing layers), `policy.html` See-also block linking 6 legal pages, `tiers.md` prose prices templatized via `{{ tiers.standard.* | int }}`.
84	-	- Exorcise — 9 AI-tell removals across compare.md, content-scanning.md, appeals.md, faq.md.
85	-	- Nitpick — 2 polish edits (dead `let _ = scan_status` removed, unused tuple-name destructure tidied).
86	-	- Security review — 2 MEDs fixed inline: sitemap.xml in-memory cache to absorb crawler/attacker hammering; LIKE-wildcard escape on `is_s3_key_live` to prevent `_` in s3_keys from false-positive matching.
87	-
88	-	---
89	-
90	-	## Ultra Fuzz 2026-05-31 (Run #8 — final re-grade)
91	-
92	-	### Above-MED items to address before launch (or defer with rationale)
93	-
94	-	### New MED-tier findings (all closed 2026-05-31)
95	-
96	-	All 5 MEDs landed. `cargo test --lib` 1654 / 0.
97	-
98	-	### Verified closed this run
99	-
100	-	### Storage A- standing — remaining MED/LOW (Phase 4 polish or defer)
101	-	Carried from Storage code-fuzz 2026-05-31 — see below. All still MED, none A- blockers.
102	-
103	-	---
104	-
105	-	## Audit backlog sweep 2026-05-31 (post-Run #8, 7 waves)
106	-
107	-	Sorted by file locality and difficulty. Tests: 1655 / 0 throughout.
108	-
109	-	### Wave 1 — auth/security cluster (8 tiny)
110	-	### Wave 2 — scanning (3)
111	-	### Wave 3 — DB layer polish (4)
112	-	### Wave 4 — storage handlers + admin rescan seal + downloads (5)
113	-	### Wave 5 — UX polish (2)
114	-	### Wave 6 — Performance (3 of 5; 2 deferred)
115	-	- [ ] DEFERRED `build_runner.rs:151` serial-target loop. LOW; refactor touches denominator + error agg + log order. Post-launch.
116	-	- [ ] DEFERRED `scheduler/mod.rs:92-279` advisory-lock granularity. Multi-replica concern; defer until multi-replica is real.
117	-
118	-	### Wave 7 — Payments LOW (2)
119	-	---
120	-
121	-	## Storage code-fuzz 2026-05-31 (post-Run #7)
122	-
123	-	Targeted Storage-axis fuzz to verify A- before triggering full Run #8.
124	-
125	-	### Above-MED fixes that landed
126	-	### Remaining MED/LOW (below A- bar; defer or Phase 4 polish)
127	-	- [ ] Storage MED — `update_project_image_url` / `update_item_cover` ignore `rows_affected()`. Same shape as H1 but only follow-on side-effect is `bump_cache_generation`, so blast radius is small.
128	-	- [ ] Storage MED — `downloads.rs:120` `((duration as u64) * 2).max(3600)` with no DB CHECK on `duration_seconds`. Add `CHECK (duration_seconds >= 0)` migration + cap in code (`duration.max(0).saturating_mul(2).clamp(3600, 86400)`).
129	-	- [ ] Storage MED — Admin rescan (`routes/admin/uploads.rs:347, 390`) bypasses `commit_upload` seal via direct `db::scan_jobs::enqueue`. Demote to `pub(crate)` and expose `commit_rescan(target, ...)`.
130	-	- [ ] Storage MED — `enqueue_s3_orphan` single-policy doc overstates discipline; either tighten doc or migrate remaining direct `delete_object` cleanup sites.
131	-	- [ ] Storage MED — `is_s3_key_live` doesn't enumerate project image URLs (no current bug; surface fragile).
132	-	- [ ] Storage LOW — `scanning/worker.rs:251` inline UPDATE bypasses `db::scanning::update_media_file_scan_status` helper.
133	-	- [ ] Storage LOW — wizard `save.rs:95` updates only `cover_image_url` (not s3_key/size).
134	-	- [ ] Storage LOW — `pending_uploads::remove_pending_upload` deletes by s3_key alone (signature broader than needed).
135	-
136	-	---
137	-
138	-	## Ultra Fuzz 2026-05-31 (Runs #6, #7 + S1)
139	-
140	-	### Structural / chronic-disease fixes that landed
141	-	### Bug-level fixes that landed
142	-	### Deferred (with rationale)
143	-	- [ ] Drop unused `completion_effects` table — schema-only cleanup; harmless empty table.
144	-
145	-	### Notes on remaining MED/LOW (per Run #7 axis reports)
146	-	- Storage MED — admin rescan handlers (`routes/admin/uploads.rs:347, 390`) still call `enqueue_scan_for` indirectly via lower-level primitives; functional today but bypasses the chronic-disease seal.
147	-	- Storage MED — `update_item_cover` / `update_project_image_url` don't check `rows_affected()`; an ownership-filter mismatch returns Ok(0 rows) silently.
148	-	- Storage MED — worker inline media UPDATE at `scanning/worker.rs:251` should use the new `db::scanning::update_media_file_scan_status` helper.
149	-	- Storage LOW — internal CLI confirm drops returned `FileScanStatus` (no `pending_review` surfacing).
150	-	- Storage LOW — `main.rs:334` comment references now-private `enqueue_scan_for`.
151	-	- UX MED — `parse_dollars_to_cents` rejects `"$5"` and `"1,000"` literally; could strip `$`/`,` for clipboard-paste UX.
152	-	- UX MED — project wizard skips `validate_tier_price` ($1–$10k); API path enforces it.
153	-	- UX LOW — `BundleItemIds.filter_map` silently drops malformed UUIDs.
154	-	- Payments M1 — `compute_splits` should `.max(0)` per-member for defense vs legacy negative `split_percent` rows.
155	-	- Payments NIT — extract `require_stripe_ready` helper; six near-identical 5-line blocks across checkout files.
156	-
157	-	---
158	-
159	-	## Ultra Fuzz 2026-05-30 (Run #5)
160	-
161	-	## Ultra Fuzz 2026-05-30 (Run #5)
162	-
163	-	Full report: `docs/audit_review.md`. 3 CRITICAL, 14 HIGH/SERIOUS. Two-axis regressions (Payments B, Storage B-) are coverage expansion into previously-unaudited paths plus one chronic recurrence; Security improved to A-; all 27 Run #4 plan items verified closed.
164	-
165	-	### Phase 1 — CRITICAL (fix today)
166	-
167	-	- [ ] Storage CRIT — `uploads.rs` file-type gate ordering — `routes/storage/uploads.rs:204-237`. Move the match-arm rejection of `Download`/`Insertion`/`MediaImage`/`MediaVideo` BEFORE `enqueue_scan_for` and `update_item_scan_status`. Then make `enqueue_scan_for` + `update_*_scan_status` `pub(crate)` and expose a `commit_upload(file_type, item_id, s3_key)` higher-level op used by all three handlers (uploads / versions / images). Closes Phase 5 chronic invariant-in-prose finding.
168	-	- [ ] UX CRIT — Field-aware validation reaches the UI — `error.rs:216-264` + `templates/error.html`. Either add `fields: Vec<(String, String)>` to `ErrorTemplate` + per-input markup in templates, OR delete the `validation_fields*` API and migrate callers to `validation(summary)`. Audit `validation_fields` callsites and pick a path.
169	-	- [ ] Perf CRIT — `build_runner.rs` partial-failure denominator — `build_runner.rs:175-180`. Track `failed_count`; report `succeeded/(succeeded+failed)`. Add a test with 3 targets / 2 failures asserting "1/3".
170	-
171	-	### Phase 2 — SERIOUS / HIGH (fix this weekend)
172	-
173	-	- [ ] Payments SERIOUS — NULL `item_id` refund decode bomb — `db/transactions.rs:699-716`. Return `Vec<(TransactionId, Option<ItemId>)>`; skip `decrement_sales_count`/`revoke_keys_by_transaction` when None. Fixture test against a project-level transaction.
174	-	- [ ] Payments SERIOUS — `compute_splits` over-credit on members > 100% — `routes/stripe/webhook/checkout_helpers.rs:240-269`. Reject `total_split_pct > 100` at the project_members write site (DB CHECK + validation). Defensively scale or clamp each split. Add test at 60%+60%.
175	-	- [ ] Payments SERIOUS — Tip `project_id` not validated vs recipient — `routes/stripe/checkout/tips.rs:104-106`. After form accept, assert `project.user_id == recipient_id`; 400 otherwise.
176	-	- [ ] Payments SERIOUS — Cart bypasses item `listed` gate — `db/cart.rs:94-123` + `get_cart_items` + `get_cart_items_for_seller`. Add `AND i.listed = true` to all three. Add per-seller checkout path check. Regression test: toggle unlisted item into cart → rejection.
177	-	- [ ] Payments SERIOUS — Unknown subscription status retry storm — `routes/stripe/webhook/subscriptions.rs:117-121`. Replace `?` with a match: known statuses dispatch; unknown statuses `tracing::warn!` and return 200 OK so Stripe stops retrying.
178	-	- [ ] Payments SERIOUS — `is_full_refund` zero-amount — `payments/webhooks.rs:294-308`. Predicate becomes `amount > 0 && amount_refunded >= amount`. Invert the test at line 517-525.
179	-	- [ ] Storage HIGH — `versions.rs` enqueue-before-idempotency — `routes/storage/versions.rs:159-174`. Move `version.s3_key == req.s3_key` idempotency check before `enqueue_scan_for`. Apply Phase 1 `commit_upload` helper.
180	-	- [ ] Storage HIGH — `project_image_confirm` probe-failure + no rollback — `routes/storage/images.rs:179-208`. On `Err`/`Ok(None)` from `s3.object_size`, fall back to recorded size. Move `enqueue_deletions` AFTER `update_project_image_url` success, or wrap in a tx.
181	-	- [ ] Storage HIGH — `media_confirm` non-atomic three-write — `routes/storage/media.rs:236-293`. Wrap `try_increment_storage` → `remove_pending_upload` → `media_files::create` in a transaction. Refund storage credit on any failure.
182	-	- [ ] UX HIGH — Negative/zero prices via `PriceCents::from_db` — `routes/pages/dashboard/wizards/item/save.rs:183-185, 214-227`. Use `PriceCents::new(price_cents)?` unconditionally; drop `> 0` guard. Add `min <= suggested` check on PWYW.
183	-	- [ ] UX HIGH — f64 price parsing accepts NaN/saturates — same file + `routes/api/items/bulk.rs:136-139` + `routes/pages/dashboard/wizards/project.rs:264-298`. Parse as decimal cents (`rust_decimal::Decimal::from_str_exact`); reject NaN/Inf/out-of-range before cast.
184	-	- [ ] UX HIGH — Username live-check fails open on DB error — `routes/auth.rs:356-361`. Propagate error or treat as "unavailable, try again".
185	-	- [ ] Perf HIGH — Cart checkout 80 sequential roundtrips — `routes/stripe/checkout/cart.rs:68-248`. Bulk-load `has_purchased_item` with `WHERE item_id = ANY($1)`. Batch `get_item_by_id`. Claim free items in one tx with batched inserts. Target ≤ 5 roundtrips for any cart size.
186	-	- [ ] Perf HIGH — `record_view` unbounded spawn per request — `db/page_views.rs:18-32`. Replace per-request spawn with `mpsc` channel + single background drainer flushing every 250ms via bulk UPSERT.
187	-	- [ ] Perf HIGH — `check_sales_count_drift` full-table aggregate — `scheduler/integrity.rs:53-73`. Add `WHERE i.sales_count > 0 OR EXISTS(SELECT 1 FROM transactions WHERE item_id = i.id LIMIT 1)` short-term; long-term trigger-maintained counts.
188	-
189	-	### Phase 3 — MED (fix before Run #6 if cheap)
190	-
191	-	- [ ] Storage: advisory-lock leak in `check_sandbox_cap` (`db/mod.rs:92-128`) → `pg_advisory_xact_lock` or RAII guard.
192	-	- [ ] Storage: `is_s3_key_live` missing tables (`db/pending_s3_deletions.rs:67-82`).
193	-	- [ ] Storage: `delete_version` owner SELECT outside tx + post-commit S3 enqueue (`db/versions.rs:267-315`).
194	-	- [ ] Security: ClamAV `FailOpen` startup assertion (`scanning/clamav.rs:19` + `scanning/mod.rs:151-164`) — refuse boot if scan configured but no AV layer live.
195	-	- [ ] Security: `helpers.rs:44-50` `DefaultHasher` → stable hasher (sha2 first 8 bytes or `xxh3` constant seed).
196	-	- [ ] Security: OAuth `state` size cap (`routes/oauth.rs:379-386`) — reject `> 1024`; cap `code_challenge` at 44 chars.
197	-	- [ ] Security: `extract_client_ip` non-Cloudflare fallback warning (`helpers.rs:33-40`).
198	-	- [ ] UX: pagination offset overflow (`routes/pages/public/discover.rs:85-87`, `routes/admin/users.rs:37-39`).
199	-	- [ ] UX: forms silently render without `_csrf` when handler forgets to populate token — make `csrf_token` non-optional in form-bearing templates.
200	-	- [ ] UX: `validate_username` byte-length vs `chars().count()` (`routes/auth.rs:322`).
201	-	- [ ] Perf: scheduler advisory-lock connection pinned across S3 (`scheduler/mod.rs:92-279`) → dedicated `max_connections(1)` pool.
202	-	- [ ] Perf: cleanup S3 deletes serialized inside scheduler tick (`scheduler/cleanup.rs:77-100`) → `for_each_concurrent(8, ...)`.
203	-
204	-	### Phase 4 — Polish (after Run #6 confirms ≥ A-)
205	-
206	-	- [ ] Payments: `has_active_subscription_to_item` period-end clause mirroring (`db/subscriptions.rs:464-470`).
207	-	- [ ] Payments: `get_active_creator_tier` + `sync_user_creator_tier` period-end defense (`db/creator_tiers.rs:91-103, 181-194`).
208	-	- [ ] Payments: `release_use_count` race messaging (`db/promo_codes.rs:184-200`).
209	-	- [ ] Payments: License key `activation_count` recount on revoke (`db/license_keys.rs:343-382`).
210	-	- [ ] Payments: Subscription minimum-charge check (`payments/checkout.rs:283-317`).
211	-	- [ ] Payments: Webhook v1/v2 unmark-on-failure parity (`routes/stripe/webhook/mod.rs:48-86`).
212	-	- [ ] Storage: `media_files.list_folders` scan filter (`db/media_files.rs:73-82`).
213	-	- [ ] Storage: `pending_uploads.record_pending_upload` silent user-mismatch (`db/pending_uploads.rs:23-33`).
214	-	- [ ] Storage: `append_log_bounded` non-atomic size cap (`build_runner.rs:516-534`).
215	-	- [ ] Storage: `downloads.rs:119-122` presigned-URL expiry — cap `duration_seconds` + DB CHECK ≥ 0.
216	-	- [ ] Security: `validate_token_consuming` for OAuth POST (`routes/oauth.rs:206`).
217	-	- [ ] Security: `parse_repo_path` rejects lone-dot entries (`git_ssh.rs:162`).
218	-	- [ ] Security: ClamAV INSTREAM 16K cap → fail-closed on truncation (`scanning/clamav.rs:101-108`).
219	-	- [ ] Security: TOTP seeds at rest behind an application-level key. Currently unencrypted in the DB; `tech/security.md:42-53` already discloses this and commits to a fix. A database-only compromise yields working second factors today.
220	-	- [ ] AI disclosure: render the tier badge on `pages/item.html` + project page (`> [!UI] ai-tier-badges` in `about/generative-ai.md` is unfilled). Show the `ai_disclosure` text for Assisted items above the buy button so fans see it before purchase. Same badge on item cards in Discover results / search hits.
221	-	- [ ] AI disclosure: pick a shape for the Discover filter — current buckets are "All / Handmade / Assisted / Generated"; `about/generative-ai.md` § "How Fans Use This" promises "Handmade only / Human-led / Everything" (Human-led = Handmade ∪ Assisted). Either rewrite the policy to match buckets, or add the combined filter.
222	-	- [ ] AI disclosure: community report endpoint for misclassified items. The policy commits to fan flagging ("Fans and fellow creators can flag items they believe are misclassified.") but there's no `/report` or `/flag` route.
223	-	- [ ] AI disclosure: drop the `checked` default on the publish wizard's tier radios so the creator has to pick deliberately, OR rephrase the policy's "no unlabeled option" to acknowledge default-handmade. Minor; signal-of-intent only.
224	-	- [ ] UX: validation error messages stop reflecting user input (`wizards/item/mod.rs:176-179`).
225	-	- [ ] UX: CSRF body extraction stops using `from_utf8_lossy` (`csrf.rs:528-543`).
226	-	- [ ] Perf: scan-pipeline 400 MiB worst-case capacity note (`constants.rs:156-157`).
227	-	- [ ] Perf: announcement fan-out persistence + resume (`scheduler/announcements.rs:59-89, 147-177`).
228	-	- [ ] Perf: build log per-line DB roundtrip (`build_runner.rs:516-534`).
229	-
230	-	### Phase 5 — Chronic
231	-
232	-	- [ ] Invariant-in-prose, FOURTH consecutive run. Phase 1 #1 (constructive `commit_upload` helper sealing the lower-level scan/credit/status ops) is the only acceptable resolution. After it lands, audit `compute_splits` (Payments) and `ErrorTemplate` (UX) for the same shape and apply the same treatment.
233	-
234	-	---
235	-
236	-	## Ultra Fuzz 2026-05-26 (Run #4)
237	-
238	-	Full report: `docs/audit_review.md`. Plan target: lift every axis back to A- or higher (Payments A · Storage A · UX A- · Security A- · Performance A-).
239	-
240	-	### Phase 1 — clear HIGH/CRITICAL caps (must do before launch)
241	-
242	-	### Phase 2 — close axis-dragging SERIOUS items
243	-
244	-	### Phase 3 — resilience & infra hardening
245	-
246	-	### Phase 4 — polish
247	-
248	-	### Phase 5 — chronic
249	-
250	-	- [~] Invariant-in-prose / policy-not-in-types — third consecutive run (CHRONIC) — scan_status-ordering half closed 2026-05-26 (see Phase 1 entry for `images.rs::item_image_confirm`). The constructive-impossibility shape from the chronic-remediation rubric: `commit_*_upload` is the only handler-reachable path that writes both row + scan_status; the lower-level scan_status writes were renamed `set_*_scan_status_standalone` and documented as worker- and admin-override-only. Compiler-driven migration found one additional handler with the same bug (CLI internal upload) — that's the test the rubric wants: structural change exposes drift, not human review. Remaining: `/stripe/*` CSRF policy patchwork — same disease, different organ. Track as Landing 2 below.
251	-
252	-	Follow-ups:
253	-	- [ ] Manual-posture runtime assertion (dev builds). Today `*_csrf_manual` requires no compile-time proof that the handler called `validate_token_consuming`. Only the tip handler is Manual, and `_validated` is bound only as documentation. In dev/test builds, set a flag in `validate_token_consuming` and debug-assert it after the handler runs; mismatched routes panic loudly in CI without affecting prod. Not blocking — only matters if Manual grows beyond one route.
254	-	- [ ] Phase 1 entries still open: `cancel_pending_item_checkout` Skip reason is `"Phase 1 todo: tighten to post_csrf"` (grep "Phase 1 todo" to find). `/login` and `creator-tier` template tightening tracked separately above.
255	-
256	-	### Notes & non-actions
257	-
258	-	- Status-notification fan-out cooldown across overlapping tasks (`monitor.rs:213-237`) — single-replica today; harmless. Reconsider when adding a second instance.
259	-	- `record_storage_fill_stats` JOIN (`metrics.rs:181-218`) — 5min cadence is acceptable at 100k users; revisit at 1M.
260	-	- `metadefender` could run concurrently with MalwareBazaar in suspicion path (`scanning/mod.rs:377-398`) — micro-optimization, deferred.
261	-	- `populate_known_sync_apps` startup-only (`rate_limit.rs:65-85`) — paired with the deletion-path item above; together they're a single fix.
262	-
263	-	---
264	-
265	-	## Ultra Fuzz 2026-05-26 (Run #3)
266	-
267	-	Full report: `docs/audit_review.md`. Plan target: lift every axis to A- or higher (Payments A · Storage A- · UX A · Security A- · Performance A-).
268	-
269	-	### Notes & non-actions
270	-
271	-	- Backup-code fast-path malformed-hash trap (`db/totp.rs:155-189`) — log + alert + fall through to legacy path; small, file as polish.
272	-	- `session_cache` TTL window vs admin revoke (`auth.rs:154-191`) — documented as intentional; consider exposing a broadcast invalidate op if operator demand emerges.
273	-	- `monitor.rs` `pg_stat_activity` cadence already covered by Phase 2 split.
274	-
275	-	---
276	-
277	-	## Ultra Fuzz 2026-05-25 (Run #2)
278	-
279	-	Full report: `docs/audit_review.md`.
280	-
281	-	### Outbox follow-ups — convert remaining webhook handlers
282	-
283	-	All five remaining handlers converted to outbox 2026-05-25; migration 125 added `fan_plus_subscription_id` and `creator_subscription_id` parents so each subscription type has its own idempotency anchor.
284	-
285	-	### Current phase — serious / high
286	-
287	-	- [ ] Login template field-aware errors — deferred 2026-05-26. Re-scoped: error-construction infra (`AppError::validation_fields`) is in place in `join_wizard::step_account_create`, but neither signup nor login renders per-field highlights yet. Real work is a new HTMX partial with OOB swaps per input + per-field error containers on both templates. Login itself has only one safe per-field message by design (creds are intentionally generic to avoid enumeration); the value is mostly on the signup side.
288	-	- [~] Scanning peak memory — `scanning/mod.rs:174` already uses `std::sync::Arc::<[u8]>::from(data)` which dispatches through `Vec::into_boxed_slice` and reuses the allocation; SHA-256 streams via `Sha256::update` over the same Arc-shared buffer. No change needed.
289	-	- [~] `check_sales_count_drift` full GROUP BY — the SQL already filters via `HAVING i.sales_count != COUNT(t.id)` (the real bound). The trailing `LIMIT 50` is a per-tick cap on how many drifts to surface, not a cosmetic post-group filter. No action.
290	-
291	-	### Current phase — medium / minor
292	-
293	-	- [~] `pg_stat_activity` baseline load — `monitor.rs:290-294` doc explicitly justifies the 30 s cadence for operator-dashboard refresh; no change.
294	-
295	-	### Deferred — architectural
296	-
297	-	- [ ] Cloudflare-only origin: migrate custom domains to CF for SaaS, then firewall 80/443. Re-scoped 2026-05-26. The original sketch (firewall the origin to CF IP ranges) conflicts with the shipped custom-domain feature (`api/domains.rs` + Caddy `on_demand_tls`), which expects creators' A-records to hit the origin directly. The two threats the firewall was meant to close are already mitigated at layer 7 — `CloudflareIpKeyExtractor` peer-IP fallback (landed 2026-05-26) closes the CF-Connecting-IP spoofing surface; Caddy `client_auth require_and_verify` closes the WAF-bypass surface for `makenot.work`. The proper sequencing is now (1) upgrade to CF Business for CF-for-SaaS, (2) reconfigure CF dashboard with a fallback origin, (3) update `api/domains.rs` onboarding to CNAME instead of A-record, (4) migrate the 1 live custom-domain creator, (5) drop `on_demand_tls` from `Caddyfile`, (6) apply the firewall ACL. Full sequence + ACL sketch + gotchas live in `_meta/docs/incident_response.md` § "Pending Hardening: Cloudflare-only origin firewall". Blocked on the CF plan upgrade + 1 customer email, neither of which happens in-session.
298	-
299	-	---
300	-
301	-	## Ultra Fuzz 2026-05-24 (Run #1)
302	-
303	-	Full report: `docs/audit_review.md`.
304	-
305	-	### Current phase — medium
306	-
307	-	- [~] Status notification parallel fan-out — kept sequential with 100 ms shaper; that pacing is intentional (SMTP rate-limit shape). No change.
308	-
309	-	### Deferred — architectural
310	-
311	-	All four Run #1 deferred items closed by Run #2 sweeps; pointers below.
312	-
313	-	---
314	-
315	-	## Creator applications restructure (replaces waitlist)
316	-
317	-	Discussed and scoped 2026-06-03; no implementation yet. Rename and generalize the existing waitlist into a creator-applications system that lives inside the join wizard, replaces the standalone `/admin/waitlist` surface, and gives fans a settings-page path to apply after the fact. The trigger to start: when the founder cohort fill is no longer well-served by the current waitlist UX, or before opening signup beyond hand-picked invitations — whichever comes first.
318	-
319	-	### Model
320	-
321	-	Three branches in the wizard, decided up front by the signing-up account:
322	-
323	-	- Free trial — short pitch (1–2 sentences: what you make, why MNW). Account exists, `can_create_projects = false`, application status `pending`. Operator approval flips it.
324	-	- Benefits account — longer disclosure (community / mission alignment, the binding-mission-statement framing from the program docs). Same `pending` state, different `application_type` so the admin queue can sort them.
325	-	- Just pay — skip the application entirely, route to Stripe checkout. No approval required — paying is the signal. On subscription activation, `can_create_projects = true` immediately. No `creator_applications` row written.
326	-
327	-	Founder rate (50% off, locked for life when window closes) is available on any branch during the cohort window — free-trial seats get $0, benefits seats may be subsidized, paid seats get the standard founder discount.
328	-
329	-	### Schema migration
330	-
331	-	- [ ] Replace `creator_waitlist` with `creator_applications`. Either rename the table + add columns, or create a sibling and migrate rows. Add `application_type` enum column (`free_trial` | `benefits_account`). Normalize `status` values to `pending` | `approved` | `declined` | `spam`.
332	-	- [ ] Backfill existing waitlist rows as `application_type = 'free_trial'` (preserve `pitch`, `created_at`, decision metadata, `selection_method`, `invited_by_user_id`).
333	-	- [ ] Drop the `db::waitlist` module + its consumers once nothing references it. The `grant_creator_access` helper is the right primitive to keep — move it under `db::creator_applications`.
334	-
335	-	### Wizard
336	-
337	-	- [ ] Insert a new "Choose your entry" step in `join_wizard.rs` flow after profile, before pitch. Three radio options + short descriptions. The chosen branch threads through to the next step.
338	-	- [ ] Rebuild `wizards/steps/join/pitch.html` to branch on `application_type` — different prompt text, different length limits (free-trial short, benefits longer).
339	-	- [ ] Route the paid branch around the application step entirely: profile → Stripe checkout → complete. On webhook activation, no `creator_applications` row is written; `can_create_projects` is granted on `creator_subscriptions.status = 'active'`.
340	-	- [ ] Rewrite `wizards/steps/join/complete.html` so the "Apply for creator access" framing is gone (the question was already answered upstream). Free-trial / benefits accounts see "Application under review"; paid accounts see "Welcome — create your first project."
341	-
342	-	### Dashboard / settings
343	-
344	-	- [ ] New `/settings/creator-access` page for fan-only accounts to submit an application after the fact. Same three branches, same pitch requirements. Lives in the dashboard tab rail, not a marketing page.
345	-	- [ ] Strip the five existing "Apply for Creator Access" CTAs (`partials/tabs/user_projects.html` x2, `partials/tabs/user_creator.html` heading, `pages/creators.html` step list, `wizards/steps/join/complete.html` card). Replace dashboard surfaces with a small "Apply for creator access" link that routes to `/settings/creator-access`. Marketing page (`creators.html`) drops the "apply from your dashboard" framing in favor of "start your free trial during signup."
346	-
347	-	### Pending UX
348	-
349	-	- [ ] Accounts in `pending` status can browse, buy items as a fan, manage profile and settings — they just can't reach creator dashboards. Existing `can_create_projects` guards already block project creation; new behavior is to render an "Application under review" panel (with submitted pitch + submission date) instead of returning 404 or redirecting away.
350	-	- [ ] Email notification on approve / decline, distinct templates per `application_type`. Decline template names the reason; approval template points at the dashboard.
351	-
352	-	### Admin
353	-
354	-	- [ ] Rename `routes/admin/waitlist.rs` → `routes/admin/applications.rs`. Generalize the approve / decline / spam handlers to read `application_type`. The `grant_creator_access` call on approve stays as-is.
355	-	- [ ] Rename `dashboards/admin-waitlist.html` → `dashboards/admin-applications.html`. Add an `application_type` column and a type filter (free_trial / benefits_account / all). The existing stats block (pending / approved / spam / total_creators counts) stays; queries adjust to read the new table.
356	-	- [ ] Update admin navigation (`admin_active_page: "waitlist"` → `"applications"`) and any cross-links in the admin shell.
357	-	- [ ] Sitemap entry update, breadcrumb update.
358	-
359	-	### Tests + acceptance
360	-
361	-	- [ ] Pin: a `pending` account cannot create projects (the existing `can_create_projects` guard already enforces this; verify the rendered panel works).
362	-	- [ ] Pin: a "just pay" signup lands at `can_create_projects = true` AND an active Stripe subscription AND no `creator_applications` row.
363	-	- [ ] Pin: admin queue lists pending applications sorted by `application_type` then `created_at`.
364	-	- [ ] Pin: every removed "Apply for Creator Access" string is gone from `templates/` (grep test in `tests/regression/`).
365	-	- [ ] Pin: founder rate applies regardless of branch during the cohort window (existing founder-pricing logic; new test asserts the multiplier doesn't depend on `application_type`).
366	-
367	-	### Out of scope (this restructure)
368	-
369	-	- Partnership / sponsorship / residency / fellow-led-project applications — those continue via email, no form surface built. If we ever build them, the same `creator_applications` table can host them as additional `application_type` variants.
370	-
371	-	## PoM contract guard (landed 2026-05-25)
372	-
373	-	Schema-drift guard test wired against `shared/pom-contract/`: `src/routes/pages/public/health/mod.rs::tests::pom_hetzner_health_expectations_resolve`. `health_json` body builder extracted as pure `health_json_body(overall, db_ok)` for the test. Catches the v0.5.16-class drift where a field is removed from `/api/health` without updating PoM's expectations. See `MNW/CLAUDE.md` § PoM Health Contract.
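
A minimal sketch of the schema-drift guard described above, using only the standard library. Only the name `health_json_body(overall, db_ok)` comes from the note. The argument types, the map return type, the field names (`status`, `db`), and the `missing_fields` helper are assumptions for illustration, not the project's actual code or the real PoM contract.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the pure body builder. The real function's
/// argument types, return type, and field names are not shown in the note;
/// everything here is assumed for illustration.
fn health_json_body(overall: &str, db_ok: bool) -> BTreeMap<&'static str, String> {
    let mut body = BTreeMap::new();
    body.insert("status", overall.to_string());
    body.insert("db", if db_ok { "ok" } else { "down" }.to_string());
    body
}

/// Hypothetical helper: returns every expected field that the body does not
/// contain. A non-empty result means the endpoint drifted from the contract.
fn missing_fields<'a>(
    body: &BTreeMap<&'static str, String>,
    expected: &[&'a str],
) -> Vec<&'a str> {
    expected
        .iter()
        .copied()
        .filter(|field| !body.contains_key(*field))
        .collect()
}
```

A guard test in this shape would build the body once and assert that `missing_fields` is empty for the contract's expected field list, so removing a field from the builder fails the test before the monitor notices.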

D shared/theme-common/todo.md -31

		@@ -1,31 +0,0 @@
1	-	# theme-common TODO
2	-
3	-	## Active
4	-
5	-	(none — crate is stable, consumed by audiofiles + goingson)
6	-
7	-	## Future — Unified theme library + per-user MNW theming
8	-
9	-	Scope: One canonical theme library serving every app under Make Creative (audiofiles, goingson, balanced_breakfast, MNW server-rendered UI), plus a Fan+ perk that lets MNW users override the platform default CSS per-account.
10	-
11	-	Why:
12	-	- Today every app maintains its own theme folder: `Apps/audiofiles/crates/audiofiles-browser/themes/` (28 themes), `Apps/goingson/src-tauri/frontend/themes/helix/` (9 themes), `MNW/shared/themes/` (28 themes). Drift is inevitable; theme counts are already wrong in 5+ docs.
13	-	- A unified library shipped from `MNW/shared/themes/` (or a successor crate here in `theme-common/`) would centralize the source of truth and let documentation point at one directory.
14	-	- Theming the MNW default CSS per user is a natural Fan+ benefit: pay tier gets a theme picker + custom CSS slot stored against the account, applied via `<link rel="stylesheet">` injected into every authenticated page render.
15	-
16	-	Pieces to design:
17	-	- Single canonical theme TOML schema (audit existing helix-format vs MNW theme TOMLs for shape divergence).
18	-	- A loader contract every app already supports (theme-common already does this for native apps; MNW server-side rendering needs an equivalent for CSS variable injection into Askama templates).
19	-	- Per-user storage: new column on the users table (or a separate `user_themes` table) holding either a theme ID + custom CSS overrides, or a full custom theme blob (with size cap and validation).
20	-	- Fan+ gating: theme picker visible to all signed-in users (apply a built-in theme); custom CSS slot gated behind Fan+ status.
21	-	- CSS sanitization for the custom-CSS slot — accept only declarations, no `@import`, no `url()` to off-origin, no `expression()` (defunct but defensive). Probably easier to enforce a CSS property allowlist than to sanitize freeform CSS.
22	-	- Migration path: drop hard theme counts from app READMEs in favor of "see the themes directory" (already done for the launch).
23	-
24	-	Not Monday work. Surfaced here so it's tracked. The Monday docs drop hard counts and point at directories — that posture is already correct for whenever this lands.
25	-
26	-	Key paths:
27	-	- `MNW/shared/theme-common/` (this crate — likely host for the canonical loader)
28	-	- `MNW/shared/themes/` (current cross-app theme store)
29	-	- `Apps/audiofiles/crates/audiofiles-browser/themes/` (consumer; would migrate)
30	-	- `Apps/goingson/src-tauri/frontend/themes/helix/` (consumer; would migrate)
31	-	- `MNW/server/templates/` (would gain user-CSS injection hook)
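
The property-allowlist idea from the CSS sanitization bullet above can be sketched roughly as follows. This is an illustration under stated assumptions, not a vetted sanitizer: the function name `sanitize_declarations`, the allowed property list, and the forbidden-token list are all hypothetical. It is also stricter than the bullet, since it rejects every `url()` rather than only off-origin ones.

```rust
/// Hypothetical allowlist; the real set of themable properties is undecided.
const ALLOWED_PROPERTIES: &[&str] = &["color", "background-color", "font-family", "font-size"];

/// Tokens that disqualify a value outright (assumed list, checked case-insensitively).
const FORBIDDEN_IN_VALUE: &[&str] = &["url(", "expression(", "@import"];

/// Keeps only `property: value` declarations whose property is allowlisted
/// and whose value contains no forbidden token. Anything that is not a plain
/// declaration (at-rules, selectors, blocks) is dropped.
fn sanitize_declarations(input: &str) -> String {
    let mut kept: Vec<String> = Vec::new();
    for declaration in input.split(';') {
        // Fragments without a colon are not declarations; skip them.
        if let Some((property, value)) = declaration.split_once(':') {
            let property = property.trim().to_ascii_lowercase();
            let value = value.trim();
            let lowered = value.to_ascii_lowercase();

            let allowed = ALLOWED_PROPERTIES.contains(&property.as_str());
            // Backslashes are rejected because CSS escapes could spell a
            // forbidden token; braces and '<' could break out of the slot.
            let has_risky_char = value
                .chars()
                .any(|c| matches!(c, '{' | '}' | '\\' | '<'));
            let has_forbidden_token = FORBIDDEN_IN_VALUE
                .iter()
                .any(|token| lowered.contains(*token));

            if allowed && !value.is_empty() && !has_risky_char && !has_forbidden_token {
                kept.push(format!("{property}: {value};"));
            }
        }
    }
    kept.join(" ")
}
```

Rebuilding the output from parsed pieces, rather than deleting bad substrings from the input, means anything the sketch does not recognize is dropped by default.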