max / makenotwork

sando: capture Session 1 plan - ship the full versioned bundle

Result of the design step-back after Phase A landed. Supersedes the (a)/(b) tier-B decision in launchplan_final §6.5 with a five-category inventory of what prod actually serves and a layout that lets sando own exactly the right subset (code + version-coupled content).

Three sessions ahead:
1. sando learns to ship the bundle (no prod changes) - this plan
2. testnot migration (low-risk practice + unblocks tier-A exercise)
3. prod migration (the careful one)

Plan is self-contained: investigate step, code-change list with file paths, on-MM test procedure, acceptance criteria, out-of-scope notes, open questions to answer during the session itself.
Author: Max Johnson <me@maxj.phd> · 2026-06-03 01:26 UTC
Commit: 3548282bd9f3609d39b08482113ca507d66621f1
Parent: 376e2a2
1 file changed, +199 insertions, -0 deletions
@@ -0,0 +1,199 @@
1 + # Session 1 — Sando ships the full versioned bundle
2 +
3 + Plan captured 2026-06-02 after the design-step-back conversation following Phase A landing and the cargo_test/MM diagnosis push. Resolves the tier-B-strategy decision (§6.5 step 6 of `launchplan_final.md`) by going past the (a1)/(a2)/(b) trichotomy to a proper layout redesign.
4 +
5 + Status: ready to pick up. Open questions in §F must be answered during Session 1 itself.
6 +
7 + ## Background — the trade as it stood
8 +
9 + Today sando ships only `{makenotwork, mnw-admin, error-pages}` into `releases/<v>/`. Prod's `/opt/makenotwork/` also contains `docs/`, `static/`, `yara-rules/`, `.env`, `backups/`, `scan-spool/` — none of which sando manages. `deploy.sh` ships content; sando ships binaries. The split is the source of the friction we keep hitting.
10 +
11 + The right answer isn't (a1) "extend sando to ship everything in one piece" or (a2) "leave content on the old path forever" — it's a layout that separates the five things prod actually serves and lets sando own exactly the right subset.
12 +
13 + ## Layout
14 +
15 + ```
16 + /opt/mnw/releases/<v>/ # sando-managed: code + version-coupled content
17 + ├── makenotwork
18 + ├── mnw-admin
19 + ├── static/
20 + ├── docs/
21 + └── error-pages/
22 + /opt/mnw/current → releases/<v>/ # atomic-swap symlink
23 + /opt/mnw/yara-rules/ # operator-managed (separate cadence)
24 + /etc/mnw/makenotwork.env # operator-managed secrets
25 + /var/lib/mnw/ # runtime state
26 + ├── backups/
27 + ├── scan-spool/
28 + └── git/ # GIT_REPOS_PATH (moved from /opt/git)
29 + ```
30 +
31 + Systemd unit:
32 +
33 + ```
34 + ExecStart=/opt/mnw/current/makenotwork
35 + WorkingDirectory=/opt/mnw/current
36 + EnvironmentFile=/etc/mnw/makenotwork.env
37 + ReadWritePaths=/var/lib/mnw /opt/mnw/yara-rules
38 + StateDirectory=mnw/scan-spool
39 + ```
40 +
41 + ## Principle
42 +
43 + A sando "release" is the atomic versioned bundle made of categories 1 and 2 of the inventory below: code plus version-coupled content. Anything that doesn't change in lockstep with the binary doesn't belong inside the release dir. Secrets and state live on paths sando never touches. This is what gives atomic rollback its actual value: one symlink swap moves every version-coupled thing together.
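
To make the "one symlink swap" concrete, here is a minimal sketch of an atomic `current` swap. It assumes the `releases/<v>/` + `current` layout above; the function name `swap_current` and the temp-link name `current.tmp` are hypothetical, and this is not necessarily how sando's deploy code does it today.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::symlink;
use std::path::Path;

/// Point `<root>/current` at `releases/<version>` in one atomic step.
/// Hypothetical helper: builds the new link under a temp name, then
/// renames it over `current`, so readers never see a missing link.
fn swap_current(root: &Path, version: &str) -> io::Result<()> {
    let tmp = root.join("current.tmp");
    // Clear a leftover temp link from an interrupted earlier swap.
    let _ = fs::remove_file(&tmp);
    // Relative target, matching `current -> releases/<v>/` in the layout.
    symlink(Path::new("releases").join(version), &tmp)?;
    // rename(2) replaces an existing `current` link atomically.
    fs::rename(&tmp, root.join("current"))
}
```

Rollback is the same call with the previous version string, which is why everything version-coupled has to live under the release dir.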
44 +
45 + ### The five categories (what prod actually serves)
46 +
47 + 1. **Versioned code** — `makenotwork`, `mnw-admin`. Tied to a git sha.
48 + 2. **Versioned content compiled-against-the-binary** — `static/` (cache-busted via compiled-in version), `error-pages/`, `docs/` (DocEngine content, askama-template-coupled). Skew between these and the binary is a bug class.
49 + 3. **Versioned content with its own update cadence** — `yara-rules/` (malware-scan rules, gated by SCAN_ENABLED). Updates independently of MNW releases.
50 + 4. **Secrets / config** — `.env`, 45 keys. Outlives every version; never in the repo.
51 + 5. **Runtime state** — `backups/`, `scan-spool/`, `git/`. Lives outside any version.
52 +
53 + Sando owns 1+2 as one atomic bundle. 3 stays operator-managed (revisit later if a real incident makes atomic yara-rules rollback necessary). 4 and 5 are not sando's concern.
54 +
55 + ## Session goal
56 +
57 + Sando produces a staged release dir with the full 1+2 bundle on the MM tier. No prod or testnot changes. Validate by `POST /rebuild` against MNW main and inspecting `/srv/sando/releases/<v>/`.
58 +
59 + ## A. Investigate first (~30 min)
60 +
61 + 1. `MNW/shared/docengine/` — find the build entry point. Is it a `cargo build -p docengine` producing a binary? A `build.rs` step in `server/`? A separate `docengine compile` command run over `site-docs/`? Determines whether sando's `build.rs` invokes cargo once or twice. If DocEngine compile is a cargo step, it falls out of the existing `cargo build --release`; if it's a separate binary that runs over `site-docs/`, sando needs to invoke it after the cargo build.
62 + 2. Confirm `server/static/` is real static (no preprocessing). Probably true — server/CONTRIBUTING.md notes "New JS goes in server/static/ as static files".
63 + 3. `mnw-admin` callers — grep `mnw-admin` across `server/` to confirm what invokes it. Per the user: build scripts + possibly the admin page. Those callers need to update to `/opt/mnw/current/mnw-admin` post-migration. Not a Session 1 change, just a note for Session 3.
64 +
65 + ## B. Sando code changes (`MNW/sando/daemon/src/`)
66 +
67 + ### `build.rs::build_and_run_mm`
68 +
69 + Currently stages `<release>/{makenotwork, mnw-admin, error-pages/}` via `deploy::deploy_local` (binaries) + a `cp -a` for error-pages. Extend to also stage:
70 +
71 + - `<release>/static/` ← `worktree/server/static/` (cp -a)
72 + - `<release>/docs/` ← `worktree/server/target/release/docs/` (assuming DocEngine outputs there) or `worktree/server/site-docs/` raw, depending on §A.1.
73 +
74 + Refactor: pull the per-asset `cp -a` calls into a single helper
75 +
76 + ```rust
77 + fn stage_dir(src: &Path, dst: &Path, required: bool) -> Result<()> { ... }
78 + ```
79 +
80 + so the four sites (error-pages, static, docs, future) read uniformly. Missing-source policy: required=true errors; required=false logs warn and skips (so older shas without one of these don't break sando mid-bisect).
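
A minimal sketch of that helper and its missing-source policy, using only the standard library. Assumptions: `io::Result` stands in for whatever `Result` alias sando uses, `copy_tree` is a hypothetical stand-in for the current `cp -a` call (it does not preserve symlinks or ownership), and the warn goes to stderr rather than through sando's logger.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Recursively copy `src` into `dst`. Std-only stand-in for `cp -a`;
/// symlinks are followed, not preserved.
fn copy_tree(src: &Path, dst: &Path) -> io::Result<()> {
    fs::create_dir_all(dst)?;
    for entry in fs::read_dir(src)? {
        let entry = entry?;
        let target = dst.join(entry.file_name());
        if entry.file_type()?.is_dir() {
            copy_tree(&entry.path(), &target)?;
        } else {
            fs::copy(entry.path(), &target)?;
        }
    }
    Ok(())
}

/// Stage one asset directory into the release dir.
/// required=true: a missing source is an error.
/// required=false: a missing source is logged and skipped.
fn stage_dir(src: &Path, dst: &Path, required: bool) -> io::Result<()> {
    if !src.is_dir() {
        if required {
            return Err(io::Error::new(
                io::ErrorKind::NotFound,
                format!("required release asset missing: {}", src.display()),
            ));
        }
        eprintln!("warn: optional asset missing, skipping: {}", src.display());
        return Ok(());
    }
    copy_tree(src, dst)
}
```

Returning `Ok(())` on the optional-skip path keeps the call sites uniform: the caller only distinguishes "staging failed" from "staging did what the policy says".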
81 +
82 + ### `config.rs`
83 +
84 + Replace the `bin_names: Vec<String>` "primary binary" framing with two fields:
85 +
86 + ```rust
87 + pub bin_names: Vec<String>, // what cargo produces under target/release/
88 + pub release_contents: Vec<ReleaseEntry>, // what gets staged from worktree → release dir
89 +
90 + pub struct ReleaseEntry {
91 +     pub src: PathBuf, // relative to worktree root

92 +     pub dst: PathBuf, // relative to release dir

93 +     pub required: bool,
94 + }
95 + ```
96 +
97 + Default value for MNW lives in `sando-daemon.toml`, not hard-coded in sando:
98 +
99 + ```toml
100 + release_contents = [
101 +     { src = "server/deploy/error-pages", dst = "error-pages", required = false },

102 +     { src = "server/static", dst = "static", required = true },

103 +     { src = "server/target/release/docs", dst = "docs", required = true },
104 + ]
105 + ```
106 +
107 + This pulls MNW-specific knowledge out of sando code into sando config — closer to what sando wants to be as a generic deploy controller. Tests can override with a fixture release_contents.
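
Once entries come from config rather than code, it may be worth resolving them defensively. The sketch below is one possible shape, not existing sando code: `resolve` and `is_plain_relative` are hypothetical names, and rejecting absolute paths and `..` components is an added assumption that follows from "relative to worktree root" / "relative to release dir".

```rust
use std::path::{Component, Path, PathBuf};

pub struct ReleaseEntry {
    pub src: PathBuf, // relative to worktree root
    pub dst: PathBuf, // relative to release dir
    pub required: bool,
}

/// True when the path is non-empty and made only of normal components
/// (no leading `/`, no `.` prefix, no `..`).
fn is_plain_relative(p: &Path) -> bool {
    let mut parts = p.components().peekable();
    parts.peek().is_some() && parts.all(|c| matches!(c, Component::Normal(_)))
}

/// Turn a config entry into absolute (source, destination) paths,
/// refusing entries that could escape either root.
fn resolve(
    entry: &ReleaseEntry,
    worktree: &Path,
    release: &Path,
) -> Result<(PathBuf, PathBuf), String> {
    if !is_plain_relative(&entry.src) || !is_plain_relative(&entry.dst) {
        return Err(format!(
            "release_contents entry must use plain relative paths: {} -> {}",
            entry.src.display(),
            entry.dst.display()
        ));
    }
    Ok((worktree.join(&entry.src), release.join(&entry.dst)))
}
```

The resolved pair is what a staging helper would take as its `src` and `dst`, with `entry.required` passed through unchanged.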
108 +
109 + ### `deploy.rs::deploy_node`
110 +
111 + Already rsyncs the whole staged dir, so no changes once `build.rs` stages more into it. Verify rsync flags include `-a` and `--delete` so removed assets don't accumulate across versions (e.g. a deleted static asset).
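
For the flag check, a small sketch of what the argument list would need to look like. `rsync_args` is a hypothetical helper, not the current `deploy_node` code, and the remote spec in the usage note is made up for illustration. The trailing slash on the source matters: it makes rsync copy the directory's contents into the destination instead of nesting the directory inside it.

```rust
use std::path::Path;

/// Build rsync arguments for mirroring a staged release dir.
/// `-a` preserves modes/times/symlinks; `--delete` removes files on the
/// receiver that no longer exist in the staged dir.
fn rsync_args(staged: &Path, remote: &str) -> Vec<String> {
    let src = staged.display().to_string();
    vec![
        "-a".to_string(),
        "--delete".to_string(),
        // Exactly one trailing slash: sync contents, not the dir itself.
        format!("{}/", src.trim_end_matches('/')),
        remote.to_string(),
    ]
}
```

The result can be handed to `std::process::Command::new("rsync").args(...)`, for example with a remote such as `deploy@node:/opt/mnw/releases/<v>/`.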
112 +
113 + ### `bootstrap-node.sh`
114 +
115 + Unit template now writes:
116 +
117 + ```
118 + ExecStart=/opt/mnw/current/makenotwork
119 + WorkingDirectory=/opt/mnw/current
120 + EnvironmentFile=/etc/mnw/makenotwork.env
121 + ReadWritePaths=/var/lib/mnw /opt/mnw/yara-rules
122 + StateDirectory=mnw/scan-spool
123 + ```
124 +
125 + Plus `install -d -o root -g $SERVICE_USER -m 0750 /etc/mnw /var/lib/mnw` upfront. Drop the `EnvironmentFile=-/opt/mnw/.env` default — env file lives at `/etc/mnw/makenotwork.env` now.
126 +
127 + ### Tier rename `mm` → `host`
128 +
129 + Folded in here while editing topology code. Schema migration in `sando/daemon/migrations/` renames the `tiers` row + any FK references in `tier_state`, `gate_runs`, `deploys`. Plus code sweep: `build.rs`, `routes.rs`, `sando.toml`, test fixtures. Add a defensive assert that loudly fails if `tier='mm'` is still queried anywhere — silent miss would be worse than a panic during the transition.
130 +
131 + ~30 lines of grep+replace + 1 sqlite migration.
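
The defensive assert could be as small as a single choke point that every tier lookup passes through. A sketch, assuming tier names are plain strings at the query boundary; `checked_tier` and `RETIRED_TIER` are hypothetical names.

```rust
/// Old tier name that must not appear in any query after the rename.
const RETIRED_TIER: &str = "mm";

/// Pass a tier name through on its way to a query; panic loudly if a
/// caller still uses the retired name.
fn checked_tier(name: &str) -> &str {
    assert_ne!(
        name, RETIRED_TIER,
        "tier 'mm' was renamed to 'host'; update the caller"
    );
    name
}
```

A call site would then read like `query_tier_state(checked_tier(tier))`, where `query_tier_state` is a placeholder for whichever function hits the database.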
132 +
133 + ## C. Test on MM
134 +
135 + 1. `cargo test --release --features fast-tests` in `sando/daemon/` — all existing tests pass + any new staging tests added (e.g. `stage_dir copies on success`, `stage_dir errors when required-missing`, `release_contents config parses`).
136 + 2. Build sandod, install, restart on pop-os.
137 + 3. `POST /rebuild` against current MNW main.
138 + 4. Inspect `/srv/sando/releases/<v>/` — should contain `makenotwork`, `mnw-admin`, `error-pages/`, `static/`, `docs/`. Verify total size is sane (single-digit MB for binary, tens of MB for static+docs).
139 + 5. boot_smoke still passes (binary doesn't care what's in the dir alongside it).
140 + 6. cargo_test still green.
141 +
142 + ## D. Acceptance
143 +
144 + - A `/rebuild` against MNW main produces a staged release with all four asset groups (binaries, error-pages, static, docs).
145 + - Existing MM gates stay green.
146 + - `/srv/sando/releases/<v>/` is self-contained: deleting `/srv/sando/work/<sha>/` and re-running boot_smoke against the staged binary works. Sanity check that no path in the staged tree reaches back into the worktree.
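
One way to automate part of that sanity check is to walk the staged tree and flag symlinks that resolve outside it. This sketch covers symlinks only (it would not catch absolute paths baked into file contents); `escaping_links` is a hypothetical name, and dangling links are reported too, since they would also break a self-contained release.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Return every symlink under `release` whose target is dangling or
/// resolves to a path outside the release dir.
fn escaping_links(release: &Path) -> io::Result<Vec<PathBuf>> {
    let root = fs::canonicalize(release)?;
    let mut bad = Vec::new();
    let mut stack = vec![root.clone()];
    while let Some(dir) = stack.pop() {
        for entry in fs::read_dir(&dir)? {
            let path = entry?.path();
            // symlink_metadata inspects the link itself, not its target.
            let meta = fs::symlink_metadata(&path)?;
            if meta.file_type().is_symlink() {
                match fs::canonicalize(&path) {
                    Ok(target) if target.starts_with(&root) => {}
                    _ => bad.push(path),
                }
            } else if meta.is_dir() {
                stack.push(path);
            }
        }
    }
    Ok(bad)
}
```

An empty result, combined with the delete-the-worktree-and-rerun check above, gives reasonable confidence that the staged dir stands on its own.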
147 +
148 + ## E. Out of scope for Session 1 (Session 2/3)
149 +
150 + - Any change to testnot or prod
151 + - Moving `/opt/git`, `/opt/makenotwork/.env`, state dirs
152 + - Authorizing sando's pubkey on prod
153 + - Renaming the systemd service path
154 + - yara-rules — stays operator-managed; out of sando entirely
155 + - Caddy config (operator-managed; uses bundle path indirectly via `localhost:3000` reverse_proxy + per-dir paths that point at `/opt/mnw/current/...` post-migration)
156 +
157 + ## F. Open questions to answer during Session 1
158 +
159 + - **Total bundle size after staging.** If it's >50MB the rsync time per deploy gets noticeable. Worth measuring; not blocking. Sets expectations for Session 2/3.
160 + - **DocEngine build path.** Whether `cargo build --release` already produces `target/release/docs/` (if DocEngine is a build-script step) or needs an explicit `cargo run -p docengine -- build` (if it's a separate command). Determines whether `build.rs` in sando does one cargo invocation or two.
161 + - **DocEngine output format.** Is `target/release/docs/` a directory tree of HTML/assets ready to serve, or a single bundle file? Affects `stage_dir` semantics.
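
For the bundle-size question, a std-only sketch that sums file sizes under a staged release dir and compares against the 50MB mark. `dir_size` and `over_threshold` are hypothetical names, and reading "50MB" as 50,000,000 bytes is an assumption.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Assumed reading of ">50MB": decimal megabytes.
const THRESHOLD_BYTES: u64 = 50 * 1_000_000;

/// Total size in bytes of regular files under `dir`. Symlinks are not
/// followed, so linked content is not counted twice.
fn dir_size(dir: &Path) -> io::Result<u64> {
    let mut total = 0;
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        let meta = fs::symlink_metadata(&path)?;
        if meta.is_dir() {
            total += dir_size(&path)?;
        } else if meta.is_file() {
            total += meta.len();
        }
    }
    Ok(total)
}

/// True when the bundle is large enough that rsync time is worth watching.
fn over_threshold(bytes: u64) -> bool {
    bytes > THRESHOLD_BYTES
}
```

Running it once per subdirectory (`static/`, `docs/`, the binaries) would also show which part of the bundle dominates.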
162 +
163 + ## Sessions 2 and 3 (out of scope but for context)
164 +
165 + **Session 2 — testnot migration (low-risk practice).**
166 + - `bootstrap-node.sh` on testnot with the new unit shape.
167 + - Write `/etc/mnw/makenotwork.env` on testnot from scratch — this is what was missing during the 2026-06-02 tier-A exercise attempt.
168 + - `POST /promote/a` from sando → boots green for the first time.
169 + - Exercise §6.5 step 3 tier-A flow we skipped.
170 +
171 + **Session 3 — prod migration (the careful one).**
172 + - Inventory + dry-run plan (most of inventory done 2026-06-02; one more pass for exact mv/install sequence).
173 + - Lock the deploy.sh path during the migration window.
174 + - Stop makenotwork.service.
175 + - `bootstrap-node.sh` on prod with `SANDO_PUBKEY=…`, creating `deploy` user + `/opt/mnw/` + new unit.
176 + - `mv /opt/makenotwork/.env /etc/mnw/makenotwork.env`.
177 + - `mv /opt/makenotwork/{backups,scan-spool}` → `/var/lib/mnw/` (or rebuild scan-spool — it's transient).
178 + - `mv /opt/git /var/lib/mnw/git` + update `GIT_REPOS_PATH` in `/etc/mnw/makenotwork.env`.
179 + - Audit all PATH-typed env keys against the new layout: `DOCS_PATH=/opt/mnw/current/docs`, `YARA_RULES_DIR=/opt/mnw/yara-rules`, `ASSUMPTIONS_PATH=/opt/mnw/current/docs/business/assumptions.toml`, etc.
180 + - Update build scripts on prod that invoke `mnw-admin` to use `/opt/mnw/current/mnw-admin`.
181 + - `POST /promote/b {"hotfix":true}` from sando → first sando deploy to prod.
182 + - Start makenotwork.service under the new layout.
183 + - Verify makenot.work end-to-end.
184 + - Soak for a week.
185 + - `rm -rf /opt/makenotwork/` after the soak; archive `deploy.sh` as break-glass-only.
186 +
187 + ## Key paths (for Claude orientation)
188 +
189 + - `MNW/sando/daemon/src/build.rs` — where staging happens (`build_and_run_mm`).
190 + - `MNW/sando/daemon/src/config.rs` — Config struct, `bin_names`.
191 + - `MNW/sando/daemon/src/deploy.rs` — `deploy_local`, `deploy_node`.
192 + - `MNW/sando/deploy/bootstrap-node.sh` — unit template.
193 + - `MNW/sando/sando.toml` — topology config (tier names live here).
194 + - `MNW/sando/daemon/sando-daemon.toml` — daemon config (release_contents will go here).
195 + - `MNW/sando/daemon/migrations/` — sqlite migrations for the mm→host rename.
196 + - `MNW/server/static/`, `MNW/server/site-docs/`, `MNW/server/deploy/error-pages/` — staging sources.
197 + - `MNW/shared/docengine/` — DocEngine crate (investigate in §A.1).
198 + - `launchplan_final.md` §6.5 — original tier-B decision context this redesign supersedes.
199 + - `MNW/sando/plans/config-artifacts.md` — earlier Phase 3 design doc on config vs binary artifacts; complementary background.