# sando

Home-rolled CI/CD controller for the MNW server. Axum daemon (`sandod`) +
ratatui TUI (`sando`). Gates a tiered deploy flow:

```
git push mm -> MakeMachine (build + tests + migration dry-run + boot smoke)
            -> A (testnot.work)
            -> B (prod-1)
            -> C (prod-2)
```

Each tier's progression gates are declared in `sando.toml`. Tiers and nodes
live in the TOML, not in code: adding a node or a new tier is a config edit.
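
To illustrate what "tiers live in config, not in code" can look like on the Rust side, here is a minimal sketch of an in-memory topology with a predecessor lookup. The names `Tier`, `Node`, `predecessor`, and `tier` are hypothetical and not taken from the sandod source, and the sketch assumes that declaration order in the config defines the deploy order.

```rust
// Hypothetical in-memory model of a sando topology. Names are illustrative
// only and are not taken from the sandod source.
#[derive(Debug, Clone, PartialEq)]
struct Node {
    name: String,
    ssh_target: String,
}

#[derive(Debug, Clone, PartialEq)]
struct Tier {
    name: String,
    gates: Vec<String>, // gate kinds, e.g. "boot_smoke"
    nodes: Vec<Node>,
}

/// Returns the tier declared immediately before `name`, if any.
/// Assumption: declaration order in the config defines the deploy order.
fn predecessor<'a>(tiers: &'a [Tier], name: &str) -> Option<&'a Tier> {
    let idx = tiers.iter().position(|t| t.name == name)?;
    idx.checked_sub(1).map(|i| &tiers[i])
}

/// Small constructor so examples stay short.
fn tier(name: &str, gates: &[&str], nodes: &[(&str, &str)]) -> Tier {
    Tier {
        name: name.to_string(),
        gates: gates.iter().map(|g| g.to_string()).collect(),
        nodes: nodes
            .iter()
            .map(|(n, s)| Node {
                name: n.to_string(),
                ssh_target: s.to_string(),
            })
            .collect(),
    }
}
```

Under this model, adding a tier means one more element in the slice; `predecessor` needs no change.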

## Crates

| Path | Binary | Role |
|------|--------|------|
| `daemon/` | `sandod` | Axum daemon. Runs on the MakeMachine. Owns SQLite state, the bare git repo, and all build/gate/deploy logic. |
| `tui/` | `sando` | ratatui front-end. Runs on the laptop. Talks to `sandod` over the tailnet. |

## Quickstart: localhost dev loop

The MakeMachine hardware does not exist yet, so v0 runs entirely on a single
host. Bare repo, releases dir, "remote" A node: everything is a local
directory.

```bash
# Run these from the directory that contains MNW/.

# 1. Build both binaries (subshells keep the working directory unchanged,
#    so the relative paths in steps 3 and 5 still resolve).
(cd MNW/sando/daemon && cargo build)
(cd MNW/sando/tui && cargo build)

# 2. Create a workspace + config.
mkdir -p /tmp/sando-dev
cat > /tmp/sando-dev/daemon.toml <<EOF
listen = "127.0.0.1:7766"
db_path = "/tmp/sando-dev/sando.db"
topology_path = "/tmp/sando-dev/sando.toml"
workdir = "/tmp/sando-dev/work"
release_root = "/tmp/sando-dev/releases"
# scratch_db_url = "postgres://you@127.0.0.1/sando_scratch"
EOF

cat > /tmp/sando-dev/sando.toml <<EOF
[repo]
bare_path = "/tmp/sando-dev/mnw.git"
branch = "main"

[backup]
source = "file:///tmp/sando-dev/fake-backup.sql"
local_path = "/tmp/sando-dev/backup.sql"

[[tier]]
name = "mm"
provisioned = true
gates = [
  { kind = "cargo_test" },
  { kind = "migration_dry_run" },
  { kind = "boot_smoke" },
]

[[tier]]
name = "a"
provisioned = true
canary = "sequential"
gates = [
  { kind = "boot_smoke" },
  { kind = "manual_confirm" },
]
  [[tier.node]]
  name = "a-local"
  ssh_target = "local"
  release_root = "/tmp/sando-dev/a-node"
EOF

# 3. Run the daemon (stays in the foreground).
SANDO_CONFIG=/tmp/sando-dev/daemon.toml \
  ./MNW/sando/daemon/target/debug/sandod

# 4. In another shell: point a clone at the bare repo and push.
git clone /tmp/sando-dev/mnw.git /tmp/sando-dev/checkout
# ... add a `server/Cargo.toml` + source so the build can run ...
(cd /tmp/sando-dev/checkout && git push origin main)

# 5. Watch the TUI.
SANDO_DAEMON=http://127.0.0.1:7766 ./MNW/sando/tui/target/debug/sando
```

When you push, the bare repo's `post-receive` hook (installed automatically
by `sandod` on startup) calls `POST /rebuild`. The daemon checks out the
sha, runs `cargo build --release` against `server/`, stages the binary in
`releases/<version>/server`, then runs the MM tier's gates. On green, MM's
`tier_state` advances. Promote with:

```bash
curl -X POST http://127.0.0.1:7766/promote/a \
  -H 'Content-Type: application/json' \
  -d '{"version":"0.8.2"}'
```
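
The "runs the MM tier's gates, advances on green" step can be pictured as a loop over the gate kinds. The sketch below is one possible shape, not sandod's code: `GateOutcome`, `run_gates`, and `tier_may_advance` are hypothetical names, and stopping at the first failed gate is an assumption (the text above only says the tier advances when the gates are green).

```rust
/// Result of running one gate. Hypothetical type, not sandod's own.
#[derive(Debug, Clone, PartialEq)]
enum GateOutcome {
    Passed,
    Failed(String),
}

/// Runs gates in declaration order and stops at the first failure
/// (assumed behaviour). Returns the kinds that passed, or the failing
/// kind together with its reason.
fn run_gates<F>(gates: &[&str], mut run: F) -> Result<Vec<String>, (String, String)>
where
    F: FnMut(&str) -> GateOutcome,
{
    let mut passed = Vec::new();
    for &kind in gates {
        match run(kind) {
            GateOutcome::Passed => passed.push(kind.to_string()),
            GateOutcome::Failed(reason) => return Err((kind.to_string(), reason)),
        }
    }
    Ok(passed)
}

/// The tier only advances when every gate is green.
fn tier_may_advance<F>(gates: &[&str], run: F) -> bool
where
    F: FnMut(&str) -> GateOutcome,
{
    run_gates(gates, run).is_ok()
}
```

The `run` closure stands in for whatever actually executes a gate (`cargo test`, the migration dry-run, the boot smoke test), which keeps the ordering logic testable without running any of them.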

## API

| Method | Path | Body | Purpose |
|--------|------|------|---------|
| GET | `/state` | | Tier list + current/previous version + last gate outcomes |
| POST | `/rebuild` | `{sha?: string}` | Force a build; if `sha` is absent, resolves the configured deploy branch. Aborts any in-flight build (latest wins). |
| POST | `/promote/{tier}` | `{version?, hotfix?, reset_burn_in?}` | Verify predecessor gates, deploy to tier nodes, advance state. `version` defaults to the predecessor tier's `current_version`. |
| POST | `/rollback/{tier}` | | Swap `current` symlink to `previous_version` on every node in the tier |
| POST | `/confirm/{tier}` | | Insert a passing `manual_confirm` gate row for the tier's `current_version`. Replaces hand-SQL. |
| POST | `/backup/fetch` | | Pull the prod backup. Supports `file://`, `rsync://`, `ssh://user@host[:port]/path`. |
| GET | `/metrics` | | Prometheus exposition |
| GET | `/events` | | WebSocket stream of typed events (RebuildRequested, BuildStart/Ok/Failed, GateStart/Done, DeployStart/Ok/Failed, PromoteComplete, Rollback, BackupFetched, ManualConfirm, BuildAborted). |
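
For the `current` symlink swap that `/rollback/{tier}` performs, a common technique is to create a temporary link and rename it over the old one, so readers never see a missing `current`. The sketch below shows that technique for a local node only. It assumes releases live at `<release_root>/<version>`; the function name `point_current_at` and the temp-link name `.current.tmp` are hypothetical, and sandod may do this differently (for example over SSH). Unix only.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::symlink;
use std::path::Path;

/// Repoints `<release_root>/current` at `<release_root>/<version>`.
/// Hypothetical helper; the directory layout is an assumption.
fn point_current_at(release_root: &Path, version: &str) -> io::Result<()> {
    let target = release_root.join(version);
    if !target.is_dir() {
        return Err(io::Error::new(
            io::ErrorKind::NotFound,
            format!("no release directory for version {}", version),
        ));
    }
    // Build the new link under a temporary name first.
    let tmp = release_root.join(".current.tmp");
    let _ = fs::remove_file(&tmp); // ignore the error if no stale link exists
    symlink(&target, &tmp)?;
    // rename(2) replaces an existing `current` link in one step.
    fs::rename(&tmp, release_root.join("current"))
}
```

A rollback in this model is `point_current_at(root, previous_version)`, repeated for every node in the tier.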

## TUI

`sando` (the TUI binary) connects to `$SANDO_DAEMON` (default `http://127.0.0.1:7766`), polls `/state` every 2s, and subscribes to `/events` over WebSocket. Keybindings:

| key | action |
|-----|--------|
| ↑/↓ or j/k | select tier |
| p | `POST /promote/<selected>` (no body; version defaults to the predecessor's current version) |
| R | `POST /rollback/<selected>` |
| b | `POST /backup/fetch` |
| c | `POST /confirm/<selected>` |
| r | refresh hint (the poller already runs every 2s) |
| q / Esc / Ctrl-C | quit |

Action results show up in the events log a moment later, because the daemon emits an event for each action.
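
Because the `p` key sends no body, the daemon has to choose the version itself. A minimal sketch of that defaulting rule follows; `resolve_version` is a hypothetical name, and returning an error when neither value is available is an assumption about how the empty case is handled.

```rust
/// Picks the version a promote should deploy: the requested one if given,
/// otherwise the predecessor tier's current version.
/// Hypothetical helper; the error case is an assumption.
fn resolve_version(
    requested: Option<&str>,
    predecessor_current: Option<&str>,
) -> Result<String, String> {
    requested
        .or(predecessor_current)
        .map(str::to_owned)
        .ok_or_else(|| "no version requested and predecessor has no current_version".to_string())
}
```

With this rule an explicit `{"version": ...}` body always wins, and a bodyless promote from the TUI follows whatever the predecessor tier is currently running.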

## Hotfix flow

`POST /promote/{tier}` accepts:

- `hotfix: true`: skips only the `burn_in` gate on the predecessor tier. All
  other gates still apply.
- `reset_burn_in: true` (default `false`): additionally nulls
  `tier_state.burn_in_started_at` on the source tier, restarting the clock
  for whatever else is still burning in there. Use this only when the hotfix
  meaningfully changes the surface area under burn-in.
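
The two flags can be read as two small rules: which predecessor gates still need to be verified, and what happens to the burn-in clock. The sketch below encodes them under stated assumptions. `PromoteOpts`, `gates_to_verify`, and `burn_in_after_promote` are hypothetical names, the timestamp is modelled as an optional Unix time, and whether sandod honours `reset_burn_in` without `hotfix` is not stated above, so the sketch treats the flag on its own.

```rust
/// Optional flags of a promote request. Both default to false.
#[derive(Debug, Default, Clone, Copy)]
struct PromoteOpts {
    hotfix: bool,
    reset_burn_in: bool,
}

/// Gates on the predecessor tier that must have passed before promoting.
/// A hotfix skips `burn_in` and nothing else.
fn gates_to_verify<'a>(predecessor_gates: &[&'a str], opts: PromoteOpts) -> Vec<&'a str> {
    predecessor_gates
        .iter()
        .copied()
        .filter(|kind| !(opts.hotfix && *kind == "burn_in"))
        .collect()
}

/// New value of the source tier's `burn_in_started_at` after the promote.
/// `None` stands for SQL NULL, i.e. the clock restarts.
fn burn_in_after_promote(current: Option<u64>, opts: PromoteOpts) -> Option<u64> {
    if opts.reset_burn_in {
        None
    } else {
        current
    }
}
```

For example, with predecessor gates `boot_smoke`, `burn_in`, `manual_confirm`, a hotfix promote still verifies `boot_smoke` and `manual_confirm`.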

## v0 limitations

- `migration_dry_run` requires a scratch Postgres at `scratch_db_url`. The
  gate drops every non-system schema on every run; do not point this at
  anything that matters.

## License

MIT. The surrounding MNW monorepo is PolyForm-Noncommercial; sando is
deliberately MIT'd because it's deploy infra, not the product.