# SSH access: production server (Hetzner)

Two SSH paths into the production server, with different audiences and
different break-glass behavior. Read this before disabling either one.

## The two paths

| Port | Interface | Audience | Allowed from |
|------|-----------|----------|--------------|
| `22` | public (eth0) | mnw-cli git operations | anywhere (firewall + sshd config) |
| `2200` | Tailscale (tailscale0) | admin access | tailnet only (firewall blocks public) |

- Public :22 is intentionally open so creators can `git clone`/`push`
  over SSH against `ssh.makenot.work`. The sshd config on this port is
  locked to git-shell only; see `setup-git-ssh.sh` and `sshd-git.conf`.
  No interactive shell, no port forwarding, no admin access.
- Tailnet :2200 is the admin path. Full interactive shell, used for
  every `deploy.sh` invocation and any manual maintenance. Reachable only
  from devices on the tailnet (firewall rule `ufw allow in on tailscale0`
  in `setup-firewall.sh`).
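
The firewall half of this split can be sketched as a few `ufw` rules. This is a hypothetical outline, not the contents of `setup-firewall.sh`: only the `ufw allow in on tailscale0` rule comes from the text above. The default-deny policy, the `22/tcp` rule, the `apply_firewall_rules` function name, and the `RUN` variable are assumptions made for illustration.

```shell
# Hypothetical sketch; the real rules live in setup-firewall.sh and may differ.
# RUN defaults to "echo", so each command is printed instead of executed.
apply_firewall_rules() {
    # Assumption: everything inbound is denied unless a rule allows it.
    ${RUN:-echo} ufw default deny incoming
    # Public git path: :22 is reachable from anywhere (assumed rule form).
    ${RUN:-echo} ufw allow 22/tcp
    # Admin path: traffic arriving on the tailnet interface is allowed,
    # which covers sshd on :2200. This rule is quoted from the README.
    ${RUN:-echo} ufw allow in on tailscale0
}
```

Calling `apply_firewall_rules` prints the three commands for review; setting `RUN=sudo` would run them instead. No rule names port 2200, so in this sketch the admin port is reachable only because the tailnet interface is allowed as a whole.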

## Why this split exists

The audit-flagged risk was disabling Tailscale SSH (the admin path)
without first verifying that the public sshd was still functional. Tailscale
runs its own SSH server when configured. If that goes down (Tailscale
service crash, ACL misconfiguration, accidental `tailscale down`),
you can lose admin access entirely if you've also locked down public
sshd.

The split solves this: public :22 is always alive (the firewall allows
SSH from anywhere) but restricted to git-shell, so an attacker who
probes :22 finds nothing but git commands. Admin :2200 lives on the
tailnet, where the firewall blocks public access and the surface area
is small.

## Break-glass procedure

If the tailnet path stops working (Tailscale down, key revoked, ACL
broken):

1. Verify public sshd is up from any machine:
   `ssh -p 22 -o BatchMode=yes -o ConnectTimeout=5 root@5.78.144.244 true`
   Expect a `Permission denied` error or a git-shell refusal
   (`BatchMode=yes` suppresses interactive prompts); either proves sshd
   is listening. A timeout or "Connection refused" means public sshd is
   ALSO down and you need Hetzner Cloud Console.
2. Edit `/etc/ssh/sshd_config.d/git-shell.conf` from Hetzner Console
   to temporarily restore an interactive shell for the `root` user on
   port `22`. Match `User root` block, set `ForceCommand` to nothing.
3. Restart sshd: `systemctl restart ssh`. Test from your laptop.
4. Fix the tailnet path (re-auth with `tailscale up`, restore key, etc.).
5. Revert the sshd edit and restart `ssh` again. Don't leave the
   interactive root shell on public :22; it defeats the whole split.
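
The interpretation rule in step 1 can be sketched as a small shell helper. This is an illustrative outline under stated assumptions: the function names `classify_ssh_probe` and `probe_public_sshd` are hypothetical, and the matched error strings are the wording commonly printed by OpenSSH and git-shell, which may vary by version and configuration.

```shell
# classify_ssh_probe RC STDERR_TEXT
# Prints "listening" when the output shows sshd answered, "down" when the
# TCP connection itself failed, and "unknown" for anything else.
classify_ssh_probe() {
    rc="$1"
    stderr_text="$2"
    if [ "$rc" -eq 0 ]; then
        echo "listening"        # the remote command ran, so sshd is up
        return 0
    fi
    case "$stderr_text" in
        *"Connection refused"*|*"timed out"*)
            echo "down"         # nothing answered on the port
            return 1 ;;
        *"Permission denied"*|*"unrecognized command"*)
            # Assumption: a key rejection or a git-shell refusal is worded
            # like this; either way sshd completed a handshake.
            echo "listening"
            return 0 ;;
        *)
            echo "unknown"
            return 2 ;;
    esac
}

# probe_public_sshd HOST
# Runs the step 1 command against HOST, keeping only stderr for classification.
probe_public_sshd() {
    err="$(ssh -p 22 -o BatchMode=yes -o ConnectTimeout=5 "root@$1" true 2>&1 >/dev/null)"
    classify_ssh_probe "$?" "$err"
}
```

For example, `probe_public_sshd 5.78.144.244` would print one of the three words. A result of `down` points to Hetzner Cloud Console; `unknown` means the raw ssh error should be read by hand before going further.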

## What NOT to do

- Do not disable Tailscale SSH (`tailscale set --ssh=false`) without
  first proving public :22 is reachable and you have a working root key
  for it. Memory rule: `feedback_tailscale_ssh`. Getting locked out
  requires Hetzner Console access, which costs time we don't always
  have.
- Do not restrict public :22 to specific IPs without coordinating:
  the mnw-cli git endpoint serves users worldwide.
- Do not open port 2200 to the public. It's the admin shell.
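
The first rule above can be enforced mechanically with a guard that refuses to disable Tailscale SSH unless a probe passes. This is a minimal sketch, not an existing script in this repo: the `guard_disable_tailscale_ssh` name and the `RUN` dry-run variable are hypothetical, and the probe is whatever command you trust to exit 0 only when public :22 is reachable and your key is accepted.

```shell
# guard_disable_tailscale_ssh PROBE_CMD [ARGS...]
# Runs PROBE_CMD first; the disable command runs only if the probe exits 0.
# RUN defaults to "echo", so the disable command is printed, not executed.
guard_disable_tailscale_ssh() {
    if [ "$#" -eq 0 ]; then
        echo "usage: guard_disable_tailscale_ssh PROBE_CMD [ARGS...]" >&2
        return 2
    fi
    if "$@" >/dev/null 2>&1; then
        # Probe passed: the fallback path is proven, safe to proceed.
        ${RUN:-echo} tailscale set --ssh=false
    else
        echo "refusing: public :22 probe failed, Tailscale SSH left enabled" >&2
        return 1
    fi
}
```

With the default dry run, a passing probe prints `tailscale set --ssh=false` for review. Setting `RUN=sudo` (or another runner) would execute it instead.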

## Hetzner Console (last resort)

If both paths are dead, the Hetzner Cloud Console provides KVM-style
access to the server's serial console. Log in at
<https://console.hetzner.cloud>, select the project, select the server,
then open the "Console" tab. Slow but always available. Same root credentials.

## Key paths

- Firewall: `deploy/setup-firewall.sh`
- Public sshd (git-only): `deploy/sshd-git.conf`, `deploy/setup-git-ssh.sh`
- Admin sshd (tailnet): `/etc/ssh/sshd_config` on the server
- Deploy entry point: `deploy/deploy.sh` (uses `-p 2200`)
- CI runner setup: `deploy/setup-ci.sh` (uses `-p 2200`)