max / makenotwork

chore: move sub-project internal docs to private store 26 files across mnw-cli, multithreaded, pom, shared/* libraries, and wam: all todos, audits, competition analyses, mnw-cli cleanup, MT moderation policy, MT rollback ops doc, PoM runbook, docengine roadmap.
Author: Max J. <87768334+MaxJMath@users.noreply.github.com> · 2026-05-21 01:59 UTC
Commit: 8676b5d4f91b4786d35ab2f2c4f38a022162476b
Parent: 8f2cb75
25 files changed, +0 insertions, -2163 deletions
@@ -1,86 +0,0 @@
1 - # mnw-cli AI Anti-Pattern Cleanup
2 -
3 - Audit of mnw-cli (Rust SSH server with ratatui TUI) for silent error handling and AI-induced anti-patterns.
4 -
5 - **Summary:** 3 MEDIUM, 2 LOW. Zero HIGH. No dead code, no stubs, no string-typing, no `#[allow(dead_code)]`, no `todo!()`/`unimplemented!()`. One memory leak in render code.
6 -
7 - ## Fixes (MEDIUM)
8 -
9 - ### M1. Memory leak via `.leak()` in upload render
10 -
11 - `src/tui/upload.rs:169` — `staging::derive_title(&sf.filename).leak()` converts an owned `String` to `&'static str` by permanently leaking the allocation. Called on every render of the upload screen for every staged file without metadata. Since the result is immediately consumed by `title.to_string()` on line 188, the leak produces no benefit.
12 -
13 - **Fix:** Use owned `String` instead of `&str` reference to avoid the leak:
14 - ```rust
15 - let title = meta
16 - .and_then(|m| m.title.clone())
17 - .unwrap_or_else(|| staging::derive_title(&sf.filename));
18 - ```
19 -
20 - ### M2. Silent error response body loss in API client
21 -
22 - `src/api.rs:217,230` — `resp.text().await.unwrap_or_default()` in `json_response` and `empty_response`. If body extraction itself fails (connection reset mid-read, encoding issue), the error message becomes "HTTP 500" instead of "HTTP 500 — connection reset during body read". Same pattern as GO/BB L1.
23 -
24 - **Fix:** Replace with `.unwrap_or_else(|e| format!("[body read failed: {e}]"))`.
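
A minimal sketch of the M2 fix, written against a plain `Result` so it runs without an HTTP client. The helper names `body_or_marker` and `http_error_message` are hypothetical and are not taken from `src/api.rs`; the point is only that a failed body read stays visible in the final message.

```rust
use std::fmt::Display;

/// Keeps whatever body text was read, or substitutes a visible marker
/// describing why the read failed (instead of an empty string).
fn body_or_marker<E: Display>(body: Result<String, E>) -> String {
    body.unwrap_or_else(|e| format!("[body read failed: {e}]"))
}

/// Hypothetical helper: builds the error text shown to the user.
fn http_error_message<E: Display>(status: u16, body: Result<String, E>) -> String {
    let body = body_or_marker(body);
    if body.is_empty() {
        format!("HTTP {status}")
    } else {
        format!("HTTP {status}: {body}")
    }
}
```

With `unwrap_or_default()` the failed-read case collapses into the empty-body case; with the marker the two stay distinguishable.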
25 -
26 - ### M3. Silent staging file deletion failure after publish
27 -
28 - `src/tui/mod.rs:1956` — `tokio::fs::remove_file(file_path).await.ok()` after a successful publish. If deletion fails (permissions, file locked), the file stays in the staging directory, still counting against the staging quota. The user sees it reappear in the upload list and might accidentally re-publish (creating a duplicate item).
29 -
30 - **Fix:** Replace `.ok()` with `if let Err(e)` + `tracing::warn!` including the filename.
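
The shape of the M3 fix might look like the sketch below. It uses `std::fs` and `eprintln!` as stand-ins for `tokio::fs` and `tracing::warn!` so that it runs without extra crates; the function name `remove_staged_file` is an assumption made for illustration.

```rust
use std::fs;
use std::path::Path;

/// Removes a staged file and reports a failure instead of discarding it.
/// Returns the warning text (if any) so the caller can surface it as well.
fn remove_staged_file(path: &Path) -> Option<String> {
    match fs::remove_file(path) {
        Ok(()) => None,
        Err(e) => {
            // Include the filename so the leftover file can be found later.
            let msg = format!("failed to remove staged file {}: {e}", path.display());
            eprintln!("WARN {msg}");
            Some(msg)
        }
    }
}
```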
31 -
32 - ## Fixes (LOW)
33 -
34 - ### L1. Silent API data load failures in TUI
35 -
36 - `src/tui/mod.rs` — Eight `load_*` functions silently swallow API errors with `.unwrap_or_default()` or `.ok()`: `load_home_data` (line 1962-1975), `load_project_items` (1991), `load_staged_files` (2005), `load_blog_posts` (2053), `load_promo_codes` (2063), `load_license_keys` (2077), `load_transactions` (2107), `load_settings` (2114-2115). When the API call fails, the user sees empty data with no indication that loading failed. Note: `load_analytics` and `load_item_detail` already handle errors correctly by sending `GenericError`/`ItemActionError` payloads.
37 -
38 - **Fix:** Add `tracing::warn!` before the fallback in each function. Keep the `.unwrap_or_default()` behavior (showing empty is fine for a TUI).
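
One way to apply the L1 fix uniformly is a small generic helper that logs and then falls back to the default. This is a sketch under assumptions: `or_default_logged` is a hypothetical name, and `eprintln!` stands in for `tracing::warn!`.

```rust
use std::fmt::Display;

/// Logs the error with a label, then falls back to `T::default()`,
/// preserving the existing "show empty data" behaviour.
fn or_default_logged<T: Default, E: Display>(what: &str, result: Result<T, E>) -> T {
    result.unwrap_or_else(|e| {
        eprintln!("WARN failed to load {what}: {e}");
        T::default()
    })
}
```

Each `load_*` function could then wrap its API call, for example `or_default_logged("blog posts", result)`, without changing what the TUI renders.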
39 -
40 - ### L2. Silent git_authorize error body loss
41 -
42 - `src/api.rs:863-868` — `resp.text().await.unwrap_or_default()` followed by JSON parse with `.ok()`. If the body read fails, the error becomes a generic "HTTP 403" instead of the actual authorization error message.
43 -
44 - **Fix:** Same as M2 — replace with `.unwrap_or_else(|e| format!("[body read failed: {e}]"))`.
45 -
46 - ## Skipping (intentional design)
47 -
48 - **SSH channel cleanup (handler.rs, 14 instances):** All `let _ = handle.data/close/eof/exit_status_request/extended_data` calls in exec_request, SCP error response, and command output. These are fire-and-forget SSH protocol sequences — if the channel is already closed (client disconnected), these fail harmlessly.
49 -
50 - **Git subprocess I/O (handler.rs:330, git.rs:106,121,132-133,136-137,140-142):** `let _ = stdin.write_all(data).await` pipes SSH input to git subprocess. Read errors in stdout/stderr forwarding loops break the loop (normal EOF). JoinHandle awaits are cleanup. Exit code `.unwrap_or(1)` handles signal kills (no exit code). All correct.
51 -
52 - **TUI event channel sends (~40 instances in tui/mod.rs):** All `let _ = tx.send(AppEvent::DataLoaded(...)).await` are channel sends from background tasks to the TUI event loop. If the receiver is dropped (TUI exiting), these fail, which is expected.
53 -
54 - **AppHandle channel sends (tui/mod.rs:111,115):** `let _ = self.tx.send(AppEvent::Input/Resize).await` — fire-and-forget input forwarding. If TUI is shutting down, benign.
55 -
56 - **Session close on quit (tui/mod.rs:364):** `let _ = session_handle.close(channel_id).await` — best-effort SSH session close when user presses `q`.
57 -
58 - **Terminal resize (tui/mod.rs:428):** `let _ = terminal.resize(rect)` — ratatui terminal resize. Failure means the next render uses the old size, which is harmless.
59 -
60 - **Initialization panics (main.rs:92, api.rs:252):** `.expect()` on SIGTERM handler registration and HTTP client construction. Correct — these are process-fatal startup conditions.
61 -
62 - **Ctrl+C fallback (main.rs:99):** `ctrl_c.await.ok()` — non-Unix signal handling fallback.
63 -
64 - **Git path parsing (git.rs:34,43,51):** `.unwrap_or(path)` / `.unwrap_or(repo_name)` for stripping quotes/prefix/suffix. Correct fallback — returns unmodified input.
65 -
66 - **User input parsing (tui/mod.rs:1581,2136-2141):** `.parse().unwrap_or(0)` on discount percentage and price input. Correct — invalid user input defaults to 0.
67 -
68 - **JSON serialization (commands.rs, multiple):** `.unwrap_or_default()` on `serde_json::to_vec_pretty()`. Effectively infallible: serde_json only returns an error when a `Serialize` impl itself fails or a map has non-string keys.
69 -
70 - **Environment variable defaults (config.rs, 5 instances):** `.unwrap_or_else(|_| ...)` on env var reads with sensible defaults.
71 -
72 - **System time fallbacks (staging.rs:58,117):** `.unwrap_or(UNIX_EPOCH)` on metadata.modified(), `.unwrap_or_default()` on duration_since. Platform guarantees make failure impossible in practice.
73 -
74 - **Best-effort empty dir cleanup (staging.rs:130):** `let _ = fs::remove_dir(user_dir.path()).await` — removes empty user staging dirs during periodic cleanup. Failure is harmless.
75 -
76 - **Display-only formatting (all TUI render files):** `.unwrap_or("...")` / `.get(..10).unwrap_or(...)` on date truncation, tier labels, display names, etc. Pure display fallbacks with no state impact.
77 -
78 - **SFTP spawn (handler.rs:218-220):** `russh_sftp::server::run(stream, sftp_session).await` runs in a spawned task without error handling. Session cleanup is russh_sftp's responsibility.
79 -
80 - **RUSSH SFTP handler (sftp.rs):** Returns `StatusCode` errors to the SFTP client — correct protocol behavior, not silent swallowing.
81 -
82 - ## Verification
83 -
84 - ```sh
85 - cd ~/Code/MNW/mnw-cli && cargo check && cargo test
86 - ```
@@ -1,111 +0,0 @@
1 - # mnw-cli TODO
2 -
3 - ## Status
4 - Done: Phases 1-8, Git proxy A-D, UX audit (8/8), all remaining features. Deployed 2026-05-05. Active: None. Next: Post-beta features.
5 -
6 - ---
7 -
8 - ## Deferred (Post-Beta)
9 -
10 - - [ ] TUI tag management screen (currently inline on item detail only)
11 - - [ ] TUI tier creation (currently read-only — tier creation requires Stripe)
12 - - [ ] Collection create/delete from TUI (API wired, TUI screen not built)
13 -
14 - ---
15 -
16 - ## OTA Publish Subcommand (Post-Beta)
17 -
18 - Replace `MNW/server/deploy/ota-publish.sh` with a typed Rust implementation. The bash script shells out to `curl` + `python3 -c "import json"` per artifact, which is fragile and not testable. Three apps (GO, BB, AF) will each publish ~9 artifacts per release; that's a lot of glue to maintain across repos.
19 -
20 - ### Goal
21 -
22 - One command:
23 - ```
24 - mnw ota publish --app goingson --version 0.3.1 [--dist ./dist] [--notes "..."]
25 - ```
26 - Discovers artifacts in the dist directory, reads Tauri signatures from `latest.json`, authenticates, and publishes every platform/arch artifact for the given version.
27 -
28 - ### Architecture
29 -
30 - **Layer 1 — `synckit-client::ota` module** (`MNW/shared/synckit-client/src/client/ota.rs`)
31 -
32 - Typed client wrapping the server's existing OTA endpoints. ~200 LOC + tests.
33 -
34 - ```rust
35 - pub struct OtaClient { /* reuses synckit auth + http */ }
36 -
37 - pub struct ReleaseManifest {
38 - pub app_slug: String,
39 - pub version: String,
40 - pub notes: Option<String>,
41 - pub artifacts: Vec<ArtifactEntry>,
42 - }
43 -
44 - pub struct ArtifactEntry {
45 - pub target: Target, // Linux | Darwin | Windows
46 - pub arch: Arch, // X86_64 | Aarch64
47 - pub path: PathBuf,
48 - pub signature: Option<String>, // Tauri minisign
49 - }
50 -
51 - impl OtaClient {
52 - pub async fn create_release(&self, app_id, version, notes, signature) -> Result<ReleaseId>;
53 - pub async fn register_artifact(&self, release_id, target, arch, size) -> Result<UploadUrl>;
54 - pub async fn upload(&self, upload_url, bytes) -> Result<()>;
55 - pub async fn verify_updater(&self, slug, target, arch) -> Result<bool>;
56 - /// High-level: takes a manifest, does the full publish loop with retry/resume.
57 - pub async fn publish(&self, manifest: ReleaseManifest) -> Result<PublishReport>;
58 - }
59 - ```
60 -
61 - Tests mock the HTTP layer (existing pattern in `client/sync.rs`). Cross-reference: `synckit-client/docs/todo.md`.
62 -
63 - **Layer 2 — `mnw-cli ota publish` subcommand** (`MNW/mnw-cli/src/commands.rs` + new `src/ota.rs`)
64 -
65 - ~150 LOC. Responsibilities:
66 -
67 - - Argument parsing (`--app`, `--version`, `--dist`, `--notes`, `--dry-run`)
68 - - **Artifact discovery** by convention:
69 - - `dist/{App}_{version}_x64.msi`, `_x64-setup.exe`, `_aarch64.dmg`, `_x86_64.AppImage`, `_aarch64.AppImage`, `_amd64.deb`, etc.
70 - - Or read from a `release.toml` manifest in the app repo (preferred for explicitness)
71 - - **Signature extraction** from Tauri's `latest.json` per platform (when present)
72 - - **Auth** via OS keyring (already used elsewhere in mnw-cli) with env-var fallback (`MNW_OTA_EMAIL/PASSWORD/API_KEY`)
73 - - **Progress UI** — line per artifact: `goingson 0.3.1 linux/x86_64 [====> ] 42 MB / 78 MB`
74 - - **Dry-run mode** — list what would be published, no network calls
75 - - **Idempotency** — re-running after a partial failure should skip already-uploaded artifacts (server returns 409 on duplicate artifact registration; treat as success)
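
The discovery-by-convention and idempotency rules in the list above can be sketched as follows. The `Target` and `Arch` enums mirror the names in the Layer 1 outline, but their definitions, the `SUFFIXES` table, and both function names are assumptions made for illustration, not the planned implementation.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Target { Linux, Darwin, Windows }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Arch { X86_64, Aarch64 }

/// Suffix table built from the naming convention listed above.
const SUFFIXES: &[(&str, Target, Arch)] = &[
    ("_x64.msi", Target::Windows, Arch::X86_64),
    ("_x64-setup.exe", Target::Windows, Arch::X86_64),
    ("_aarch64.dmg", Target::Darwin, Arch::Aarch64),
    ("_x86_64.AppImage", Target::Linux, Arch::X86_64),
    ("_aarch64.AppImage", Target::Linux, Arch::Aarch64),
    ("_amd64.deb", Target::Linux, Arch::X86_64),
];

/// Returns the platform/arch for a file named `{app}_{version}{suffix}`,
/// or `None` if the name does not follow the convention.
fn classify_artifact(file_name: &str, app: &str, version: &str) -> Option<(Target, Arch)> {
    let prefix = format!("{app}_{version}");
    let rest = file_name.strip_prefix(prefix.as_str())?;
    SUFFIXES
        .iter()
        .find(|(suffix, _, _)| rest == *suffix)
        .map(|&(_, target, arch)| (target, arch))
}

/// Idempotency rule: a 409 on artifact registration means the artifact
/// already exists, so a re-run treats it as success.
fn registration_succeeded(status: u16) -> bool {
    (200..300).contains(&status) || status == 409
}
```

Matching the exact `{app}_{version}` prefix means artifacts left over from an older version in the same dist directory are ignored rather than republished.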
76 -
77 - **Layer 3 — Per-app integration** (optional)
78 -
79 - Each app's `release.sh` (or a `cargo xtask release`) calls `mnw ota publish --app <slug> --version <version>`, with `<version>` read from the app's `Cargo.toml`. No per-app Rust code needed.
80 -
81 - ### Cutover plan
82 -
83 - 1. Build `synckit-client::ota` with full test coverage. Server endpoints already exist — no server changes needed.
84 - 2. Build `mnw-cli ota publish` subcommand.
85 - 3. Verify against staging: publish a test release for `goingson` 0.3.1, check it appears in updater endpoint.
86 - 4. Delete `MNW/server/deploy/ota-publish.sh`.
87 - 5. Update each app's `docs/deploy.md` to reference `mnw ota publish`.
88 -
89 - ### Trigger / priority
90 -
91 - **Not blocking soft launch** — testers download from `~/Dist` directly during beta; no OTA updates planned in the first week. Build this when GO needs its first post-launch update, which is also when bugs surfaced by testers need to ship to them quickly.
92 -
93 - ### Estimate
94 -
95 - ~1 focused day total:
96 - - `synckit-client::ota` module + tests: ~4 hours
97 - - `mnw-cli ota publish` subcommand + discovery + progress UI: ~3 hours
98 - - Cutover + per-app docs update: ~1 hour
99 -
100 - ---
101 -
102 - ## Key Paths
103 - ```
104 - mnw-cli/src/
105 - main.rs, config.rs, api.rs, commands.rs, format.rs, staging.rs
106 - ssh/ (mod.rs, handler.rs, terminal.rs, sftp.rs, git.rs)
107 - tui/ (mod.rs, input.rs, loading.rs, home.rs, project.rs, upload.rs,
108 - item.rs, analytics.rs, blog.rs, promo.rs, keys.rs, settings.rs, widgets.rs)
109 - mnw-cli/deploy/ (deploy.sh, mnw-cli.service)
110 - docs/mnw/server/cli.md (design doc)
111 - ```
@@ -1,146 +0,0 @@
1 - # Rollback Guide — Multithreaded
2 -
3 - ## Quick Rollback (Re-deploy Previous Binary)
4 -
5 - MT is deployed to two targets: Hetzner (production) and Astra (staging). Rollback means rebuilding and redeploying a previous commit.
6 -
7 - ### Steps (Hetzner — production)
8 -
9 - 1. **Identify the last known-good commit:**
10 - ```bash
11 - cd MNW/multithreaded
12 - git log --oneline -10
13 - ```
14 -
15 - 2. **Check out and build the previous version:**
16 - ```bash
17 - git checkout <commit-hash>
18 - cargo zigbuild --release --target x86_64-unknown-linux-gnu
19 - ```
20 -
21 - 3. **Deploy the rollback binary:**
22 - ```bash
23 - ssh root@100.120.174.96 "systemctl stop multithreaded || true"
24 - scp target/x86_64-unknown-linux-gnu/release/multithreaded root@100.120.174.96:/opt/multithreaded/multithreaded
25 - ssh root@100.120.174.96 "chmod +x /opt/multithreaded/multithreaded && chown multithreaded:multithreaded /opt/multithreaded/multithreaded"
26 - ssh root@100.120.174.96 "systemctl start multithreaded"
27 - ```
28 -
29 - 4. **Verify:**
30 - ```bash
31 - ssh root@100.120.174.96 "systemctl status multithreaded --no-pager"
32 - ssh root@100.120.174.96 "curl -s -o /dev/null -w 'HTTP %{http_code}\n' http://127.0.0.1:3400"
33 - ```
34 -
35 - 5. **Return to main branch:**
36 - ```bash
37 - git checkout main
38 - ```
39 -
40 - ### Steps (Astra — staging)
41 -
42 - Astra builds natively (aarch64). The deploy script rsyncs source and builds on Astra.
43 -
44 - 1. **Check out the previous commit locally:**
45 - ```bash
46 - git checkout <commit-hash>
47 - ```
48 -
49 - 2. **Re-deploy using the normal deploy script:**
50 - ```bash
51 - ./deploy/deploy.sh
52 - ```
53 - This rsyncs the checked-out source to Astra, builds there, and restarts.
54 -
55 - 3. **Return to main branch:**
56 - ```bash
57 - git checkout main
58 - ```
59 -
60 - ## Emergency Stop
61 -
62 - ```bash
63 - # Hetzner (production)
64 - ssh root@100.120.174.96 "systemctl stop multithreaded"
65 -
66 - # Astra (staging)
67 - ssh max@100.106.221.39 "sudo systemctl stop multithreaded"
68 - ```
69 -
70 - To restart:
71 - ```bash
72 - ssh root@100.120.174.96 "systemctl start multithreaded"
73 - ssh max@100.106.221.39 "sudo systemctl start multithreaded"
74 - ```
75 -
76 - ## Database Restore
77 -
78 - MT does not have automated backups. For a manual backup/restore:
79 -
80 - ### Create a manual backup
81 -
82 - ```bash
83 - # Hetzner
84 - ssh root@100.120.174.96 "sudo -u multithreaded pg_dump multithreaded | gzip > /opt/multithreaded/mt-backup-$(date +%Y%m%d).sql.gz"
85 -
86 - # Astra
87 - ssh max@100.106.221.39 "pg_dump multithreaded | gzip > /tmp/mt-backup-$(date +%Y%m%d).sql.gz"
88 - ```
89 -
90 - ### Restore from backup
91 -
92 - 1. **Stop the application:**
93 - ```bash
94 - ssh root@100.120.174.96 "systemctl stop multithreaded"
95 - ```
96 -
97 - 2. **Back up current state:**
98 - ```bash
99 - ssh root@100.120.174.96 "sudo -u multithreaded pg_dump multithreaded | gzip > /opt/multithreaded/mt-pre-restore.sql.gz"
100 - ```
101 -
102 - 3. **Drop and recreate:**
103 - ```bash
104 - ssh root@100.120.174.96 "sudo -u postgres psql -c 'DROP DATABASE multithreaded;'"
105 - ssh root@100.120.174.96 "sudo -u postgres psql -c \"CREATE DATABASE multithreaded OWNER multithreaded;\""
106 - ```
107 -
108 - 4. **Restore:**
109 - ```bash
110 - ssh root@100.120.174.96 "gunzip -c /opt/multithreaded/mt-backup-YYYYMMDD.sql.gz | sudo -u multithreaded psql multithreaded"
111 - ```
112 -
113 - 5. **Restart** (migrations auto-apply on boot):
114 - ```bash
115 - ssh root@100.120.174.96 "systemctl start multithreaded"
116 - ```
117 -
118 - ## Service Architecture Reference
119 -
120 - ### Hetzner (production)
121 -
122 - - **Binary**: `/opt/multithreaded/multithreaded`
123 - - **Config**: `/opt/multithreaded/.env`
124 - - **Static**: `/opt/multithreaded/static/`
125 - - **Migrations**: `/opt/multithreaded/migrations/`
126 - - **Systemd unit**: `/etc/systemd/system/multithreaded.service`
127 - - **Logs**: `journalctl -u multithreaded -f`
128 - - **Port**: 127.0.0.1:3400 (Caddy reverse proxies `forums.makenot.work`)
129 - - **DB**: PostgreSQL `multithreaded` database, `multithreaded` user (peer auth)
130 - - **Domain**: `forums.makenot.work` (Cloudflare-proxied)
131 -
132 - ### Astra (staging)
133 -
134 - - **Binary**: `/opt/multithreaded/multithreaded`
135 - - **Config**: `/opt/multithreaded/.env`
136 - - **Source**: `~/src/multithreaded/` (rsynced from local)
137 - - **Shared deps**: `~/src/shared/` (docengine, tagtree, s3-storage)
138 - - **Port**: 0.0.0.0:3400 (direct access via Tailscale)
139 - - **DB**: PostgreSQL `multithreaded` database
140 -
141 - ### Common
142 -
143 - - **Restart policy**: `Restart=always`, `RestartSec=5`
144 - - **Depends on**: `postgresql.service`
145 - - **Migrations**: auto-applied on boot (`sqlx::migrate!()`)
146 - - **Memory limit**: 512M (`MemoryMax=512M`)
@@ -1,80 +0,0 @@
1 - # Multithreaded -- Audit History
2 -
3 - Full chronological audit log. See [audit_review.md](./audit_review.md) for current state.
4 -
5 - ## Changes Since Last Audit
6 -
7 - ### Tenth formal audit (2026-04-30, Run 17 cross-project)
8 - - **Test count:** 228 (58 unit + 170 integration). 0 clippy warnings. 0 failures.
9 - - **Grade:** A (maintained). v0.3.4. ~9,928 LOC.
10 - - **Scorecard changes:** Performance A- -> A (batch queries for all N+1 paths confirmed). Documentation B+ -> A- (module-level `//!` docs on all significant files). Observability B+ -> A (`#[instrument]` on every handler and DB function). Codebase Size B+ -> A (9,928 LOC is lean for a full forum). Migration Safety A+ -> A- (downgrade: early migrations 001-007 lack IF NOT EXISTS).
11 - - **Cold spots (2):** seed.rs (1,032 LOC, B+) and early migrations 001-007 (lack IF NOT EXISTS, B+).
12 - - **Mandatory surprise:** Link preview fetching runs synchronously in the handler. Posts with multiple slow URLs could block for up to 15s. Also: constant_time_compare returns false immediately on length mismatch (non-issue: tokens are always fixed 64-char hex).
13 - - **All 13 previously resolved items verified intact.** No regressions.
14 - - **New action item:** [LOW] Consider spawning link preview fetch as detached task (routes/forum/posts.rs).
15 -
16 - ### Seventh formal audit (2026-03-28, Run 12 cross-project)
17 - - **Test count:** 225 (35 unit lib + 190 integration). 0 clippy warnings. 0 failures.
18 - - **Grade:** A (maintained). v0.3.2.
19 - - **Internal API improvements:** MNW category auto-provisioning (Items, Blog, Devlog, Discussion) via internal API with shared secret auth.
20 - - **Link preview fix:** Corrected URL extraction edge case.
21 - - **New dependency advisories (action items):**
22 - - aws-lc-sys 0.38.0 (RUSTSEC-2026-0044 + -0048, severity 7.4 HIGH) — upgrade to 0.39.0 via `cargo update -p aws-lc-sys`
23 - - rustls-webpki 0.103.9 (RUSTSEC-2026-0049) — upgrade to 0.103.10 via `cargo update -p rustls-webpki`
24 - - **Mandatory surprise:** None new. Previous surprises (CoreError dead code, link_preview IPv6 blocking) both resolved.
25 - - **No new code findings.** All previous items remain resolved.
26 - - **Note:** Test count 225 is lower than previous 249 — mt-core (16) and mt-db (11) unit tests may not have been captured in this run. Integration tests grew from 187 to 190.
27 -
28 - ### Test coverage expansion (2026-03-22)
29 - - **Test count:** 222 -> 249 (+27 tests). 0 clippy warnings.
30 - - **Grade:** A (maintained). Testing A- -> A. Three cold spots resolved.
31 - - **auth.rs:** 3 -> 8 integration tests (+5). PKCE params, state nonce validation (3 paths), suspended user behavior.
32 - - **admin.rs:** 6 -> 10 integration tests (+4). Search, invalid UUID handling, mod_log entry creation, non-admin access denial.
33 - - **mutations.rs:** New test file with 18 integration tests. Covers: cleanup_expired_bans, ban upserts, swap_category_order, get_category_id_by_slugs, update_category, ensure_membership idempotency, soft_delete, create_post activity bump, toggle_endorsement, insert_flag idempotency, remove_image, link_preview dedup, mentions dedup, upsert_user.
34 - - **seed.rs:** Type safety improved — raw `&str` role params replaced with `CommunityRole` enum (B -> A-).
35 - - **Module heatmap updates:** auth.rs Test B- -> A-, admin.rs Test B -> A-, mutations.rs Test B -> A-, seed.rs Code B+ -> A- / Type Safety B -> A-.
36 -
37 - ### Fifth formal audit (2026-03-18, Run 9 cross-project)
38 - - **Test count:** 222 (unchanged). 0 clippy warnings.
39 - - **Grade:** A (maintained). v0.3.1 (deployed 2026-03-18).
40 - - **No new findings requiring action.**
41 - - **Observations (pre-existing, not regressions):**
42 - - ~~`deletion_task.abort()` in main.rs without awaiting completion~~ — Fixed: now awaits task completion after abort.
43 - - Inline `onsubmit` confirmation dialogs in thread.html — not screen-reader friendly. Impact: LOW, functional but not best-practice.
44 - - ~~No client-side maxlength on textarea inputs~~ — Fixed: maxlength added to all inputs/textareas. Server-side limits added for flag detail and ban/mute reason (1024 bytes).
45 - - **Mandatory surprise:** URL validation in link_preview.rs blocks IPv4-mapped IPv6 addresses via host_part parsing, but IPv6 full range check uses string prefix match for unique local addresses. Intentionally restrictive (good for SSRF) — not a vulnerability.
46 -
47 - ### Phases 19 + 20 implementation (2026-03-16)
48 - - **Test count:** 146 -> 173 (+27 tests: 19 unit + 7 integration + 1 workflow mod)
49 - - **Grade:** A (maintained). Phases 19 (@Mentions) and 20 (Link Previews) implemented.
50 - - **Source LOC:** ~7,000 (up from 6,232)
51 - - **Migrations:** 12 -> 17 (013 flagging, 014 tags, 015 tracking, 016 post_mentions, 017 link_previews)
52 - - **New files:** `src/link_preview.rs` (URL extraction + OG fetch), `tests/workflows/mentions.rs` (4 tests), `tests/workflows/link_previews.rs` (3 tests)
53 - - **New DB functions:** `resolve_usernames_in_community`, `insert_mentions`, `list_link_previews_for_posts`, `insert_link_preview`
54 - - **Markdown:** `extract_mention_usernames`, `resolve_mentions` with code-span awareness
55 - - **Zero clippy warnings, all 173 tests passing.**
56 -
57 - ### Second formal audit (2026-03-16, Run 6 cross-project)
58 - - **Test count:** 106 -> 146 (+40 tests)
59 - - **Grade:** A (maintained). Phases 14, 15, and 21 implemented since last audit.
60 - - **Source LOC:** 6,232 (up from ~4,800)
61 - - **Migrations:** 10 -> 12 (post_footnotes, post_endorsements)
62 - - **Instrument coverage:** 109/110 (99%) — near-perfect
63 - - **New finding (LOW):** Regex compiled per-request in verify_quotes/post_process_quotes for SHA-256 hash pattern matching. Should use LazyLock.
64 - - **Structure note:** forum.rs (969 LOC) was split into a forum/ directory module: views.rs (510) + actions.rs (480).
65 - - **Mandatory surprise:** Per-request regex in quote verification — LOW (functional but inefficient).
66 - - **Previous items verified:** All previous remediated items confirmed intact.
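
The `LazyLock` suggestion in the LOW finding above can be illustrated with the standard library alone. In this sketch `HashPattern` is a hypothetical stand-in for a compiled regex (the real code would hold a `Regex` value), and `BUILD_COUNT` exists only to show that construction happens once rather than per request.

```rust
use std::sync::LazyLock;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts how many times the pattern is built (for demonstration only).
static BUILD_COUNT: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for an expensive-to-build matcher such as a compiled regex.
struct HashPattern {
    len: usize,
}

impl HashPattern {
    fn build() -> Self {
        BUILD_COUNT.fetch_add(1, Ordering::SeqCst);
        HashPattern { len: 64 }
    }

    /// Matches a 64-character hex string, the textual form of a SHA-256 hash.
    fn is_match(&self, s: &str) -> bool {
        s.len() == self.len && s.bytes().all(|b| b.is_ascii_hexdigit())
    }
}

/// Built on first use, then shared by every later call.
static SHA256_HEX: LazyLock<HashPattern> = LazyLock::new(|| HashPattern::build());

fn looks_like_sha256(s: &str) -> bool {
    SHA256_HEX.is_match(s)
}
```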
67 -
68 - ### First formal audit (2026-03-14)
69 - - **Grade:** B+ (unchanged from baseline, but now backed by per-module code review)
70 - - **Baseline was optimistic on:** Security (A- -> B+: javascript: XSS found, fail-open patterns found), Type Safety (A- -> B+: domain types confirmed unused), Observability (B -> C: zero #[instrument] is worse than "no annotations yet")
71 - - **Baseline was pessimistic on:** Performance (B -> A-: proper composite indexes, partial indexes, no N+1)
72 - - **Test count confirmed:** 90 (documented 72 was wrong: 56 integration + 18 unit markdown/csrf + 16 unit mt-core)
73 - - **New findings:** 1 HIGH (javascript: XSS), 4 MEDIUM (secure cookie, transaction, fail-open, observability), 5 SMALL
74 -
75 - ### Full remediation (2026-03-14)
76 - - **Grade:** B+ -> A- (all 10 findings resolved, grade capped by git hygiene)
77 - - **Tests:** 90 -> 97 (+7 markdown security tests)
78 - - **Files:** 36 -> 33 (deleted error.rs, models.rs, pool.rs)
79 - - **Cold spots:** 7 -> 3 (resolved: markdown XSS, observability, dead code, dead docs x2)
80 - - **Key changes:** URL scheme allowlist sanitization, 86 `#[instrument(skip_all)]`, fail-closed access checks, transaction wrapping, configurable Secure cookie, dead code + deps removed, mod log error logging, `.env.example` expanded
@@ -1,129 +0,0 @@
1 - # Multithreaded -- Code Audit Review
2 -
3 - **Last audited:** 2026-04-30 (tenth audit, Run 17 cross-project)
4 - **Previous audit:** 2026-04-18 (ninth audit, Run 15 cross-project)
5 -
6 - ## Overall Grade: A
7 -
8 - Run 17 cross-project audit (remediated 2026-05-01). 228 tests (58 unit + 170 integration, all pass). 0 clippy warnings. v0.3.4. ~9,928 LOC. All 13 previously resolved items verified intact, no regressions. Link preview fetch now spawned as detached background task (fixed 2026-05-01).
9 -
10 - ## Scorecard
11 -
12 - | Dimension | Grade | Notes |
13 - |-----------|:-----:|-------|
14 - | Code Quality | A | Zero clippy warnings. Consistent `map_err` + `tracing::error!` error handling. Mod log failures logged. |
15 - | Architecture | A+ | Clean 3-crate workspace: mt-core (time formatting), mt-db (queries/mutations), main app (routes, auth, templates). Route module properly split. Template layer uses view-model structs. |
16 - | Testing | A | 228 tests (58 unit + 170 integration, all pass) at ~23 tests/KLOC. Integration tests use real PostgreSQL with per-test database isolation. Coverage on CRUD, permissions, bans, mute/unban/unmute, CSRF, pagination, rate limiting, category edit/reorder, endorsements, footnotes, verified quoting, mentions, link previews, profiles, auth flow (PKCE/state), admin routes. |
17 - | Security | A | All SQL parameterized. CSRF with constant-time comparison. OAuth PKCE with state nonce. Markdown via docengine (URL scheme allowlist + HTML sanitization). Fail-closed access checks. Link previews only store title/description, not og:image. |
18 - | Performance | A | Proper indexes on all query patterns. Partial index on ban expiration. Batch queries for all N+1 paths confirmed. Per-IP rate limiting. |
19 - | Documentation | A- | Module-level `//!` docs on all significant files. `.env.example` documents all env vars. Some view-model structs lack doc comments. |
20 - | Dependencies | A | Minimal deps, all justified. Rust 2024 edition. Dead deps removed. Workspace dependency management. |
21 - | Frontend | A | HTMX for dynamic interactions. Askama autoescaping. CSRF auto-injected. Toast uses `textContent`. `body_html` sanitized by docengine. |
22 - | Type Safety | A+ | Query layer uses focused `FromRow` projections. Clean domain types. |
23 - | Observability | A | `#[instrument]` on every handler and every DB function. `tracing-subscriber` with EnvFilter. |
24 - | Concurrency | A | Async throughout with tokio. Graceful shutdown. reqwest timeouts. `swap_category_order` uses transaction. Per-IP rate limiting. |
25 - | Resilience | A | Graceful shutdown. HTTP client timeouts. Error logging without panics. Rate limiting. MNW API calls retry with backoff on network/5xx errors. |
26 - | API Consistency | A | Consistent redirect-with-toast pattern. Proper status codes (403/404/422). Health endpoint returns JSON. |
27 - | Migration Safety | A- | SQLx `migrate!()` with sequential numbering. Later migrations use IF NOT EXISTS. Early migrations (001-007) lack it. Protected by sqlx tracking. |
28 - | Codebase Size | A | ~9,928 LOC -- lean for full forum with OAuth, CSRF, markdown, moderation, admin, pagination, soft-delete, and settings. |
29 -
30 - ## Module Heatmap
31 -
32 - | Module | Code | Arch | Test | Security | Perf | Docs | Type Safety | Observability | Concurrency | Resilience |
33 - |--------|:----:|:----:|:----:|:--------:|:----:|:----:|:-----------:|:-------------:|:-----------:|:----------:|
34 - | main.rs | A- | A | - | A- | A | A- | A | A | A | A |
35 - | config.rs | A | A | - | A | - | A | A | - | - | - |
36 - | auth.rs | A- | A | A- | A- | A | A | A | A | A | B+ |
37 - | csrf.rs | A | A | A | A+ | A | A | A | A | - | - |
38 - | (docengine) | A | A | A | A+ | A | A | A | - | - | - |
39 - | seed.rs | A- | A | - | A | - | A | A- | - | - | - |
40 - | routes/mod.rs | A | A | - | A- | A | A | A | A | A- | A- |
41 - | routes/forum.rs | A | A | B+ | A- | A | A | A | A | A | A- |
42 - | routes/moderation.rs | A | A | A- | A | A | A | A | A | A | A- |
43 - | routes/settings.rs | A | A | A- | A | A | A | A | A | A- | A- |
44 - | routes/admin.rs | A | A | A- | A | A | A | A | A | A | A |
45 - | templates/ | A | A | A | A- | - | A | A | - | - | - |
46 - | mt-core/time_format.rs | A | A | A | - | A | A | A | - | - | - |
47 - | mt-db/queries.rs | A | A | B+ | A+ | A | A | A | A | A | A |
48 - | mt-db/mutations.rs | A | A | A- | A+ | A | A | A | A | A- | A |
49 -
50 - ### Cold Spots
51 -
52 - 1. **seed.rs (1,032 LOC) -- B+**: Large for seed data. Content is substantive but could be externalized to TOML/JSON.
53 - 2. **Early migrations (001-007) -- B+**: Lack IF NOT EXISTS guards. Mitigated by sqlx migration tracking.
54 -
55 - ## Strengths
56 -
57 - - **Clean architecture.** 3-crate workspace with proper separation. Route module split into focused files. Template layer uses view-model structs.
58 - - **Comprehensive CSRF.** Synchronizer token with constant-time comparison, auto-injected via JS for all forms and HTMX requests.
59 - - **Solid test infrastructure.** Full Axum app with real PostgreSQL per test. Cookie-aware client with automatic CSRF token extraction. 228 tests all passing.
60 - - **Authorization hierarchy.** Owner > mod > member correctly enforced. Owners cannot be banned. Only owners can ban mods.
61 - - **Input validation.** Length limits on all user content. Slug format validation. UUID parsing validated. Sort/order whitelisted.
62 - - **SQL safety.** All 40+ queries parameterized. Dynamic ORDER BY uses whitelist match.
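
The "dynamic ORDER BY uses whitelist match" point can be sketched like this. The sort keys and column names are hypothetical; the idea is that user input only selects among fixed SQL fragments and is never interpolated into the query.

```rust
/// Maps untrusted sort/order parameters onto fixed SQL fragments.
/// Anything unrecognised falls through to the default ordering.
fn order_by_clause(sort: &str, order: &str) -> &'static str {
    match (sort, order) {
        ("created", "asc") => "ORDER BY created_at ASC",
        ("created", _) => "ORDER BY created_at DESC",
        ("activity", "asc") => "ORDER BY last_activity_at ASC",
        _ => "ORDER BY last_activity_at DESC",
    }
}
```

Because the return type is `&'static str`, the compiler guarantees that no request-derived string can end up in the clause.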
63 -
64 - ## Weaknesses
65 -
66 - - ~~**No retry on MNW API calls.**~~ Already implemented (auth.rs:210-316). Both token exchange and userinfo retry with 500ms/1000ms backoff.
67 - - ~~**Health endpoint doesn't verify DB.**~~ Fixed 2026-04-22. Now runs `SELECT 1` and reports degraded status on failure.
68 - - ~~**OG image URL not validated.**~~ Incorrect finding. fetch_og_metadata only returns og:title and og:description. No image URLs stored or rendered.
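
For reference, a retry loop with a fixed backoff schedule (such as the 500ms/1000ms mentioned above) might be structured as below. This is a synchronous, std-only sketch and not the code in auth.rs: `retry_with_backoff` is a hypothetical name, the sleep function is injected so the logic can be exercised without waiting, and it retries on any error, whereas the scorecard describes retrying only on network/5xx errors.

```rust
use std::time::Duration;

/// Calls `op` until it succeeds or the delay schedule is exhausted.
/// With N delays, `op` runs at most N + 1 times.
fn retry_with_backoff<T, E>(
    delays: &[Duration],
    mut sleep: impl FnMut(Duration),
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut delays = delays.iter();
    loop {
        match op() {
            Ok(value) => return Ok(value),
            Err(e) => match delays.next() {
                // Wait for the next scheduled delay, then try again.
                Some(delay) => sleep(*delay),
                // No delays left: surface the last error.
                None => return Err(e),
            },
        }
    }
}
```

In production code the `sleep` argument would be `std::thread::sleep` (or an async equivalent), and the closure would classify errors before deciding to retry.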
69 -
70 - ## Mandatory Surprise
71 -
72 - **Link preview fetching blocks post creation**
73 -
74 - ~~Link preview fetching runs synchronously in the handler.~~ Fixed 2026-05-01. `spawn_link_preview_fetch` now runs as a detached `tokio::spawn` task. Post creation returns immediately; previews appear asynchronously.
75 -
76 - Also notable: The `constant_time_compare` in csrf.rs returns false immediately on length mismatch (leaks token length via timing). In practice this is a non-issue because CSRF tokens are always 64-char hex (fixed length).
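
For comparison, a comparison routine that avoids the early return on length mismatch could look like the sketch below. It is not the code in csrf.rs; `constant_time_eq` is a hypothetical name. The loop length still depends on the longer input, and a compiler is free to optimise source-level constant-time code, so a vetted crate such as `subtle` is the usual choice when this matters.

```rust
/// Compares two byte strings without returning early, either on the first
/// differing byte or on a length mismatch.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    // Fold the length difference into the accumulator instead of branching on it.
    let mut diff = a.len() ^ b.len();
    // Walk the longer length; missing bytes on the shorter side read as 0.
    for i in 0..a.len().max(b.len()) {
        let x = a.get(i).copied().unwrap_or(0);
        let y = b.get(i).copied().unwrap_or(0);
        diff |= (x ^ y) as usize;
    }
    diff == 0
}
```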
77 -
78 - ### Previous Surprises
79 -
80 - - **Run 15:** ~~Link preview image URL vulnerability~~ -- Incorrect finding. Code never fetches og:image.
81 - - **Run 12:** ~~`CoreError` dead code~~ -- Resolved. Dead code removed.
82 -
83 - ## Action Items
84 -
85 - Filed in `docs/mnw/mt/todo.md`.
86 -
87 - ### All resolved (previous audits)
88 - 1. ~~**[HIGH]** Sanitize URL schemes in markdown rendering~~ -- Done.
89 - 2. ~~**[MEDIUM]** Add `#[instrument(skip_all)]` to all route handlers and DB functions~~ -- Done.
90 - 3. ~~**[MEDIUM]** Make session cookie `Secure` flag configurable~~ -- Done.
91 - 4. ~~**[MEDIUM]** Wrap `swap_category_order` in transaction~~ -- Done.
92 - 5. ~~**[MEDIUM]** Change fail-open access checks to fail-closed~~ -- Done.
93 - 6. ~~**[SMALL]** Add `//!` module docs~~ -- Done.
94 - 7. ~~**[SMALL]** Remove dead code~~ -- Done.
95 - 8. ~~**[SMALL]** Log mod log insert failures~~ -- Done.
96 - 9. ~~**[SMALL]** Expand `.env.example`~~ -- Done.
97 - 10. ~~**[SMALL]** Initial git commit + configure remotes~~ -- Done.
98 -
99 - ### Run 15 (2026-04-18, corrected 2026-04-22)
100 - 11. ~~**[MEDIUM]** Validate og:image URL scheme in link preview~~ -- Incorrect finding. Code never fetches og:image.
101 - 12. ~~**[MEDIUM]** Add DB connectivity check to health endpoint~~ -- Fixed 2026-04-22.
102 - 13. ~~**[LOW]** Add retry logic to MNW OAuth API calls~~ -- Already implemented (auth.rs:210-316).
103 -
104 - ### Run 17 (2026-04-30)
105 - 14. ~~**[LOW]** Spawn link preview fetch as detached task~~ -- Done 2026-05-01.
106 -
107 - ## Metrics Over Time
108 -
109 - | Audit Date | LOC | Rust Files | Tests | Tests/KLOC | Clippy Warnings | Cold Spots | Overall |
110 - |------------|-----|-----------|-------|-----------|----------------|------------|---------|
111 - | 2026-03-14 | 4,808 | 36 | 90 | 18.7 | 0 | 7 | B+ |
112 - | 2026-03-14 (remediation) | ~4,600 | 33 | 97 | ~21 | 0 | 3 | A- |
113 - | 2026-03-14 (rate limit) | ~4,700 | 34 | 99 | ~21 | 0 | 3 | A- |
114 - | 2026-03-14 (coverage) | ~4,800 | 34 | 106 | ~22 | 0 | 1 | A |
115 - | 2026-03-14 (ammonia) | ~4,800 | 34 | 106 | ~22 | 0 | 0 | A |
116 - | 2026-03-16 (Run 6) | 6,232 | ~36 | 146 | ~23 | 0 | 0 | A |
117 - | 2026-03-16 (P19+P20) | ~7,000 | ~38 | 173 | ~25 | 0 | 0 | A |
118 - | 2026-03-17 (Run 8) | ~7,000 | ~38 | 222 | ~32 | 0 | 0 | A |
119 - | 2026-03-18 (Run 9) | ~7,000 | ~38 | 222 | ~32 | 0 | 0 | A |
120 - | 2026-03-22 (coverage) | ~7,000 | ~39 | 249 | ~36 | 0 | 0 | A |
121 - | 2026-03-28 (Run 12) | ~7,200 | ~39 | 225+ | ~32 | 0 | 0 | A |
122 - | 2026-04-15 (Run 14) | ~9,752 | -- | 231 | ~24 | 0 | 0 | A |
123 - | 2026-04-18 (Run 15) | ~9,752 | -- | 231 | ~24 | 0 | 3 | A- |
124 - | 2026-04-22 (Run 15 corrected) | ~9,752 | -- | 232 | ~24 | 0 | 0 | A |
125 - | 2026-04-30 (Run 17) | ~9,928 | -- | 228 | ~23 | 0 | 2 | A |
126 -
127 - ---
128 -
129 - See [audit_history.md](./audit_history.md) for full chronological audit log.
@@ -1,81 +0,0 @@
1 - # Multithreaded -- Competitive Analysis
2 -
3 - Last updated: 2026-04-02
4 -
5 - ## Positioning
6 -
7 - Multithreaded is forum software designed to be embedded within a creator platform. Each MNW project gets a community forum with zero configuration -- users authenticate via MNW OAuth (PKCE), and moderation integrates with the platform's existing trust system. This is not a standalone forum; it's a community layer for creators who already use Makenotwork.
8 -
9 - The key differentiator is integration depth: no separate auth, no separate user management, no separate moderation tools. A creator enables a forum for their project and it works immediately with their existing audience.
10 -
11 - ## Pricing Comparison
12 -
13 - | App | Price | Model |
14 - |-----|-------|-------|
15 - | **Multithreaded** | Included with MNW | Part of creator platform |
16 - | Discourse | $50-$300/mo (hosted) or self-host | Open source (GPL) |
17 - | Circle | $49-$399/mo | SaaS |
18 - | Mighty Networks | $41-$360/mo | SaaS |
19 - | Lemmy | Free (self-host) | Open source (AGPL) |
20 - | Flarum | Free (self-host) | Open source (MIT) |
21 - | XenForo | $160 license + $55/yr | Proprietary |
22 -
23 - ## Feature Matrix
24 -
25 - | Feature | MT | Discourse | Circle | Lemmy | Flarum |
26 - |---------|:--:|:---------:|:------:|:-----:|:------:|
27 - | Platform-integrated auth | Y | N | N | N | N |
28 - | Creator project communities | Y | N | N | N | N |
29 - | Categories | Y | Y | Y | Y | Y |
30 - | Threads + replies | Y | Y | Y | Y | Y |
31 - | Role-based moderation | Y | Y | Y | Y | Y |
32 - | Soft deletes + audit trail | Y | Y | N | N | N |
33 - | SSO/OAuth | Y (native) | Y (plugin) | Y | N | Y (ext) |
34 - | Federation | N | N | N | Y | N |
35 - | Real-time | N | Y | Y | N | N |
36 - | Email notifications | N | Y | Y | Y | Y |
37 - | Mobile app | N | Y | Y | Y | N |
38 - | Plugins/extensions | N | Y | N | N | Y |
39 - | Self-hostable | Y | Y | N | Y | Y |
40 - | Markdown rendering | Y | Y | N | Y | Y |
41 - | @mentions | Y | Y | Y | Y | N |
42 -
43 - ## Competitor Deep Dives
44 -
45 - ### 1. Discourse
46 -
47 - Industry-standard open-source forum software. Feature-rich (badges, trust levels, SSO, plugins, real-time). Hosted plans start at $50/mo. Self-hosting requires significant infrastructure. Overkill for small creator communities.
48 -
49 - **What MT lacks:** real-time updates, email notifications, trust levels, badges, plugins, mobile app, search, private messaging.
50 -
51 - ### 2. Circle
52 -
53 - Community platform with membership gating, Stripe integration, courses, events. Hosted-only SaaS ($49-$399/mo). Targets creators and course builders.
54 -
55 - **What MT lacks:** events, courses, membership tiers in the forum itself, mobile app. **What Circle lacks:** self-hosting, source availability, per-project granularity, zero-config creator integration.
56 -
57 - ### 3. Lemmy
58 -
59 - Federated link aggregator (Reddit alternative) built in Rust. ActivityPub protocol for cross-instance communication. Community-run, no monetization.
60 -
61 - **What MT lacks:** federation, voting/karma, link aggregation. **What Lemmy lacks:** creator platform integration, OAuth from parent platform, per-project communities.
62 -
63 - ### 4. Flarum
64 -
65 - Lightweight, modern forum software (PHP). Extensible via community packages. Free and open source. Clean UI but limited moderation tools in core.
66 -
67 - **What MT lacks:** extension system, search, rich text editor. **What Flarum lacks:** integrated auth, creator platform awareness, Rust performance.
68 -
69 - ## What We Offer That Competitors Don't
70 -
71 - - **Zero-config community per project** -- creator enables a forum and it works with their existing MNW audience
72 - - **Native platform authentication** -- MNW OAuth PKCE flow, no separate accounts or passwords
73 - - **Integrated moderation** -- platform-level suspensions cascade to forums; no separate admin panel
74 - - **Lightweight deployment** -- single Rust binary, systemd unit, no Docker or external services
75 - - **Source-available** -- PolyForm Noncommercial 1.0.0
76 -
77 - ## Target Users
78 -
79 - - MNW creators who want discussion forums for their projects
80 - - Open-source maintainers using MNW for distribution who want community feedback
81 - - Small communities that don't need the complexity of Discourse or the cost of Circle
@@ -1,104 +0,0 @@
1 - # Community Moderation Policy
2 -
3 - Internal policy document. Public-facing version TBD.
4 -
5 - ## Principle
6 -
7 - MNW communities are creator-moderated. MNW is not the content police. But MNW does enforce a quality floor: communities that are not maintained will be addressed, because neglected spaces harm fans and reflect on the platform.
8 -
9 - The goal is always to give the creator a real path forward, not to punish.
10 -
11 - ## Fan+ Feature Gating
12 -
13 - Storage-heavy forum features are gated behind Fan+ ($8/mo):
14 - - Signatures (text + image)
15 - - Custom / larger profile images
16 - - Image and video embeds in posts
17 - - Access to private communities
18 - - The + badge (Fan+ exclusive — creators do not get this)
19 -
20 - Creators automatically receive all other Fan+ perks in their own communities at no additional cost (they already pay $10-60/mo for the platform). This keeps rich media behind a paywall, reduces spam and low-effort posting, and ensures the people driving storage costs are contributing revenue.
21 -
22 - Free accounts can: read everything, post text, search, endorse, track threads.
23 -
24 - ## Escalation Ladder
25 -
26 - ### 1. Warning
27 -
28 - Trigger: Unresolved flags older than 14 days, or a sustained pattern of unaddressed reports.
29 -
30 - Action: Private notification to the creator. Factual, not accusatory. "Your community has N unresolved flags. Here's what we're seeing. Please address them within 7 days."
31 -
32 - ### 2. Restricted
33 -
34 - Trigger: No meaningful moderation response after warning period.
35 -
36 - Action: New thread creation disabled for non-moderator accounts. Existing members can still post in existing threads. Creator receives a final notice explaining the restriction and what's needed to lift it.
37 -
38 - ### 3. Frozen
39 -
40 - Trigger: Continued inaction after restriction.
41 -
42 - Action: Community goes read-only. Creator and moderators can still take moderation actions (clear the backlog, ban bad actors, remove posts) to unfreeze. No new content from anyone until the mod queue is addressed.
43 -
44 - ### 4. Clean Slate Offer
45 -
46 - Trigger: Extended freeze with no moderation activity, or a community that has deteriorated beyond reasonable recovery.
47 -
48 - This is the final step before archival. MNW reaches out to the creator directly, privately, one last time.
49 -
50 - The conversation is:
51 - - Factual: here is what we're seeing, here are the numbers
52 - - Not accusatory: could be burnout, life circumstances, overwhelm — the reason doesn't matter
53 - - An offer, not an ultimatum: "We can reset your community to a clean state"
54 -
55 - **If accepted:**
56 - - All threads and posts are cleared
57 - - Community settings, categories, and customizations are preserved
58 - - A brief system notice is posted: "This community was restarted by its creator on [date]."
59 - - Nothing else. No explanation. No blame. No platform statement.
60 - - The creator's account, project, tier, payment history, fan relationships — all untouched
61 - - No moderation action is taken against the creator themselves
62 - - The reset is visible. Members who were active will notice. MNW says nothing further.
63 -
64 - **If declined:**
65 - - Community remains frozen
66 - - Creator can come back and accept the clean slate at any time
67 - - After extended inaction (90+ days frozen), community is archived
68 - - Creators of archived communities are never punished, but the community stays inactive
69 -
70 - ### 5. Archived
71 -
72 - Trigger: 90+ days frozen with no moderation activity and no clean slate accepted.
73 -
74 - Action: Community is archived. Content preserved but not interactive. Creator can request a clean slate at any point to reactivate.
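The ladder above can be read as a small state machine with two numeric triggers. The sketch below is one way to model it, with hypothetical names throughout (`Stage`, `next_stage`, `warning_candidate`, `archive_candidate`); it only flags candidates and makes no decisions on its own.

```rust
/// Rungs of the escalation ladder, in order. Hypothetical type for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Stage {
    Healthy,
    Warning,
    Restricted,
    Frozen,
    CleanSlateOffered,
    Archived,
}

/// The next rung if inaction continues; `None` once archived.
pub fn next_stage(stage: Stage) -> Option<Stage> {
    match stage {
        Stage::Healthy => Some(Stage::Warning),
        Stage::Warning => Some(Stage::Restricted),
        Stage::Restricted => Some(Stage::Frozen),
        Stage::Frozen => Some(Stage::CleanSlateOffered),
        Stage::CleanSlateOffered => Some(Stage::Archived),
        Stage::Archived => None,
    }
}

/// Warning trigger: the oldest unresolved flag is older than 14 days.
pub fn warning_candidate(oldest_unresolved_flag_age_days: u32) -> bool {
    oldest_unresolved_flag_age_days > 14
}

/// Archival trigger: 90+ days frozen and no clean slate accepted.
pub fn archive_candidate(days_frozen: u32, clean_slate_accepted: bool) -> bool {
    days_frozen >= 90 && !clean_slate_accepted
}
```

Accepting a clean slate at any point would return the community to `Healthy`; that transition is left out here because it is a creator decision rather than an escalation.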
75 -
76 - ## What MNW Never Does
77 -
78 - - Public announcement about why a community was reset or frozen
79 - - Marking the creator's profile, project page, or storefront
80 - - Discussing the situation with other creators or fans
81 - - Using moderation history as a factor in unrelated account decisions
82 - - Naming or shaming in any context
83 -
84 - ## Monitoring
85 -
86 - PoM should track:
87 - - Unresolved flag age per community (alert when oldest unresolved flag exceeds 14 days)
88 - - Flag-to-moderation-action ratio over a 30-day rolling window
89 - - Communities in restricted/frozen/archived state
90 -
91 - Escalation decisions are always human-reviewed. Automated monitoring identifies candidates; a person decides whether to act.
92 -
93 - ## Implementation Status
94 -
95 - - [x] Community suspension (admin.rs)
96 - - [x] Flag system with auto-hide threshold
97 - - [x] Mod action logging (19 action types)
98 - - [x] Ban/mute with duration support
99 - - [ ] Automated flag age monitoring in PoM
100 - - [ ] Restricted state (disable new thread creation)
101 - - [ ] Frozen state (read-only mode)
102 - - [ ] Clean slate mechanism (clear threads, post system notice, preserve settings)
103 - - [ ] Archived state with reactivation path
104 - - [ ] Public-facing documentation of this policy
@@ -1,101 +0,0 @@
1 - # Multithreaded — Todo
2 -
3 - Done: All pre-beta phases. Active: None. Next: Platform integration.
4 -
5 - v0.3.4. Audit grade A. 228 tests.
6 -
7 - ---
8 -
9 - ## Code Review Remediation — Remaining
10 - - [ ] Transitive dep advisories: rand 0.8/0.9 (RUSTSEC-2026-0097), rsa (RUSTSEC-2023-0071), lru (RUSTSEC-2026-0002) — no direct fix available, monitor upstream
11 -
12 - ---
13 -
14 - ## Platform Integration (Post-Beta)
15 -
16 - ### Default Categories
17 - Done:
18 - - [x] Issues: MT seeds an "issues" category in default communities; MNW `routes/postmark/issues.rs` spawns an MT thread per inbound issue (project-linked repos only) and routes email replies into that thread as posts. Issue row stores `mt_thread_id` for direct lookup.
19 - - [x] Patches: MT seeds a "patches" category in default communities; MNW `routes/postmark/patches.rs` already wired (auto-creates the category on demand for pre-step-6 communities).
20 -
21 - Still blocked on MNW Developer Services (crash reporting / feedback / dashboard, not yet built):
22 - - [ ] Crashes (crash reports from DS2)
23 - - [ ] Feedback (user feedback from DS3)
24 -
25 - ### Fan+ Feature Gating
26 -
27 - Depends on MNW shipping the `perks` object in `/oauth/userinfo` (see MNW server todo: "OAuth userinfo perks object"). Gating predicate: `user.perks.fan_plus || user.perks.is_creator`. Creator auto-grant falls out of the `is_creator` branch — no separate code path.
28 -
29 - Plumbing:
30 - - [x] Extend `SessionUser` with `perks: UserPerks { fan_plus, is_creator, creator_tier: Option<{ tier, features }> }` and `effective_plus()` helper
31 - - [x] `auth::refresh_session(state, session)` — reads cached access token, re-hits `/oauth/userinfo`, overwrites session perks. Flushes session on `401`, leaves intact on transient errors
32 - - [x] `POST /auth/refresh` route — JSON response with refreshed perks; `401` if not logged in, `502` on MNW transport/parse error
33 - - [x] Refactored userinfo fetch into reusable `fetch_userinfo`; callback handler now retries on transport only
34 -
35 - Gated features:
36 - - [x] Signatures (markdown + image, 1024 char cap, rendered below post body). Edit form at `/account`. Render-time visibility gated on current `users.is_fan_plus` — lapsed users keep the row but the signature hides until they renew.
37 - - [x] Image embeds in posts (markdown `![](...)`), gated via `render_markdown_plus` (strict + images permitted). Non-plus users get a 422 with a clear "Fan+ feature" message at submit time. Applies to thread bodies, replies, and footnotes.
38 - - [x] + badge in author display: shown only for users with active Fan+ subscription (creators with auto-grant do NOT get the badge — auto-grant covers editor capabilities, not the public badge).
39 - - [x] Denormalised `users.is_fan_plus` / `is_creator` columns (migration 026) mirror MNW perks; refreshed on login + `POST /auth/refresh`. Post-author lookup uses these via SQL JOIN — no per-post HTTP call.
40 - - [ ] Custom / larger profile images — **deferred**: MT pulls `avatar_url` from MNW. Forum-local avatar storage would be a separate feature; out of scope for current launch.
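The gating rules above reduce to two predicates: feature access follows `fan_plus || is_creator`, while the badge follows `fan_plus` alone. A minimal sketch, assuming a simplified `UserPerks` without `creator_tier`; `shows_plus_badge` is a hypothetical helper name, while `effective_plus` is the helper named in the plumbing list:

```rust
/// Simplified perks mirror; the real struct also carries `creator_tier`.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct UserPerks {
    pub fan_plus: bool,
    pub is_creator: bool,
}

impl UserPerks {
    /// Gate for Fan+ features (signatures, image embeds): subscribers and creators.
    pub fn effective_plus(&self) -> bool {
        self.fan_plus || self.is_creator
    }

    /// Gate for the public + badge: paying Fan+ subscribers only.
    /// Hypothetical helper; creators with auto-grant do not get the badge.
    pub fn shows_plus_badge(&self) -> bool {
        self.fan_plus
    }
}
```

Keeping the two predicates separate makes the creator auto-grant a property of the first gate only, so no special case is needed for the badge.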
41 -
42 - ### Community Moderation Enforcement
43 -
44 - Two-layer auth model already exists: superadmin = `PLATFORM_ADMIN_ID` (single user), forum-level = `CommunityRole::{Owner, Moderator}`. State changes and clean-slate are authorized by `is_mod_or_owner(role) || is_platform_admin(user)`, wrapped as the `require_mod_or_superadmin` helper to keep boilerplate down. No new permission concepts; a more robust system is deferred.
45 -
46 - - [x] Add `community.state` enum column: `Active | Restricted | Frozen | Archived` (migration 025, `CommunityState` in mt-core, `set_community_state` mutation)
47 - - [x] `require_mod_or_superadmin` / `is_mod_or_superadmin` / `is_platform_admin` helpers + `WriteScope` + `check_write_state` enforcement helper
48 - - [x] Restricted state: block new thread creation for non-mods; existing threads still accept replies
49 - - [x] Frozen state: read-only for everyone except mods/superadmin (blocks new threads, replies, footnotes, endorsements)
50 - - [x] Archived state: Frozen behavior + hidden from default `/` listing; exposed under `?filter=archived`; reactivation sets state back to Active
51 - - [x] State-change route: `POST /p/{slug}/settings/state` (owner/mod/superadmin); rejects unknown values with 422; logs `ModAction::ChangeCommunityState`
52 - - [x] Clean-slate mutation: transactional delete of all threads/posts (cascades through endorsements/flags/footnotes/etc.), preserves community/categories/memberships/bans/tags; posts a pinned+locked "Community reset by <actor> on <date>" thread in the first category
53 - - [x] Clean-slate UX: typed-phrase confirmation matching community slug (GitHub repo-delete style); 422 on mismatch
54 - - [x] Superadmin UX: dedicated `GET /_admin/communities/{slug}` view with state-change form + clean-slate danger zone; linked from the admin dashboard community table
55 - - [x] Moderation policy published at `MNW/server/site-docs/public/guide/moderation.md`; linked from MT footer
56 - - [x] `ModAction::CleanSlateCommunity` logged with deleted thread count + system thread ID for audit
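The state rules in this list can be summarised as a single enforcement function. The sketch below reuses the names `CommunityState`, `WriteScope`, and `check_write_state` from the items above, but the variants of `WriteScope`, the signature, and the error strings are assumptions; in particular, allowing footnotes and endorsements in the Restricted state is a guess, since the list only says that replies remain open.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CommunityState {
    Active,
    Restricted,
    Frozen,
    Archived,
}

/// Kinds of write a request may attempt. Variants are assumed for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WriteScope {
    NewThread,
    Reply,
    Footnote,
    Endorsement,
}

/// Decide whether a write is allowed in the community's current state.
pub fn check_write_state(
    state: CommunityState,
    scope: WriteScope,
    is_mod_or_superadmin: bool,
) -> Result<(), &'static str> {
    // Mods and the superadmin can always act, so they can clear a backlog.
    if is_mod_or_superadmin {
        return Ok(());
    }
    match (state, scope) {
        (CommunityState::Active, _) => Ok(()),
        (CommunityState::Restricted, WriteScope::NewThread) => {
            Err("new threads are disabled in this community")
        }
        (CommunityState::Restricted, _) => Ok(()),
        // Archived behaves like Frozen for writes; it differs only in listing.
        (CommunityState::Frozen | CommunityState::Archived, _) => {
            Err("this community is read-only")
        }
    }
}
```

Calling this at the top of every mutating handler keeps the policy in one place rather than scattered across routes.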
57 -
58 - ### Notification Integration
59 - - [ ] Push mentions, replies, endorsements, flags to MNW notifications API
60 - - [ ] Read state synced with MNW notification center
61 -
62 - ---
63 -
64 - ## Deferred (Post-Beta)
65 -
66 - - [ ] Private communities (visibility flag, membership gating, hidden listing) — tabled; focus is project-oriented and creator-oriented public forums
67 - - [ ] E2E encrypted live chat (OpenMLS integration, WebSocket gateway)
68 - - [ ] Real-time thread updates via shared WebSocket gateway (shared with SyncKit realtime sync — single service)
69 - - [ ] Federation (ActivityPub or custom protocol)
70 - - [ ] Subcategories / nested categories
71 - - [ ] Similar thread detection on new thread creation
72 - - [ ] Suggested/related threads at bottom of thread view
73 - - [ ] Keyboard shortcuts beyond `/` for search
74 -
75 - ---
76 -
77 - ## Key Paths
78 -
79 - | What | Where |
80 - |------|-------|
81 - | Time formatting | `crates/mt-core/src/time_format.rs` |
82 - | DB queries | `crates/mt-db/src/queries.rs` |
83 - | DB mutations | `crates/mt-db/src/mutations.rs` |
84 - | Templates (Rust) | `src/templates/` |
85 - | Templates (HTML) | `templates/` |
86 - | Link previews | `src/link_preview.rs` |
87 - | S3 storage | `src/storage.rs` |
88 - | Route helpers | `src/routes/helpers.rs` |
89 - | Routes | `src/routes/` (mod.rs, helpers.rs, forum/{mod,views,thread,posts,actions}.rs, moderation.rs, settings.rs, admin.rs, flagging.rs, tracking.rs, search.rs, uploads.rs) |
90 - | Auth (OAuth) | `src/auth.rs` |
91 - | CSRF | `src/csrf.rs` |
92 - | Markdown | `docengine` crate (`shared/docengine/`) — features: mentions, quotes |
93 - | Config | `src/config.rs` |
94 - | Seed data | `src/seed.rs` |
95 - | Entry point | `src/main.rs` |
96 - | Library root | `src/lib.rs` |
97 - | Migrations | `migrations/` (001-026) |
98 - | CSS | `static/style.css` |
99 - | Deploy config | `deploy/` |
100 - | Integration tests | `tests/` |
101 - | Workspace config | `Cargo.toml` |
@@ -1,127 +0,0 @@
1 - # PoM (Peace of Mind) -- Audit History
2 -
3 - Full chronological audit log. See [audit_review.md](./audit_review.md) for current state.
4 -
5 - ## Changes Since Last Audit
6 -
7 - ### Thirteenth audit (2026-04-30, Run 17 cross-project)
8 - - **Test count:** 365 (236 unit + 129 integration). 0 clippy warnings. 0 failures.
9 - - **Grade:** A- (downgraded from A). v0.3.5. ~15,061 LOC.
10 - - **New cold spots:**
11 - - checks/cors.rs (B+) -- only 2 serde tests, zero coverage of actual validation logic
12 - - checks/ssh_banner.rs (B+) -- only 1 test, degraded-status branches untested
13 - - cli/tasks/whois.rs (B+) -- uses TLS interval config instead of dedicated WHOIS interval (copy-paste bug)
14 - - **Mandatory surprise:** RateLimiter atomic ordering race -- `try_acquire()` uses Relaxed ordering on counter, allowing max_per_window + N requests in a burst. Technically incorrect, harmless at current scale.
15 - - **Previous action items:** Item 2 (aws-lc-sys CVEs) still upstream-blocked. All resolved items verified intact.
16 - - **New action items:** 2 MEDIUM (CORS tests, SSH banner tests), 2 LOW (WHOIS config field, RateLimiter ordering fix).
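To illustrate the rate limiter finding: the overshoot comes from checking the counter and incrementing it in two separate steps, so several threads can pass the check before any of them increments. One way to avoid it, sketched here under assumptions (this is not PoM's `RateLimiter`; the field names and `reset` method are hypothetical), is to make the check and the increment a single atomic update.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Counter for one fixed window. Window rollover is left to the caller via `reset`.
pub struct RateLimiter {
    max_per_window: u32,
    count: AtomicU32,
}

impl RateLimiter {
    pub fn new(max_per_window: u32) -> Self {
        Self { max_per_window, count: AtomicU32::new(0) }
    }

    /// Grant a slot only if the counter is below the limit. The compare and
    /// the increment happen as one atomic update, so concurrent callers
    /// cannot push the count past `max_per_window`.
    pub fn try_acquire(&self) -> bool {
        self.count
            .fetch_update(Ordering::AcqRel, Ordering::Acquire, |n| {
                if n < self.max_per_window { Some(n + 1) } else { None }
            })
            .is_ok()
    }

    /// Start a new window.
    pub fn reset(&self) {
        self.count.store(0, Ordering::Release);
    }
}
```

Because rejected calls never increment the counter, it also cannot wrap around under sustained load.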
17 -
18 - ### Tenth audit (2026-03-28, Run 12 cross-project)
19 - - **Test count:** 359 (222 unit + 8 cli + 129 integration). 0 clippy warnings. 0 failures.
20 - - **Grade:** A (maintained). v0.3.2.
21 - - **CORS monitoring:** New check type added for monitoring CORS headers on targets.
22 - - **New dependency advisories (action items):**
23 - - aws-lc-sys 0.38.0 (RUSTSEC-2026-0044 + -0048, severity 7.4 HIGH) — upgrade to 0.39.0 via `cargo update -p aws-lc-sys`
24 - - rustls-webpki 0.103.9 (RUSTSEC-2026-0049) — upgrade to 0.103.10 via `cargo update -p rustls-webpki`
25 - - paste unmaintained (RUSTSEC-2024-0436) — upstream via rmcp, warning only
26 - - **Mandatory surprise:** None. Previous surprises (rate limiter relaxed ordering, write!().unwrap() infallibility) still valid.
27 - - **No new code findings.** All previous items remain resolved.
28 -
29 - ### DNS/Route stale data fix (2026-03-25)
30 - - **Test count:** 352 (unchanged). 0 clippy warnings.
31 - - **Config:** Switched all 4 Cloudflare-proxied DNS records from `expected = ["IP"]` to `expected = []` (resolution-only). DNS checks were always failing because Cloudflare returns rotating proxy IPs, not the origin IP.
32 - - **API filtering:** `route_status` and `dns_status` in `/api/status/{target}` now filtered to only entries matching current config. Stale routes (e.g. `/docs/about`, `/signup`) and stale DNS records no longer appear in API responses.
33 - - **DB pruning:** Added `prune_stale_routes()` and `prune_stale_dns()` to `db.rs`. Called once at task startup in `routes.rs` and `dns.rs` to clean up historical data when config changes. Pruned 890 stale route check rows on first deploy.
34 - - **Integration tests:** Updated `api_status_includes_route_status` and `api_status_includes_dns_status` to use configs with matching route/DNS entries.
35 - - **Deployed to hetzner** — v0.3.2 binary + updated config.
36 -
37 - ### Eighth audit (2026-03-18, Run 9 cross-project)
38 - - **Test count:** 344 (unchanged). 0 clippy warnings.
39 - - **Grade:** A (maintained). v0.3.1 (deployed 2026-03-18).
40 - - **Dashboard UI shipped.** Per-test tracking, regression detection, duration drift.
41 - - **cli/ directory module split** completed (1,035-line cli.rs -> 8 files).
42 - - **Observations (pre-existing, not regressions):**
43 - - Mutex `.unwrap()` in rate limiter (api.rs:41) — if thread panics while holding lock, subsequent calls panic. Impact: LOW (rate limiter only, not core logic). Design choice: acceptable for monitoring tool.
44 - - `serde_json::to_value(d).unwrap_or_default()` in API details field — silently becomes null on serialization failure. Impact: LOW, safe fallback.
45 - - **No new findings requiring action.** Grade maintained at A.
46 - - **Mandatory surprise:** Rate limiter uses `fetch_add` with Relaxed ordering — can allow up to max_per_window+1 requests due to check-then-increment race. Known trade-off of lock-free rate limiting, documented.
47 -
48 - ### Fifth audit (2026-03-16, Run 6 cross-project)
49 - - **Test count:** 238 -> 344 (220 unit + 124 integration, +106 tests)
50 - - **Grade:** A (maintained). No new findings above LOW.
51 - - **Source LOC:** 10,113 (up from ~3.5K)
52 - - **Clippy:** 2 warnings (collapsible_if in cli.rs — LOW)
53 - - **Production unwraps:** 76 total — 64 infallible write! on String, 12 safe-by-construction. Effectively zero risky unwraps.
54 - - **Mandatory surprise:** The `write!().unwrap()` pattern is provably infallible when writing to a `String`, so no change is needed.
55 - - **Previous items verified:** All previous remediated items confirmed intact.
56 - - **Note:** cli.rs at 1,036 lines, roughly double the 500-line branching guideline, but mostly flat match arms.
57 - - **Infrastructure check:** Blocked by Tailscale SSH re-authentication. Deferred.
58 -
59 - ### Fourth audit remediation (2026-03-14)
60 - - **Grade:** A- -> A. All remaining findings resolved.
61 - - **Test count:** 229 -> 238 (+9 integration tests)
62 - - **Graceful shutdown:** Replaced `handle.abort()` with CancellationToken + `tokio::select!` in all task loops. API server uses `with_graceful_shutdown`. 5s grace period on SIGINT/SIGTERM.
63 - - **Task panic detection:** 60s watchdog checks `JoinHandle::is_finished()` on all background tasks.
64 - - **Rate limiting:** Fixed-window 60 req/min middleware on authenticated API routes. Custom `RateLimiter` struct.
65 - - **Self-monitoring:** `GET /api/health` endpoint (public, no auth) returns `{"status":"operational","version":"..."}`.
66 - - **Integration tests:** 5 check_health tests (mock axum servers: operational, degraded, unreachable, expectations pass/fail), 1 check_tls test (self-signed cert via rcgen), 2 /api/health tests, 1 rate limiter test.
67 - - **Deploy config cleanup:** Removed redundant htpy `expected_routes` (duplicated health check URL).
68 - - **Dependency:** Added `tokio-util` for CancellationToken.
69 - - **Cold spots:** 0 remaining (was 3). All previous architectural and testing gaps closed.
70 -
71 - ### Third audit (2026-03-13, pre-launch skeptical lens)
72 - - **Grade:** A -> A-. Postmark API token in plaintext deployment configs is a real issue.
73 - - **Test count:** 56 -> 187 (+131 tests)
74 - - **New findings:** Plaintext API token, no API auth, no peer mesh auth, no integration tests for core functions, no self-monitoring.
75 - - **38 unwraps in non-test code** — all verified safe (write to String or guarded by prior checks).
76 -
77 - **Post-audit remediation (2026-03-13):**
78 - - All 3 critical/medium findings resolved: Postmark token to env var, API bearer auth (5 tests), peer mesh auth
79 - - 2 low findings resolved: SSH filter validation, peer UUID mismatch rejection
80 - - Test count: 187 -> 195 (+8 tests)
81 - - Documentation upgraded to A: All struct fields documented (HealthSnapshot, HealthStatus, HealthDetails, TestRun, TestStaleness, PeerStatus, OnMissing, all config types, all API response types). All 8 error variants documented. 11 config defaults with rationale comments. prune_old_records return tuple documented. description.md rewritten, architecture.md created (191 lines), README created (62 lines).
82 -
83 - ### Observability Upgrade (2026-03-13)
84 - - **Observability:** A- -> A
85 - - Added 57 `#[instrument(skip_all)]` annotations across 9 files: db.rs (28), alerts.rs (9), tools/mod.rs (8), tools/health.rs (5), tools/tests.rs (3), checks/http.rs (1), checks/tls.rs (1), checks/ssh.rs (1), peer.rs (1)
86 - - Added Multithreaded forum as monitoring target: `pom-astra.toml` (localhost:3400), `pom-hetzner.toml` (Tailscale IP)
87 - - Added test runner targets for GO, BB, AF, SK to `pom-astra.toml`
88 - - All 208 tests pass. `cargo check` passes clean.
89 -
90 - ### Adversarial Test Audit (2026-03-13)
91 -
92 - **Goal:** Write tests that try to break the system. Find edge cases, race conditions, boundary conditions, and logic errors.
93 -
94 - **Results:**
95 - - **Test count:** 195 -> 208 (+13 tests)
96 - - **CRITICAL fix:** Alert cooldown key mismatch — `record_alert` used `target` but lookup used `alert_key` (`"health:{target}"`), so cooldowns never matched and alerts fired every check. Fixed by using `alert_key` consistently.
97 - - **HIGH fix:** TLS expiry check inconsistent at day boundary — time-of-day comparison could cause flapping. Changed to `date_naive()` comparison for stable day-level logic.
98 - - **HIGH fix:** UUID mismatch left stale peer state — now resets state, clears failures, persists via `update_peer_identity()` to prevent showing stale data after peer identity change.
99 - - **HIGH fix:** `prune_old_records` had no guard for `days <= 0` and could delete all records. Added an early return (no-op) for `days <= 0`.
100 - - **HIGH fix:** SSH timeout ignored config value — hardcoded `ConnectTimeout=10` in SSH args. Changed to use `config.timeout_secs`.
101 - - **Added `rcgen` dev dependency** for TLS cert generation in tests.
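The cooldown bug in the CRITICAL fix is easiest to see in miniature: if the record step and the lookup step derive their keys differently, the lookup never finds an entry and every check fires. The sketch below is an illustration under assumptions (`AlertCooldowns`, `should_fire`, and the map-based storage are hypothetical; PoM persists alerts in its database). Only the `"health:{target}"` key format comes from the audit entry.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Single source of truth for the key, used for both lookup and record.
fn alert_key(target: &str) -> String {
    format!("health:{target}")
}

pub struct AlertCooldowns {
    cooldown: Duration,
    last_sent: HashMap<String, Instant>,
}

impl AlertCooldowns {
    pub fn new(cooldown: Duration) -> Self {
        Self { cooldown, last_sent: HashMap::new() }
    }

    /// Returns true (and records the send) unless an alert for the same key
    /// was already sent within the cooldown period.
    pub fn should_fire(&mut self, target: &str, now: Instant) -> bool {
        let key = alert_key(target);
        let in_cooldown = self
            .last_sent
            .get(&key)
            .map_or(false, |sent| now.duration_since(*sent) < self.cooldown);
        if in_cooldown {
            return false;
        }
        self.last_sent.insert(key, now);
        true
    }
}
```

Routing both paths through one key function makes the original mismatch impossible to reintroduce by accident.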
102 -
103 - ### Second audit (2026-03-11)
104 - | Change | Detail |
105 - |--------|--------|
106 - | Tests | +39 tests (17 -> 56). 28 unit + 28 integration. Tests/KLOC: 5.8 -> 18.4. |
107 - | Lock contention | Addressed in both peer.rs (heartbeat handlers) and api.rs (status/mesh handlers). Data collected under lock, DB writes after release. |
108 - | DB indexes | 4 indexes added: health_checks(target, id DESC), health_checks(target, checked_at), test_runs(target, id DESC), peer_heartbeats(peer_name, id DESC). |
109 - | Clippy | 4 warnings -> 0. Used Rust 2024 let chains instead of nested if-let. |
110 - | Type safety | PeerConfig.on_missing changed from String to OnMissing enum with serde deserialization. |
111 - | Module docs | Added //! docs to db.rs, config.rs, peer.rs, types.rs, lib.rs. |
112 - | Error handling | /api/peer/status fetch failures now logged at debug level instead of silenced. |
113 - | Prune | prune_old_records now returns 3-tuple including peer heartbeat count. |
114 - | Code extraction | HealthStatus::icon() method eliminates 3 repeated match blocks. |
115 - | HTTP checks | Response classification extracted into pure functions for testability. |
116 -
117 - ## Metrics Over Time
118 -
119 - | Audit Date | LOC | Rust Files | Tests | Tests/KLOC | Clippy Warnings | Cold Spots | Overall |
120 - |------------|-----|-----------|-------|-----------|----------------|------------|---------|
121 - | 2026-03-10 | 2,934 | 15 | 17 | 5.8 | 4 | 8 | B+ |
122 - | 2026-03-11 | 3,039 | 14 | 56 | 18.4 | 0 | 3 | A |
123 - | 2026-03-13 | ~3K | ~14 | 208 | ~69 | 0 | 3 | A- |
124 - | 2026-03-14 | ~3.5K | ~16 | 238 | ~68 | 0 | 0 | A |
125 - | 2026-03-16 | 10.1K | 23 | 344 | ~34 | 2 | 0 | A |
126 - | 2026-03-18 | 10.1K | 23 | 344 | ~34 | 0 | 0 | A |
127 - | 2026-04-30 | ~15,061 | -- | 365 | ~24 | 0 | 3 | A- |
@@ -1,139 +0,0 @@
1 - # PoM (Peace of Mind) -- Audit Review
2 -
3 - **Last audited:** 2026-04-30 (thirteenth audit, Run 17 cross-project)
4 - **Previous audit:** 2026-04-18 (twelfth audit, Run 15 cross-project)
5 -
6 - ## Overall Grade: A
7 -
8 - Run 17 cross-project audit (remediated 2026-05-01). 376 tests (247 unit + 129 integration, all pass). 0 clippy warnings. v0.3.5. ~15,061 LOC. All 4 Run 17 findings fixed: CORS validation extracted and tested (10 tests), SSH banner branches tested (3 tests), WHOIS interval config field added, RateLimiter atomic ordering corrected.
9 -
10 - ## Scorecard
11 -
12 - | Dimension | Grade | Notes |
13 - |-----------|:-----:|-------|
14 - | Code Quality | A | Zero clippy warnings. Clean error handling with typed PomError enum. All migrations have IF NOT EXISTS guards. |
15 - | Architecture | A | Well-structured single crate. main.rs is thin, CLI handlers extracted to the `cli/` module. Module boundaries clean. |
16 - | Testing | A | 376 tests (247 unit + 129 integration), ~25 tests/KLOC. CORS and SSH banner gaps fixed. |
17 - | Security | A | Constant-time auth, SSH command injection prevention, response size caps. |
18 - | Performance | A- | WAL mode, proper indexes. Sequential per-target checks intentional. |
19 - | Documentation | A- | Module-level docs on all modules. Field-level docs on struct fields. |
20 - | Dependencies | A- | All deps latest stable. rmcp is 0.1.x (pre-1.0 but current). |
21 - | Type Safety | A | Domain enums with proper serde. No stringly-typed fields. |
22 - | Observability | A | tracing with #[instrument] on all DB/check functions. |
23 - | Concurrency | A | Clean lock ordering in peer mesh. Rate limiter atomic ordering fixed (Acquire/Release). |
24 - | Resilience | A | Graceful shutdown, timeouts on all network calls, watchdog. |
25 - | Codebase Size | A- | 15K LOC for full monitoring tool. |
26 -
27 - ## Module Heatmap
28 -
29 - | Module | Code | Arch | Test | Security | Perf | Docs | Type Safety | Concurrency | Resilience |
30 - |--------|:----:|:----:|:----:|:--------:|:----:|:----:|:-----------:|:-----------:|:----------:|
31 - | main | A | A | A | A | A | A | A | A | A |
32 - | cli | A | A | A | A | A | A | A | A | A |
33 - | peer | A | A | A | A | A | A | A | A | A |
34 - | db | B+ | A | A | A+ | A | A | A- | A | A |
35 - | api | A | A | A | A | A | A | A | A | A |
36 - | config | A | A | A | A | A | A | A | - | - |
37 - | tools | A | A | A | A | A | A | A | - | A |
38 - | checks | A | A | A | A | A | A | A | - | A |
39 - | display | A | A | A | - | - | A | A | - | - |
40 - | types | A | A | A | - | - | A | A+ | - | - |
41 - | alerts | A | A | A | A | A | A | A | - | A |
42 -
43 - ### Cold Spots
44 -
45 - All resolved (2026-05-01):
46 -
47 - 1. ~~**checks/cors.rs -- B+**~~ -- Fixed. Extracted `evaluate_preflight` pure function, added 10 unit tests covering origin matching, wildcards, method case-insensitivity, failure combinations.
48 - 2. ~~**checks/ssh_banner.rs -- B+**~~ -- Fixed. Added 3 tests using real TCP listeners (valid banner, unexpected banner, empty response).
49 - 3. ~~**cli/tasks/whois.rs -- B+**~~ -- Fixed. Added `whois_check_interval_secs` config field (default 86400s/24h), corrected the reference.
50 -
51 - ## Strengths
52 -
53 - - **Lean architecture.** ~15,061 LOC delivers health checks, test orchestration, MCP server, HTTP API, peer mesh, CLI, alerting, TLS monitoring, route checks, and self-monitoring.
54 - - **Clean error handling.** `?` propagation with typed PomError enum (thiserror, 8 variants). No panics in production paths.
55 - - **Correct security posture.** All SQL parameterized. Bearer token auth on API + peer mesh. Rate limiting (60 req/min). SSH hardened with `BatchMode=yes` + `--` separator.
56 - - **Backward-compatible design.** All config sections are `#[serde(default)]`. Existing configs work unchanged.
57 - - **Good MCP integration.** 8 MCP tools with clear descriptions. Tools strip `raw_output` from test history to avoid flooding context.
58 - - **Comprehensive test suite.** 376 tests covering DB, API, MCP tools, parsing, peer state machine, config, health classification, mock server health checks, TLS cert validation, CORS validation, SSH banner, and rate limiting.
59 - - **Excellent lock discipline.** Peer module collects data under lock, releases lock, then performs DB writes. No lock contention under load.
60 - - **Robust shutdown.** CancellationToken with `tokio::select!` in all task loops. Graceful API server shutdown. Watchdog for silent panics. 5s grace period.
61 -
62 - ## Weaknesses
63 -
64 - ### 1. ~~Migration idempotency bug~~ (Incorrect finding)
65 - All 9 migrations correctly use `CREATE TABLE IF NOT EXISTS`. This finding was incorrect in the original Run 15 audit. Verified by reading db.rs on 2026-04-22.
66 -
67 - ### 2. aws-lc-sys critical CVEs
68 - 2 critical CVEs in aws-lc-sys transitive dependency. Blocked on upstream fix. No direct exposure but dependency audit tools flag it.
69 -
70 - ## Mandatory Surprise
71 -
72 - **RateLimiter atomic ordering race condition**
73 -
74 - The RateLimiter has a subtle race condition. In `api.rs`, `try_acquire()` holds the mutex on `window_start` while it checks whether the window needs resetting, but it updates the atomic counter with `Relaxed` ordering. Between the mutex being dropped and the `fetch_add`, another thread can still treat the old window as valid after the counter has already been reset, which could admit max_per_window + N requests in a single window. Harmless at this scale, but technically incorrect.
75 -
76 - **Verdict:** Fixed 2026-05-01. Changed to Acquire/Release ordering.
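
A minimal sketch of the pattern described above, assuming a fixed-window limiter with a mutex-guarded `window_start` and an atomic counter. The struct layout, field names, and constructor are hypothetical and are not taken from `api.rs`; only the `Release` store on reset and the acquiring read-modify-write on increment reflect the fix named in the verdict.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// Hypothetical fixed-window limiter; names are illustrative only.
pub struct RateLimiter {
    max_per_window: u32,
    window: Duration,
    window_start: Mutex<Instant>,
    count: AtomicU32,
}

impl RateLimiter {
    pub fn new(max_per_window: u32, window: Duration) -> Self {
        Self {
            max_per_window,
            window,
            window_start: Mutex::new(Instant::now()),
            count: AtomicU32::new(0),
        }
    }

    /// Returns true if the request fits in the current window.
    pub fn try_acquire(&self) -> bool {
        {
            let mut start = self.window_start.lock().unwrap();
            if start.elapsed() >= self.window {
                *start = Instant::now();
                // Release: publish the reset to threads that later
                // touch the counter with an acquiring operation.
                self.count.store(0, Ordering::Release);
            }
        } // mutex dropped here, before the increment

        // AcqRel: observe the most recent reset and publish this increment.
        let previous = self.count.fetch_add(1, Ordering::AcqRel);
        previous < self.max_per_window
    }
}
```

The increment still happens outside the mutex, so a request arriving exactly at a window boundary may be counted against either window; the sketch only illustrates the ordering change, not a stricter design.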
77 -
78 - ### Previous Surprises
79 -
80 - - **~~Migration idempotency bug~~ (Incorrect finding)** -- All 9 migrations have always used `CREATE TABLE IF NOT EXISTS`. Verified 2026-04-22.
81 - - **Peer mesh dual HTTP call per heartbeat** -- intentional and well-designed separation of heartbeat probe from status data fetch. Verdict: Actually fine.
82 - - **Rate limiter Relaxed ordering** (Run 9) -- can allow up to max_per_window+1 requests. Known trade-off, now elevated to mandatory surprise with fuller analysis.
83 -
84 - ## Action Items
85 -
86 - Filed in `docs/mnw/pom/todo.md`.
87 -
88 - ### Run 17 (2026-04-30, remediated 2026-05-01)
89 - 1. ~~**[MEDIUM]** Add tests for CORS validation logic~~ -- Done. Extracted `evaluate_preflight`, 10 unit tests.
90 - 2. ~~**[MEDIUM]** Add tests for SSH banner check~~ -- Done. 3 tests with real TCP listeners.
91 - 3. ~~**[LOW]** Add dedicated whois_check_interval_secs config field~~ -- Done. Default 86400s (24h).
92 - 4. ~~**[LOW]** Fix RateLimiter atomic ordering~~ -- Done. Relaxed → Acquire/Release. 4 rate limiter unit tests added.
93 -
94 - ### Run 15 (2026-04-18, corrected 2026-04-22)
95 - 5. ~~**[MEDIUM]** Add IF NOT EXISTS to migrations 3-6~~ -- Incorrect finding. All migrations already have guards.
96 - 6. **[LOW]** Monitor aws-lc-sys for CVE fixes (blocked on upstream)
97 -
98 - ### All resolved (previous audits)
99 - 3. ~~**[CRITICAL]** Remove Postmark API token from deployment configs~~ -- Done
100 - 4. ~~**[MEDIUM]** Add API authentication~~ -- Done
101 - 5. ~~**[MEDIUM]** Add peer mesh authentication~~ -- Done
102 - 6. ~~**[MEDIUM]** SSH `--` separator~~ -- Done
103 - 7. ~~**[LOW]** Add integration tests for core functions~~ -- Done
104 - 8. ~~**[LOW]** Add self-monitoring~~ -- Done
105 - 9. ~~**[LOW]** Shell-escape SSH test filter parameter~~ -- Done
106 - 10. ~~**[LOW]** Reject peer UUID mismatch~~ -- Done
107 - 11. ~~**[LOW]** HTTP response size limit~~ -- Done
108 -
109 - ### First/Second Audit -- All resolved
110 - - Extract CLI command handlers -- Done
111 - - Add typed PomError enum -- Done
112 - - Add .DS_Store/.idea/.vscode to .gitignore -- Done
113 - - Add module-level //! docs -- Done
114 - - Add migration versioning -- Done
115 - - Add DB indexes -- Done
116 - - Fix clippy warnings -- Done
117 - - Decouple mesh lock from DB writes -- Done
118 - - Add API/heartbeat/config tests -- Done
119 -
120 - ## Metrics Over Time
121 -
122 - | Audit Date | LOC | Rust Files | Tests | Tests/KLOC | Clippy Warnings | Cold Spots | Overall |
123 - |------------|-----|-----------|-------|-----------|----------------|------------|---------|
124 - | 2026-03-10 | ~3,500 | ~12 | 0 | 0 | 4 | 8 | B |
125 - | 2026-03-11 (remediation) | ~3,500 | ~12 | 57 | ~16 | 0 | 0 | A- |
126 - | 2026-03-13 (pre-launch) | ~3,800 | ~14 | 152 | ~40 | 0 | 0 | A |
127 - | 2026-03-14 (deep dive) | ~3,800 | ~14 | 152 | ~40 | 0 | 0 | A |
128 - | 2026-03-16 (Run 6) | ~5,900 | ~14 | 238 | ~40 | 0 | 0 | A |
129 - | 2026-03-18 (Run 9) | ~10,000 | ~14 | 359 | ~36 | 0 | 0 | A |
130 - | 2026-03-28 (Run 12) | ~10,000 | ~14 | 359 | ~36 | 0 | 0 | A |
131 - | 2026-04-15 (Run 14) | ~11,496 | -- | 364 | ~32 | 0 | 0 | A |
132 - | 2026-04-18 (Run 15) | ~11,496 | -- | 364 | ~32 | 0 | 2 | A- |
133 - | 2026-04-22 (Run 15 corrected) | ~11,496 | -- | 364 | ~32 | 0 | 1 | A |
134 - | 2026-04-30 (Run 17) | ~15,061 | -- | 365 | ~15.6 | 0 | 3 | A- |
135 - | 2026-05-01 (Run 17 remediated) | ~15,061 | -- | 376 | ~16 | 0 | 0 | A |
136 -
137 - ---
138 -
139 - See [audit_history.md](./audit_history.md) for full chronological audit log.
@@ -1,90 +0,0 @@
1 - # PoM -- Competitive Analysis
2 -
3 - Last updated: 2026-04-02
4 -
5 - ## Positioning
6 -
7 - PoM (Peace of Mind) is a single-binary production monitor built for indie developers and small teams. It runs as a peer mesh -- two instances cross-check each other with no central dashboard required. CLI-first, with an optional HTTP API and Claude integration (MCP server mode).
8 -
9 - The key differentiators are the peer mesh architecture (no single point of failure for monitoring), the CLI-first interface (inspect via SSH, no browser needed), and the Claude MCP integration (AI-assisted diagnostics). PoM monitors what matters for small deployments: uptime, TLS certificates, DNS records, domain registration, route availability, and test freshness.
10 -
11 - ## Pricing Comparison
12 -
13 - | Tool | Price | Model |
14 - |------|-------|-------|
15 - | **PoM** | Free | Source-available (PolyForm NC) |
16 - | Uptime Robot | $0-$58/mo | Freemium (50 monitors free) |
17 - | Pingdom | $15-$100/mo | SaaS |
18 - | Datadog | $15-$23/host/mo | SaaS |
19 - | New Relic | $0-$0.35/GB | Freemium |
20 - | Grafana + Prometheus | Free (self-host) | Open source |
21 - | StatusCake | $0-$67/mo | Freemium |
22 - | Hetrix Tools | $0-$20/mo | Freemium |
23 -
24 - ## Feature Matrix
25 -
26 - | Feature | PoM | Uptime Robot | Pingdom | Datadog | Grafana+Prom |
27 - |---------|:---:|:-----------:|:-------:|:-------:|:------------:|
28 - | HTTP health checks | Y | Y | Y | Y | Y |
29 - | TLS certificate monitoring | Y | Y | Y | Y | N* |
30 - | DNS record verification | Y | N | N | Y | N* |
31 - | WHOIS domain expiry | Y | N | N | N | N* |
32 - | Route availability checks | Y | N | Y | Y | N* |
33 - | CORS preflight checks | Y | N | N | N | N |
34 - | Peer mesh (cross-monitoring) | Y | N | N | N | N |
35 - | CLI-first interface | Y | N | N | N | N |
36 - | Claude MCP integration | Y | N | N | N | N |
37 - | SSH test execution | Y | N | N | N | N |
38 - | Latency drift detection | Y | N | Y | Y | Y |
39 - | Test duration drift | Y | N | N | N | N |
40 - | Email alerts | Y | Y | Y | Y | Y |
41 - | Status page | N | Y | Y | Y | Y** |
42 - | Mobile app | N | Y | Y | Y | Y** |
43 - | APM / traces | N | N | N | Y | Y |
44 - | Log aggregation | N | N | N | Y | Y |
45 - | Self-hosted | Y | N | N | N | Y |
46 - | Single binary | Y | N/A | N/A | N/A | N |
47 -
48 - \* Requires additional exporters. \*\* Via Grafana dashboards.
49 -
50 - ## Competitor Deep Dives
51 -
52 - ### 1. Uptime Robot
53 -
54 - Simple uptime monitoring SaaS. Free tier with 50 monitors at 5-minute intervals. Pro adds 1-minute intervals, SSL monitoring, status pages. The default choice for indie developers.
55 -
56 - **What PoM lacks:** status pages, mobile app, SMS/Slack/webhook alerts, maintenance windows. **What Uptime Robot lacks:** peer mesh, CLI interface, DNS/WHOIS monitoring, SSH test execution, AI integration.
57 -
58 - ### 2. Datadog
59 -
60 - Enterprise observability platform (APM, logs, metrics, dashboards). Powerful but expensive and invasive (requires agents on every host). Overkill for small deployments.
61 -
62 - **What PoM lacks:** APM, distributed tracing, dashboards, log aggregation, 800+ integrations. **What Datadog lacks:** peer mesh, CLI-first operation, single binary simplicity, affordability for indie teams.
63 -
64 - ### 3. Grafana + Prometheus
65 -
66 - Open-source metrics and visualization stack. Extremely flexible, industry standard. Requires significant setup (Prometheus server, exporters, Grafana instance, alertmanager). No built-in TLS/DNS/WHOIS monitoring without custom exporters.
67 -
68 - **What PoM lacks:** rich dashboards, metric visualization, alertmanager flexibility, ecosystem of exporters. **What Grafana+Prom lacks:** out-of-box TLS/DNS/WHOIS, peer mesh, single binary, zero-config setup.
69 -
70 - ### 4. StatusCake
71 -
72 - Web-based uptime and page speed monitoring. Free tier with 10 monitors. Pro adds SSL, domain, and server monitoring. Similar scope to Uptime Robot but with more check types.
73 -
74 - **What PoM lacks:** page speed testing, server monitoring agents, status pages, Slack/Teams integration.
75 -
76 - ## What We Offer That Competitors Don't
77 -
78 - - **Peer mesh** -- two PoM instances monitor each other. If one goes down, the other detects it. There is no central dashboard to become a single point of failure.
79 - - **CLI-first** -- inspect status, run checks, query history from the terminal via SSH. No browser required.
80 - - **Claude MCP integration** -- expose health checks, test execution, and mesh status as MCP tools for AI-assisted diagnostics.
81 - - **SSH test execution** -- trigger and parse CI test runs on remote servers, track test freshness and duration drift.
82 - - **Single binary, zero dependencies** -- no Docker, no agents, no external database. History is stored in embedded SQLite; the only outside service involved is Postmark, used for email alerts.
83 - - **Monitoring-offline meta-alert** -- detects when all targets are unreachable simultaneously (likely a PoM network issue, not actual outages). Prevents false alarm cascades.
84 -
85 - ## Target Users
86 -
87 - - Indie developers running 1-5 services who want monitoring without SaaS costs
88 - - Small teams that operate via SSH and prefer CLI tools over web dashboards
89 - - Anyone who wants peer-verified monitoring (not trusting a single monitoring vendor)
90 - - Claude Code users who want AI-assisted production diagnostics
@@ -1,202 +0,0 @@
1 - # PoM Operational Runbook
2 -
3 - Procedures for responding to alerts, managing the service, and troubleshooting common issues.
4 -
5 - ## Alert Response Guide
6 -
7 - ### Health Status Change (Operational -> Error/Unreachable)
8 -
9 - **Symptoms:** Email alert with target status change.
10 -
11 - **Steps:**
12 - 1. Verify manually: `curl -v https://makenot.work/api/health`
13 - 2. If **Unreachable**: check network (Tailscale, firewall, DNS resolution)
14 - 3. If **Error** (5xx): SSH into the target server, check application logs
15 - ```sh
16 - ssh root@100.120.174.96 'journalctl -u makenotwork --since "10 minutes ago"'
17 - ```
18 - 4. If **Degraded** (2xx but unexpected body): check application state, database connectivity
19 - 5. Restart the service if needed: `ssh root@100.120.174.96 systemctl restart makenotwork`
20 -
21 - ### TLS Certificate Expiry
22 -
23 - **Symptoms:** Alert when certificate expires within 14 days.
24 -
25 - **Steps:**
26 - 1. Verify: `openssl s_client -connect makenot.work:443 -servername makenot.work </dev/null 2>/dev/null | openssl x509 -noout -dates`
27 - 2. Cloudflare Origin CA certs (15-year): no renewal needed. If alert fires, check Caddy config.
28 - 3. If Caddy is serving wrong cert: verify cert paths in `/etc/caddy/Caddyfile`
29 - 4. For custom domains (on-demand TLS): Caddy auto-renews via ACME. Check Caddy logs.
30 -
31 - ### TLS Check Failed
32 -
33 - **Symptoms:** Handshake timeout, certificate parse failure, or connection refused.
34 -
35 - **Steps:**
36 - 1. Verify: `openssl s_client -connect makenot.work:443 -servername makenot.work`
37 - 2. Check Caddy status: `ssh root@100.120.174.96 systemctl status caddy`
38 - 3. Check if port 443 is open: `ssh root@100.120.174.96 ss -tlnp | grep 443`
39 - 4. If Caddy is down, restart: `ssh root@100.120.174.96 systemctl restart caddy`
40 -
41 - ### Peer Missing
42 -
43 - **Symptoms:** Peer (astra or hetzner) unreachable for 3+ consecutive heartbeats (3+ minutes).
44 -
45 - **Steps:**
46 - 1. SSH into the peer: `ssh max@100.106.221.39` (astra) or `ssh root@100.120.174.96` (hetzner)
47 - 2. Check PoM service: `systemctl status pom`
48 - 3. Check Tailscale connectivity: `tailscale ping <peer-ip>`
49 - 4. If PoM is down: `systemctl restart pom`
50 - 5. If Tailscale is down: `systemctl restart tailscaled`
51 -
52 - ### Latency Drift
53 -
54 - **Symptoms:** Sustained response time increase (>2x the 7-day baseline).
55 -
56 - **Steps:**
57 - 1. Check server load: `ssh root@100.120.174.96 top -bn1 | head -5`
58 - 2. Check PostgreSQL: `ssh root@100.120.174.96 "psql -c 'SELECT count(*) FROM pg_stat_activity;' makenotwork"`
59 - 3. Check for slow queries: `ssh root@100.120.174.96 "psql -c \"SELECT query, calls, mean_exec_time FROM pg_stat_statements ORDER BY mean_exec_time DESC LIMIT 5;\" makenotwork"`
60 - 4. Check disk I/O: `ssh root@100.120.174.96 iostat -x 1 3`
61 - 5. If database-related: consider `VACUUM ANALYZE` on affected tables
62 -
63 - ### Route Failure
64 -
65 - **Symptoms:** Specific paths (e.g., `/login`, `/docs`) returning non-2xx.
66 -
67 - **Steps:**
68 - 1. Verify: `curl -sI https://makenot.work/login`
69 - 2. If 502/503: application is down or Caddy can't reach it
70 - 3. If 404: route may have been removed in a deploy -- check recent deploys
71 - 4. If 500: application error -- check logs with `journalctl -u makenotwork`
72 -
73 - ### DNS Mismatch
74 -
75 - **Symptoms:** DNS records don't match expected values.
76 -
77 - **Steps:**
78 - 1. Verify: `dig makenot.work +short` and compare with expected
79 - 2. Check Cloudflare DNS dashboard for unexpected changes
80 - 3. If propagation issue: wait 5-10 minutes and recheck
81 - 4. If intentional change: update PoM config to match new expected values
82 -
83 - ### WHOIS Domain Expiry
84 -
85 - **Symptoms:** Domain registration expires within 30 days.
86 -
87 - **Steps:**
88 - 1. Verify: `whois makenot.work | grep -i expir`
89 - 2. Renew domain with registrar (Cloudflare Registrar for makenot.work)
90 - 3. Confirm renewal: re-run WHOIS check
91 -
92 - ### Monitoring Offline (All Targets Unreachable)
93 -
94 - **Symptoms:** All monitored targets are down simultaneously.
95 -
96 - **Steps:**
97 - 1. This almost certainly means PoM's own network is down, not all targets
98 - 2. Check the PoM instance's network: `ping 1.1.1.1`, `tailscale status`
99 - 3. Check DNS resolution: `dig makenot.work`
100 - 4. If network is fine, check if all targets actually are down (unlikely but possible)
101 -
102 - ### Test Run Stale
103 -
104 - **Symptoms:** No test run recorded in 7+ days.
105 -
106 - **Steps:**
107 - 1. SSH into astra and run tests manually: `/home/max/staging/run-tests.sh`
108 - 2. If tests fail: investigate failures, fix, re-run
109 - 3. If SSH test execution fails: check SSH key, connectivity, permissions
110 -
111 - ## Service Management
112 -
113 - ### Starting/Stopping
114 -
115 - ```sh
116 - # Hetzner
117 - ssh root@100.120.174.96 systemctl start pom
118 - ssh root@100.120.174.96 systemctl stop pom
119 - ssh root@100.120.174.96 systemctl restart pom
120 -
121 - # Astra
122 - ssh max@100.106.221.39 sudo systemctl start pom
123 - ssh max@100.106.221.39 sudo systemctl stop pom
124 - ssh max@100.106.221.39 sudo systemctl restart pom
125 - ```
126 -
127 - ### Checking Status
128 -
129 - ```sh
130 - # Service status
131 - ssh root@100.120.174.96 systemctl status pom
132 -
133 - # Application logs
134 - ssh root@100.120.174.96 'journalctl -u pom --since "1 hour ago"'
135 -
136 - # API health
137 - curl http://100.120.174.96:9100/api/health
138 -
139 - # Full status (requires API token)
140 - curl -H "Authorization: Bearer <token>" http://100.120.174.96:9100/api/status
141 -
142 - # Mesh view (self + peers)
143 - curl -H "Authorization: Bearer <token>" http://100.120.174.96:9100/api/mesh
144 - ```
145 -
146 - ### Deploying Updates
147 -
148 - ```sh
149 - cd ~/Code/MNW/pom
150 - ./deploy/deploy.sh # Deploy to both astra and hetzner
151 - ```
152 -
153 - The deploy script cross-compiles for both architectures, uploads binaries, and restarts services.
154 -
155 - ### Configuration Changes
156 -
157 - Config lives at `/etc/pom/pom.toml` on each instance. After editing:
158 -
159 - ```sh
160 - ssh root@100.120.174.96 systemctl restart pom
161 - ```
162 -
163 - Alert credentials are in `/etc/pom/env` (Postmark token, API token).
164 -
165 - ## Check Intervals
166 -
167 - | Check Type | Default Interval | Notes |
168 - |------------|-----------------|-------|
169 - | Health (HTTP) | 5 minutes | 10-second timeout per request |
170 - | TLS certificate | 1 hour | Warns at 14 days before expiry |
171 - | Route availability | 5 minutes | Checks all configured paths |
172 - | DNS records | 1 hour | Compares against expected values |
173 - | WHOIS expiry | 24 hours | `whois_check_interval_secs` (default 86400s). Warns at 30 days before expiry |
174 - | CORS preflight | 1 hour | OPTIONS request validation |
175 - | Peer heartbeat | 60 seconds | 3 failures before alert (grace period) |
176 - | Data pruning | Daily | Retains 30 days of history |
177 -
178 - ## Alert Cooldowns
179 -
180 - - **Default cooldown:** 5 minutes between repeated alerts for the same target
181 - - **Recovery alerts:** Always sent immediately (bypass cooldown)
182 - - **Monitoring-offline:** Special meta-alert when all targets are unreachable
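
The cooldown rules above can be sketched as a small gate, assuming alerts are keyed by target name. `AlertGate` and `should_send` are hypothetical names for illustration, not PoM's actual types; the caller passes `now` so the behavior is easy to test.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical per-target cooldown gate; not PoM's real implementation.
pub struct AlertGate {
    cooldown: Duration,
    last_sent: HashMap<String, Instant>,
}

impl AlertGate {
    pub fn new(cooldown: Duration) -> Self {
        Self { cooldown, last_sent: HashMap::new() }
    }

    /// Decide whether an alert for `target` should go out at `now`.
    pub fn should_send(&mut self, target: &str, is_recovery: bool, now: Instant) -> bool {
        // Recovery alerts bypass the cooldown and do not restart it.
        if is_recovery {
            return true;
        }
        let in_cooldown = self
            .last_sent
            .get(target)
            .map_or(false, |prev| now.duration_since(*prev) < self.cooldown);
        if in_cooldown {
            return false;
        }
        self.last_sent.insert(target.to_string(), now);
        true
    }
}
```

Whether a recovery alert should also reset the failure cooldown is a design choice this sketch leaves out; here it does not.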
183 -
184 - ## Production Instances
185 -
186 - | Instance | IP | Architecture | Config |
187 - |----------|-----|-------------|--------|
188 - | Hetzner | `100.120.174.96:9100` | x86_64 | `/etc/pom/pom.toml` |
189 - | Astra | `100.106.221.39:9100` | aarch64 | `/etc/pom/pom.toml` |
190 -
191 - Both instances monitor the same targets and cross-check each other via the peer mesh.
192 -
193 - ## Key Files
194 -
195 - | What | Where |
196 - |------|-------|
197 - | Config | `/etc/pom/pom.toml` |
198 - | Credentials | `/etc/pom/env` |
199 - | Database | `/var/lib/pom/pom.db` (SQLite) |
200 - | Instance ID | `/var/lib/pom/instance_id` |
201 - | systemd unit | `/etc/systemd/system/pom.service` |
202 - | Deploy script | `deploy/deploy.sh` |
@@ -1,36 +0,0 @@
1 - # PoM Todo
2 -
3 - Done: All phases (1-13). Active: None. Next: Post-beta items below.
4 -
5 - v0.3.5. Audit grade A. 376 tests (247 unit + 129 integration).
6 -
7 - ## Notification Integration
8 - - [ ] Push PoM alerts to MNW notifications API (health failures, TLS expiry, DNS changes)
9 - - [ ] Deduplicate alert delivery (email via MNW notification preferences instead of direct Postmark)
10 -
11 - ## Deferred
12 - - [ ] Multi-location probing beyond hetzner+astra+macbook (third-party VPS for independent perspective)
13 - - [ ] Webhook alert channel (ntfy.sh, Pushover, generic webhook)
14 - - [ ] Prometheus/OpenTelemetry metrics export
15 - - [ ] Peer auto-discovery (mDNS/Tailscale API — currently manual config only)
16 -
17 - ---
18 -
19 - ## Key Paths
20 - - Config: `src/config.rs`, `pom.toml`
21 - - Database: `src/db.rs`
22 - - HTTP API: `src/api.rs`
23 - - Peer mesh: `src/peer.rs`
24 - - Health checks: `src/checks/http.rs`
25 - - Route checks: `src/checks/routes.rs`
26 - - TLS checks: `src/checks/tls.rs`
27 - - DNS checks: `src/checks/dns.rs`
28 - - WHOIS checks: `src/checks/whois.rs`
29 - - CI output parsing: `src/checks/parse.rs`
30 - - Test orchestration: `src/checks/ssh.rs`
31 - - MCP server: `src/tools/mod.rs`, `src/tools/health.rs`, `src/tools/tests.rs`
32 - - CLI: `src/main.rs`, `src/cli/` (mod.rs, serve.rs, status.rs, incident.rs, tasks/)
33 - - Types: `src/types.rs`
34 - - Integration tests: `tests/integration.rs`
35 - - Deploy: `deploy/` (deploy.sh, pom-hetzner.toml, pom-astra.toml, pom.service)
36 -
@@ -1,55 +0,0 @@
1 - # DocEngine Roadmap
2 -
3 - Higher-level direction for the crate. Actionable items live in `todo.md`.
4 -
5 - ## Current State (2026-05-16)
6 -
7 - - 4 presets (Permissive / Standard / Strict / Sanitize-only), builder pattern, two-phase render (pulldown-cmark → ammonia).
8 - - Optional features: `doc-loader`, `directives`, `frontmatter`, `mentions`, `quotes`, `media-urls`.
9 - - Consumers: MNW (docs, blog, descriptions), Multithreaded (forum), GoingsOn, Balanced Breakfast, audiofiles.
10 - - Tests: ~151. Mutation kill rate 77% (43 misses carried in `_meta/remediation_todo.md` § C1.x).
11 - - Zero unsafe.
12 -
13 - ## Themes
14 -
15 - ### 1. Content correctness — single source of truth
16 -
17 - The 2026-05 audit found ~150 stale-value bugs in the markdown corpus when business numbers changed (e.g., $573 → $580). The pattern repeats every time we tweak pricing, tiers, or capacity assumptions. DocEngine is the leverage point: a build-time substitution pass turns the markdown corpus into a derived artifact of `assumptions.toml` instead of a copy-paste graveyard.
18 -
19 - **Trajectory.** v1 (shipped 2026-05-16) handles scalar substitution and a fixed derived registry. The natural evolution is from "substitute named scalars" to "render a typed serde context through a small expression engine," in three stages:
20 -
21 - - **v2 — serde context.** Replace the bespoke `LookupValue` enum and ad-hoc flat lookup with a single `serde_json::Value` produced by serializing the typed `Assumptions` struct (plus a derived sibling struct). Consumer crates can register their own `Serialize` contexts so the substitution mechanism is no longer business-numbers-specific. Same `{{ dotted.path }}` syntax; arrays and nested tables become accessible natively.
22 - - **v3 — filter pipeline.** Pipe syntax for formatters: `{{ expenses.F_monthly | money }}`, `{{ stripe.percent | percent }}`, `{{ derived.fill_time_500 | round(1) }}`. Filters are registered Rust functions on a `Context` builder. Small set to start: `money`, `percent`, `int`, `round(n)`, `lower`, `upper`.
23 - - **v4 — control flow.** Loops and conditionals: `{% for tier in tiers.standard %}` ... `{% endfor %}`, `{% if !cohort.lock_duration.is_empty() %}`. Hand-rolled, not a Jinja port — grammar limited to what our docs need. Drops the "duplicated tier-comparison table" class of problem the same way v1 dropped stale scalars.
24 -
25 - **No new deps.** Option B (MiniJinja / Tera) was rejected on dep-tree grounds; option C builds the engine in-crate. Each stage stays under ~300 LoC of parser + evaluator. Backwards compatibility: v1 `{{ key }}` syntax is a strict subset of v2-v4.
26 -
27 - ### 2. Docs as a product surface
28 -
29 - Right now `DocLoader` loads once at startup and serves from a `HashMap`. That's enough for a small corpus, but two features are clearly missing: full-text search (creators can't find anything in the guide) and versioned docs (we ship breaking changes without a way to read the old behavior). Both are larger than a sprint.
30 -
31 - ### 3. Quality floor
32 -
33 - Mutation testing surfaced 43 missed mutants in docengine — most are arithmetic boundaries in `directives.rs` / `code_spans.rs` and untested entry points in `doc_loader.rs`. None block shipping; all are reachable with targeted boundary tests. Goal: ratchet to ≥90% kill rate before adding new feature surface.
34 -
35 - ### 4. Hot-path measurement
36 -
37 - DocEngine is on the MNW request path for every rendered description. There are no benchmarks. Before any perf work (`Arc<str>` for repeated fields, parser reuse, cached renders), establish a `criterion` baseline so changes are measurable.
38 -
39 - ## Out of Scope
40 -
41 - - Live preview / WYSIWYG — DocEngine is render-only. Authoring UI lives in consumers.
42 - - Custom markdown extensions beyond what GFM + post-processing covers. Directives, mentions, quotes are the extension surface; new dialects should be a separate crate or a consumer-side post-process.
43 - - Multi-language rendering (CommonMark dialects per locale). Not a real need yet.
44 -
45 - ## Sequencing
46 -
47 - 1. ~~Assumption substitution v1~~ — landed 2026-05-16.
48 - 2. Wire v1 into MNW boot pipeline + migrate corpus incrementally.
49 - 3. Substitution v2 — serde context (`Serialize`-derived structs replace `LookupValue`).
50 - 4. Substitution v3 — filter pipeline (`| money`, `| percent`, `| round(n)`).
51 - 5. Substitution v4 — control flow (`{% for %}`, `{% if %}`).
52 - 6. Mutation gap closure — small PRs alongside other work.
53 - 7. Benchmarks — prerequisite for any perf claim.
54 - 8. Full-text search — once corpus growth justifies it.
55 - 9. Versioned docs — once we have a breaking change worth documenting historically.
@@ -1,87 +0,0 @@
1 - # DocEngine Todo
2 -
3 - Actionable items. Higher-level direction in `roadmap.md`.
4 -
5 - ## Key Paths
6 -
7 - - `src/render.rs`, `src/sanitize.rs`, `src/doc_loader.rs`, `src/directives.rs`
8 - - `MNW/server/docs/internal/business/assumptions.toml` (source-of-truth for substitution)
9 - - `_meta/remediation_todo.md` § C1.x (per-crate mutation follow-ups)
10 -
11 - ## Features
12 -
13 - ### Assumption substitution — v1 deployed 2026-05-16
14 -
15 - `assumptions` feature shipped in docengine 0.3.2 and wired into MNW 0.5.20. Loads `assumptions.toml`, validates (F bounds, tier-mix sum=1, surplus-split sum=1, founding ≤ standard, rho bounds, cohort caps > 0), computes derived registry (`R_cap`, `ARPU_{founding,standard}`, `stripe_fee_{tier}_std`, `marginal_avg`, `break_even_{class}`, `surplus_{N}_{class}`, `fill_time_{N}`), substitutes `{{ dotted.path }}` markers via regex pre-pass. `DocLoaderConfig.pre_process` hook applies the substitution to every page at boot; validation failure panics startup. `guide/tiers.md` migrated (5 tier-price values). 17 docengine tests, 1,447 MNW lib tests pass.
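
As a rough illustration of the `{{ dotted.path }}` pre-pass, here is a minimal sketch that resolves markers against a flat key/value map. It is an assumption-laden stand-in: the shipped feature is described as a regex pre-pass over a derived registry, whereas this version uses a hand-rolled scanner with no dependencies, and the function name `substitute` is hypothetical.

```rust
use std::collections::HashMap;

/// Replace every `{{ key }}` marker with its value from `values`.
/// Unknown keys and unterminated markers are errors, so a caller can
/// refuse to boot (or fail CI) instead of serving a stale page.
pub fn substitute(input: &str, values: &HashMap<&str, String>) -> Result<String, String> {
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(open) = rest.find("{{") {
        // Copy the literal text before the marker.
        out.push_str(&rest[..open]);
        let after = &rest[open + 2..];
        let close = after
            .find("}}")
            .ok_or_else(|| "unterminated {{ marker".to_string())?;
        let key = after[..close].trim();
        let value = values
            .get(key)
            .ok_or_else(|| format!("unknown marker: {key}"))?;
        out.push_str(value);
        rest = &after[close + 2..];
    }
    out.push_str(rest);
    Ok(out)
}
```

Returning `Err` for an unknown key mirrors the "validation failure panics startup" behavior described above: the caller decides whether to panic at boot or report the failure in a test.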
16 -
17 - **Landed today:**
18 - - v1 substitution wired into MNW boot (DocLoader `pre_process` hook, MNW 0.5.20).
19 - - Filter pipeline with extensible `Filter` trait, 8 built-ins, closure support via blanket `Fn` impl (MNW 0.5.21).
20 - - Marginal-cost model corrected to include weighted Stripe fee on creator subs (+ chargeback EV); break-even now derived as ~28 creators (MNW 0.5.22).
21 - - CI guardrail: `MNW/server/tests/assumptions.rs` validates the file and every marker in site-docs at PR time.
22 - - Five pages migrated: `guide/tiers.md`, `about/pricing.md`, `about/guarantees.md`, `support/faq.md`, `guide/stripe.md`.
23 -
24 - Follow-ups:
25 - - **Continue corpus migration.** Next likely candidates: any remaining `$10-$60` ranges (need v4 loops for clean rendering, or hardcoded min/max derived values for now), `$1-$10,000` PWYW bounds (literals, not derivable), R_cap / runway numbers, app-sync prices, Fan+ pricing.
26 - - **Bare-value lint.** Detect stray `$580` / `$0.30` / `2.9%` outside `{{ }}` markers — extends `tests/assumptions.rs`. Trigger only on values that match a known derived/raw key to keep false positives low.
27 - - **Marginal cost — egress + support.** Current model excludes egress (fine at current scale, but flag-and-revisit when monthly egress > Hetzner's 20 TB free) and support time (A26 in assumptions.md — needs real data post-launch).
28 - - **Decide whether to also substitute the internal business docs** (`docs/internal/business/*.md`). They aren't served via DocLoader today but contain most of the audit's stale-value bugs.
29 -
30 - ### Assumption substitution v2 — serde context
31 -
32 - Replace `LookupValue` + ad-hoc flat lookup with a single `serde_json::Value` (or equivalent) produced by serializing the typed `Assumptions` struct plus a derived sibling. Introduce a `Context` trait so consumers can register their own `Serialize`-derived contexts beside `assumptions`. v1 `{{ dotted.path }}` syntax stays. Arrays and nested tables become accessible natively (`{{ tiers.standard.basic }}` already works, but `{{ stripe.connect_express.per_active_account }}` etc. comes for free from existing TOML structure).
33 -
34 - **Why over MiniJinja:** keeps the dep tree clean. The engine is small enough to build in-crate.
35 -
36 - ### Assumption substitution v3 — filter pipeline (landed 2026-05-16)
37 -
38 - Pipe syntax shipped. Built-ins: `int`, `ceil`, `floor`, `round` / `round(n)`, `money` / `money("€")`, `percent` / `percent(n)`, `upper`, `lower`. Filters chain (`{{ x | round(2) | money }}`) and are extensible via `Assumptions::with_filter(name, impl Filter)` — the `Filter` trait has a blanket impl over `Fn`, so closures work as filters without a wrapper struct. Hand-rolled mini-parser handles paths, pipes, parens, and quoted-string args; pipes inside string literals are treated as text. 29 new tests (parser + each filter + chain + custom registration + error cases).
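
A minimal sketch of how a `Filter` trait with a blanket impl over `Fn` lets closures act as filters. Everything here is an assumption for illustration: the `Value` enum, the `apply` signature, the `Filters` registry (standing in for `Assumptions::with_filter`), and the exact output format of `money` are not taken from the docengine source.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Num(f64),
    Text(String),
}

pub trait Filter {
    fn apply(&self, input: Value, args: &[String]) -> Result<Value, String>;
}

// Blanket impl: any function or closure with this shape is a Filter.
impl<F> Filter for F
where
    F: Fn(Value, &[String]) -> Result<Value, String>,
{
    fn apply(&self, input: Value, args: &[String]) -> Result<Value, String> {
        self(input, args)
    }
}

/// Hypothetical registry standing in for the real builder.
#[derive(Default)]
pub struct Filters {
    by_name: HashMap<String, Box<dyn Filter>>,
}

impl Filters {
    pub fn with_filter(mut self, name: &str, filter: impl Filter + 'static) -> Self {
        self.by_name.insert(name.to_string(), Box::new(filter));
        self
    }

    /// Apply a parsed chain such as `round(2) | money`, left to right.
    pub fn run(&self, mut value: Value, chain: &[(&str, Vec<String>)]) -> Result<Value, String> {
        for (name, args) in chain {
            let filter = self
                .by_name
                .get(*name)
                .ok_or_else(|| format!("unknown filter: {name}"))?;
            value = filter.apply(value, args)?;
        }
        Ok(value)
    }
}

/// `round` / `round(n)`: round to n decimal places (default 0).
pub fn round(input: Value, args: &[String]) -> Result<Value, String> {
    let places = match args.first() {
        Some(a) => a.parse::<i32>().map_err(|_| format!("round: bad precision {a:?}"))?,
        None => 0,
    };
    match input {
        Value::Num(n) => {
            let scale = 10f64.powi(places);
            Ok(Value::Num((n * scale).round() / scale))
        }
        other => Err(format!("round: expected a number, got {other:?}")),
    }
}

/// `money` / `money("€")`: prefix a currency symbol (two decimals assumed).
pub fn money(input: Value, args: &[String]) -> Result<Value, String> {
    let symbol = args.first().map(String::as_str).unwrap_or("$");
    match input {
        Value::Num(n) => Ok(Value::Text(format!("{symbol}{n:.2}"))),
        other => Err(format!("money: expected a number, got {other:?}")),
    }
}
```

With this shape, `Filters::default().with_filter("round", round)` and a closure such as `|v: Value, _args: &[String]| -> Result<Value, String> { Ok(v) }` register the same way, which is the point of the blanket impl: no wrapper struct is needed.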
39 -
40 - ### Assumption substitution v4 — control flow
41 -
42 - Loops and conditionals so a single tier-comparison block is the source for every page that shows tier prices:
43 -
44 - ```
45 - {% for tier in tiers.standard %}
46 - - **{{ tier.name }}**: ${{ tier.price }}/mo
47 - {% endfor %}
48 - ```
49 -
50 - Hand-rolled parser/evaluator over the v3 AST. Grammar limited to `for`/`if`/`endif`/`endfor` + boolean ops on serde values. Not a Jinja port; just what the docs need.
51 -
52 - ### Full-text search
53 -
54 - In-memory index over the `DocLoader` page store. Tokenizer + posting list, no external deps. Expose `DocLoader::search(query) -> Vec<SearchHit>` with snippet extraction. Defer until corpus growth justifies it.
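
One possible shape for the tokenizer and posting list, as a dependency-free sketch. `SearchIndex`, `build`, and the AND-only query semantics are assumptions made for illustration; the proposed `DocLoader::search` and `SearchHit` with snippet extraction are not modeled here.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical in-memory inverted index over (slug, body) pages.
pub struct SearchIndex {
    slugs: Vec<String>,
    postings: HashMap<String, BTreeSet<usize>>,
}

/// Lowercase alphanumeric tokens; everything else is a separator.
fn tokenize(text: &str) -> impl Iterator<Item = String> + '_ {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
}

impl SearchIndex {
    pub fn build(pages: &[(&str, &str)]) -> Self {
        let mut slugs = Vec::with_capacity(pages.len());
        let mut postings: HashMap<String, BTreeSet<usize>> = HashMap::new();
        for (id, (slug, body)) in pages.iter().enumerate() {
            slugs.push(slug.to_string());
            for token in tokenize(body) {
                // Posting list: token -> set of page ids containing it.
                postings.entry(token).or_default().insert(id);
            }
        }
        Self { slugs, postings }
    }

    /// Return slugs of pages containing every query token (AND semantics),
    /// in the order the pages were indexed.
    pub fn search(&self, query: &str) -> Vec<&str> {
        let mut result: Option<BTreeSet<usize>> = None;
        for token in tokenize(query) {
            let ids = self.postings.get(&token).cloned().unwrap_or_default();
            result = Some(match result {
                None => ids,
                Some(acc) => acc.intersection(&ids).copied().collect(),
            });
        }
        result
            .unwrap_or_default()
            .into_iter()
            .map(|id| self.slugs[id].as_str())
            .collect()
    }
}
```

Ranking, stemming, and snippets are deliberately omitted; a posting list that also stored byte offsets would be one way to support snippet extraction later.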
55 -
56 - ### Versioned docs
57 -
58 - Per-page `version` frontmatter + a `DocLoader` that loads multiple versions and routes by URL prefix (`/docs/v0.5/...`). Defer until we have a breaking change worth documenting historically.
59 -
60 - ## Quality
61 -
62 - ### Close mutation testing gaps (target ≥90% kill rate)
63 -
64 - Currently 77%, 43 missed (per `_meta/remediation_todo.md` § C1.x):
65 -
66 - - **`directives.rs` (14 missed):** `process_ui_examples` arithmetic boundaries (`+`/`-`/`*` on offsets), `strip_html_tags_simple` match arms for `<`/`>`/`!in_tag`, `process_alerts` `||` and `<=` mutations. Each needs a precise input that exercises only that branch.
67 - - **`doc_loader.rs` (12 missed):** `DocLoader::get`/`index`/`search_index`/`load`/`resolve_ui_examples`/`strip_html_tags` have no direct unit tests. Add tempdir-based fixtures (write markdown to a temp dir, instantiate `DocLoader`, assert).
68 - - **`code_spans.rs` (8 missed):** arithmetic in offset tracking inside `strip_code_spans` / `code_span_ranges` — `+=`/`+`/`*`/`<`. Need precise tests on exact byte ranges and stripped output, not just structural shape.
69 - - **`render.rs` (2 missed):** `has_dangerous_scheme` `||` boundary needs a URL where the `?`-vs-`#` path matters; `Renderer::render_raw` match guard `strip_raw_html` — toggle the builder and assert both raw-html-stripped and not-stripped cases.
70 - - **`toc.rs` (2 missed):** match guard `in_heading.is_some()` — needs a test constructing the rare nested-heading state.
71 - - **`rewrite_links` `||` chain (L244):** inner `mailto:` / `/` arms may still be uncovered.
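
To make "a precise input that exercises only that branch" concrete, here is a hypothetical reconstruction of a tag stripper with `<` / `>` / `!in_tag` match arms. It is not the crate's actual `strip_html_tags_simple`; the name `strip_tags_sketch` and the exact arm order are assumptions.

```rust
/// Hypothetical tag stripper with one match arm per state transition.
/// Each arm is a separate mutation target, so each needs its own input.
pub fn strip_tags_sketch(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut in_tag = false;
    for ch in input.chars() {
        match ch {
            // Arm 1: any `<` opens (or keeps open) a tag.
            '<' => in_tag = true,
            // Arm 2: `>` closes a tag only if one is open.
            '>' if in_tag => in_tag = false,
            // Arm 3: text outside tags is kept, including a stray `>`.
            _ if !in_tag => out.push(ch),
            // Arm 4: text inside a tag is dropped.
            _ => {}
        }
    }
    out
}
```

Against this sketch, `"a > b"` distinguishes the guarded `>` arm from an unguarded one, and `"a<b"` pins down the unterminated-tag case; a test that only checks `"<b>x</b>"` would let both mutants survive.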
72 -
73 - ### Benchmarks
74 -
75 - No `criterion` benches exist. DocEngine is on the MNW per-request render path. Add benches for:
76 -
77 - 1. `Renderer::render` across the four presets on a representative description (≤2 KB).
78 - 2. `Renderer::render` on a large doc page (~20 KB blog post with directives + images).
79 - 3. `DocLoader::load` startup on the current `server/site-docs/` corpus.
80 - 4. `post_process_directives` and `post_process_quotes` separately.
81 -
82 - Establish baseline before any perf refactor (`Arc<str>` for repeated fields, parser reuse, etc.). The broader "no benches in MNW server" gap is tracked separately in `MNW/server/docs/todo.md`.
83 -
84 - ## Deferred / nice-to-have
85 -
86 - - Cached render: same `(preset, input_hash)` → reuse HTML. Only worth it if benches show render is hot.
87 - - Per-preset feature gates for `directives`/`mentions`/`quotes` so consumers can't accidentally post-process strict-preset output through a permissive pipeline.
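
If benches ever justify the cached-render item, a minimal version might look like the following. Everything here is an assumption for illustration: the `Preset` variants, `RenderCache`, and `get_or_render` are hypothetical, and a 64-bit hash key accepts a small collision risk that a production cache might not.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical preset names; the real crate has four presets not named here.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub enum Preset {
    Strict,
    Permissive,
}

/// Cache keyed by `(preset, input_hash)`, as the deferred item describes.
#[derive(Default)]
pub struct RenderCache {
    entries: HashMap<(Preset, u64), String>,
    pub misses: usize,
}

impl RenderCache {
    /// Return cached HTML for this preset and input, rendering on a miss.
    pub fn get_or_render(
        &mut self,
        preset: Preset,
        input: &str,
        render: impl FnOnce(&str) -> String,
    ) -> &str {
        let mut hasher = DefaultHasher::new();
        input.hash(&mut hasher);
        let key = (preset, hasher.finish());
        if !self.entries.contains_key(&key) {
            self.misses += 1;
            let html = render(input);
            self.entries.insert(key, html);
        }
        &self.entries[&key]
    }
}
```

Including the preset in the key matters because the same input renders to different HTML under different presets.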
@@ -1,10 +0,0 @@
1 - # s3-storage — Todo
2 -
3 - Done: Core implementation. Active: None. Next: None planned.
4 -
5 - v0.1.0. No tests (wrapper crate, tested via MNW/MT integration tests).
6 -
7 - ## Deferred
8 -
9 - - [ ] Multipart upload support for large files
10 - - [ ] List objects / pagination
@@ -1,109 +0,0 @@
1 - # SyncKit Client SDK -- Audit History
2 -
3 - See [audit_review.md](./audit_review.md) for current scorecard and grades.
4 -
5 - ## Changes Since Last Audit
6 -
7 - ### Tenth audit (2026-04-30, Run 17 cross-project)
8 - - **Test count:** 340 (241 unit + 99 integration). 0 clippy warnings. 0 failures.
9 - - **Grade:** A (maintained). v0.3.1. ~5,945 LOC (src) + 2,742 (integration) = 8,687 total.
10 - - **Module heatmap updated** with per-file line counts. Client split into 7 submodules (mod.rs, helpers.rs, auth.rs, sync.rs, encryption.rs, blob.rs, subscribe.rs). conflict.rs at 950 LOC is the second largest module after crypto.rs (1,049 LOC).
11 - - **New cold spots:** (1) sync.rs pull duplication -- key-extraction + decrypt pattern copy-pasted across 5 pull variants (lines 112-288), a private `do_pull_inner` could eliminate ~100 LOC. (2) `fake_jwt` test helper defined identically in helpers.rs and auth.rs tests. (3) rand 0.8 not latest (0.9.x available, compatible with chacha20poly1305 0.10).
12 - - **Mandatory surprise:** `resolve_lww` semantic asymmetry -- DELETE always beats non-DELETE regardless of timestamp. Remote Delete kills local Insert even if Insert is newer. Convergent and safe, but potentially surprising. Documented in test at line 847.
13 - - **Previous items unchanged:** subscribe.rs:107 unwrap_or_default still present, rustls-webpki vulns still upstream-blocked, no key rotation mechanism still open.
14 -
15 - ### Seventh audit (2026-03-28, Run 12 cross-project)
16 - - **Test count:** 297 (197 unit + 99 integration + 1 doctest). 0 clippy warnings. 0 failures.
17 - - **Grade:** A (maintained). v0.3.0.
18 - - **No code changes since Run 9.**
19 - - **New dependency advisory:** rustls-webpki 0.103.9 (RUSTSEC-2026-0049) — upgrade to 0.103.10 via `cargo update -p rustls-webpki`.
20 - - **Mandatory surprise:** None new. Previous surprise (fresh random Argon2 salt per wrap) still valid and impressive.
21 - - **No new findings.** All previous items remain resolved.
22 -
23 - ### Rust Patterns Audit (2026-03-21)
24 - - `SessionInfo.token` changed from `String` to `Arc<String>` -- `Arc::clone` instead of String clone
25 - - Auth request structs already use `&'a str` -- confirmed optimal, no change needed
26 -
27 - ### Sixth audit (2026-03-18, Run 9 cross-project)
28 - - **Test count:** 298 (197 unit + 99 integration + 1 doctest). 0 clippy warnings.
29 - - **Grade:** A (maintained). v0.3.0.
30 - - **No new findings.** All previous items remain resolved.
31 - - **Crypto audit notes:** XChaCha20-Poly1305, Argon2id with OWASP minimums, ZeroizeOnDrop keys, NFC normalization. 100+ crypto-specific tests.
32 - - **No sensitive data in logs:** Confirmed — tracing calls log events (e.g., "Master key generated") without leaking key material or passwords.
33 - - **Mandatory surprise:** Argon2 salt uniqueness — every wrap generates fresh random salt (crypto.rs:130-131). Verified by `two_wraps_use_different_salts` test. Correct design, uncommon rigor.
34 -
35 - ### Concurrency Upgrade (2026-03-13)
36 - - **Concurrency:** B+ -> A-
37 - - Replaced std::sync::RwLock with parking_lot::RwLock. Removed 16 poison-handling .map_err() sites. All 234 tests pass.
38 -
39 - ### Second audit (2026-03-13, pre-launch skeptical lens)
40 - - **Grade:** B+ (maintained). S4 fixes resolved 4/6 first-audit issues. New critical finding: blob data not encrypted.
41 - - **Test count:** 13 -> 109 (+96 tests, mostly from S4 remediation)
42 - - **S4 fixes:** await_holding_lock, HTTP timeouts, retry with backoff, token expiry detection, client.rs tests (66), keystore.rs tests (18), ChangeOp enum
43 - - **New findings:** Blob encryption gap (CRITICAL), no key rotation, Mutex unwraps, master key copies not zeroized, public types that should be pub(crate)
44 - - **Deterministic Argon2 salt:** Persists from first audit (tracked in MNW todo as "consider random salt")
45 -
46 - ### Post-audit remediation (2026-03-13)
47 - - **Grade:** B+ -> A-. 5 of 6 new findings from second audit resolved. Only key rotation deferred.
48 - - **Test count:** 109 -> 118 (+9 tests: 7 blob encrypt/decrypt, 2 salt tests)
49 - - **Blob encryption:** encrypt_bytes/decrypt_bytes in crypto.rs. blob_upload/blob_download encrypt/decrypt transparently using master key. 40-byte overhead (24 nonce + 16 tag).
50 - - **Random Argon2 salt:** wrap_master_key generates random 32-byte salt per operation. unwrap_master_key reads salt from envelope. Eliminates deterministic salt precomputation risk.
51 - - **Previous S4 fixes verified:** Mutex unwraps, ZeroizeOnDrop, pub(crate) restrictions -- all still in place.
52 - - **Key rotation:** Deferred post-beta. Requires server-side re-encryption of all sync_log entries.
53 - - Documentation upgraded to A: Device/SyncStatus/ChangeEntry/BlobUploadUrlResponse field docs added. All 12 error variants documented with when-they-occur. Keystore platform behavior documented (macOS/Linux/Windows backends). Client helpers documented (require_token, require_session_ids, etc). SessionInfo field docs. client.rs module doc expanded to 50 lines. architecture.md created (217 lines), README created (78 lines).
54 -
55 - ### Observability Upgrade (2026-03-13)
56 - - Added Observability dimension to scorecard (grade A)
57 - - Added 16 `#[instrument]` annotations to all pub async methods in client.rs with appropriate skip params
58 - - Sensitive params skipped: password, old_password, new_password, email, code, code_verifier, presigned_url, data
59 - - `use tracing::instrument;` import added to client.rs
60 - - `cargo check` passes clean
61 -
62 - ### Performance Upgrade (2026-03-13)
63 - - **Performance:** B -> A-
64 - - Cached JWT `exp` claim in Session struct — `require_token()` and `is_token_expired()` no longer re-parse the JWT on every call
65 - - Retry request bodies use `bytes::Bytes` instead of `Vec<u8>` — clone in retry closures is O(1) refcount bump, not O(n) copy (10 sites)
66 - - Batch encrypt/decrypt in `push()`/`pull()` extracts master key once before the loop instead of per-entry lock acquisition
67 -
68 - ### Resilience Upgrade (2026-03-13)
69 - - **Resilience:** B- -> A-
70 - - Added 9 integration tests for encryption setup flows: `setup_encryption_new` (happy path, no-auth, server retry), `setup_encryption_existing` (happy path, wrong password, no-auth, server retry, missing key 404), cross-device encryption roundtrip
71 - - Cross-device test proves full two-device flow: device 1 creates encryption → pushes data → device 2 recovers key → pulls and decrypts successfully
72 - - **Test count:** 234 -> 243 (+9 integration tests). 170 unit + 72 integration + 1 doctest.
73 -
74 - ### Adversarial Test Audit (2026-03-13)
75 - - **Grade:** A- (maintained). Testing grade upgraded from B- to A.
76 - - **Test count:** 150 -> 234 (+84 tests: 52 unit, 32 integration). Test density ~94 tests/KLOC.
77 - - **CRITICAL fix: change_password bypass** -- Old password verification skipped when master key was cached in memory. Attacker with session access (stolen device, malware) could change encryption password without knowing the old one. Fixed: always verify old password against server envelope regardless of cache state. Added 8 tests covering cache hit/miss, wrong old password, concurrent password changes.
78 - - **HIGH fix: Unicode password normalization** -- NFC vs NFD normalization inconsistency across operating systems could derive different keys from "same" password (e.g., é as single codepoint vs e+combining-acute). Added `unicode-normalization` crate, NFC normalization before all key derivation (wrap_master_key, unwrap_master_key, change_password). 4 tests covering NFC/NFD/mixed inputs.
79 - - **Empty password rejection** -- wrap_master_key, unwrap_master_key, change_password now return error on empty password. 3 tests.
80 - - **Password length limits** -- 1024-byte max after UTF-8 encoding. Bounds the per-call work spent hashing caller-supplied input before key derivation (Argon2's memory cost is set by its parameters, not by input length). 2 tests.
81 - - **Comprehensive crypto tests** -- Tamper detection (flip bits in nonce/ciphertext/tag), envelope validation (version mismatch, truncated fields), key rotation simulation (decrypt with wrong key), concurrent encryption (nonce uniqueness under load), large payload handling (1MB encrypt/decrypt). 28 new crypto unit tests.
82 - - **Integration tests** -- Error mapping for all 4xx/5xx codes (400/401/403/404/409/413/429/500/502/503), retry behavior (transient vs permanent), auth enforcement (missing token, invalid token, expired token), blob roundtrips (upload -> download, tamper detection, decrypt failure), malformed response handling (invalid JSON, missing fields). 32 new integration tests.
83 - - **Concurrency tests** -- Parallel encrypt operations (nonce uniqueness), concurrent password changes (last-write-wins, cache invalidation), device registration race (409 conflict), push/pull interleaving (optimistic locking). 9 tests across unit and integration.
84 - - **Resolved findings:** Both vulnerabilities from the adversarial audit (1 CRITICAL, 1 HIGH) fixed. No new security issues discovered.
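
The empty-password and length checks listed above might be expressed as a small pre-derivation guard. This is a sketch with assumed names (`validate_password`, `PasswordError`, `MAX_PASSWORD_BYTES`); where the check sits relative to NFC normalization in the real code is not stated in this document.

```rust
/// Limit taken from the audit note: 1024 bytes after UTF-8 encoding.
pub const MAX_PASSWORD_BYTES: usize = 1024;

/// Hypothetical error type for the two rejection cases.
#[derive(Debug, PartialEq, Eq)]
pub enum PasswordError {
    Empty,
    TooLong { bytes: usize },
}

/// Reject empty and oversized passwords before any key derivation runs.
/// `str::len` counts UTF-8 bytes, not characters, which is what the limit needs.
pub fn validate_password(password: &str) -> Result<&str, PasswordError> {
    if password.is_empty() {
        return Err(PasswordError::Empty);
    }
    let bytes = password.len();
    if bytes > MAX_PASSWORD_BYTES {
        return Err(PasswordError::TooLong { bytes });
    }
    Ok(password)
}
```

Measuring bytes rather than characters means 512 two-byte characters are accepted while 513 are rejected, a boundary worth testing explicitly.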
85 -
86 - ### Third audit (2026-03-16, Run 6 cross-project)
87 - - **Test count:** 297 (unchanged)
88 - - **Grade:** A (maintained).
89 - - **Source LOC:** 4,327 src + 2,749 test
90 - - **New finding (MEDIUM):** Wrapping key in `crypto.rs:99` (`derive_wrapping_key`) is computed on the stack but not wrapped in `ZeroizeOnDrop`. Intermediate key material sits in memory after function returns. Other keys properly use ZeroizeOnDrop.
91 - - **New finding (LOW):** Unused `sha2` dependency in Cargo.toml.
92 - - **Mandatory surprise:** Wrapping key not zeroized — Genuine issue (MEDIUM).
93 - - **Previous items verified:** All previous remediated items confirmed intact. Key rotation still deferred (post-beta).
94 -
95 - ### Testing Push (2026-03-13)
96 - - **Grade:** A- -> A. Testing A -> A+. Code Quality, Type Safety, Concurrency, Resilience all upgraded to A.
97 - - **Test count:** 243 -> 297 (+54 tests). 197 unit + 99 integration + 1 doctest.
98 - - **types.rs:** 13 unit tests added. Serde roundtrip, Display/serde consistency, from_str_opt edge cases, Copy/Hash trait verification, skip_serializing_if, extra unknown fields tolerance, ISO timestamp deserialization.
99 - - **error.rs:** 10 unit tests added. Send+Sync compile-time assert, Display for all 8 variants, Debug no-panic, source() chain verification (Json, Base64 have source; leaf variants do not), empty/very-long server messages.
100 - - **client.rs:** 4 unit tests added. SyncKitClient Send+Sync compile-time assert, with_http_client constructor, unicode table name encrypt/decrypt roundtrip, empty row_id roundtrip.
101 - - **Integration tests:** 27 new. Retry count verification (exhaustion at 4 requests, 404 not retried, 3rd-attempt success). Malformed responses (HTML body, empty body, missing has_more, wrong cursor type, missing app_id, missing already_exists, 413 error, extra fields ignored). Session edge cases (double authenticate, clear then re-auth, expired token on restore). Encryption setup overwrite. Blob edge cases (confirm retry, download retry, 1MB upload overhead). Device edge cases (empty name, unicode name, empty list). Concurrency stress (50 concurrent session_info reads, 50 has_master_key reads, 20 status checks, 4x100-entry pushes). Timeout tests (slow server timeout, retry after timeout).
102 - - **New constructor:** `with_http_client(config, client)` enables custom timeout testing without modifying production defaults.
103 -
104 - ### Performance Upgrade (2026-03-13)
105 - - **Performance:** A- -> A
106 - - Pre-built endpoint URLs: new `Endpoints` struct computes all 10 API endpoint URLs once at client construction, eliminating per-request `format!()` string allocations
107 - - `Arc<String>` session token: `require_token()` returns `Arc<String>` instead of `String`, making per-request token extraction O(1) refcount bump instead of O(n) string clone (~300-500 byte JWT)
108 - - `key_url_and_token()` returns `(&str, Arc<String>)` instead of `(String, String)`, zero allocations per call
109 - - All 297 tests pass unchanged (2 test assertions updated for Arc deref)
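
The `Arc<String>` token change can be illustrated with a simplified session type. This is a sketch, not the SDK's code: `Session` here is hypothetical, and the real `require_token()` presumably also handles the missing-token and expiry cases.

```rust
use std::sync::Arc;

/// Simplified stand-in for the SDK's session state.
pub struct Session {
    token: Arc<String>,
}

impl Session {
    pub fn new(token: &str) -> Self {
        Session {
            token: Arc::new(token.to_string()),
        }
    }

    /// Hands out a shared handle to the token. `Arc::clone` bumps a
    /// reference count instead of copying the JWT bytes.
    pub fn require_token(&self) -> Arc<String> {
        Arc::clone(&self.token)
    }
}
```

Every handle points at the same allocation, so per-request token extraction costs the same regardless of token length.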
@@ -1,123 +0,0 @@
1 - # SyncKit Client SDK Audit Review
2 -
3 - - Last audited: 2026-04-30 (tenth audit, Run 17 cross-project)
4 - - Previous audit: 2026-04-18 (ninth audit, Run 15 cross-project)
5 - - Crate: `synckit-client` v0.3.1
6 - - Path: `MNW/shared/synckit-client/`
7 -
8 - ## Overall Grade: A
9 -
10 - Run 17: 340 tests (241 unit + 99 integration). 0 clippy warnings. v0.3.1. ~5,945 LOC (src) + 2,742 (integration) = 8,687 total. Grade stable at A. Minor issues only: subscribe.rs:107 unwrap_or_default masks server errors, rustls-webpki vulns (upstream-blocked), no key rotation mechanism.
11 -
12 - ## Scorecard
13 -
14 - | Dimension | Grade | Notes |
15 - |-----------|-------|-------|
16 - | Code Quality | A | Zero unwraps in production code (1 expect on infallible Client::build) |
17 - | Architecture | A | Clean module boundaries. Public API minimal. Wire types pub(crate). |
18 - | Testing | A | 340 tests for 5,945 LOC = 57.2 tests/KLOC. 54 crypto tests alone. |
19 - | Security | A+ | XChaCha20-Poly1305, Argon2id (OWASP params), NFC normalization, zeroize, envelope versioning |
20 - | Performance | A- | Pre-computed endpoints. Minor: 5x duplicated pull pattern. |
21 - | Documentation | A | Module-level docs on every file. Crypto rationale documented. |
22 - | Dependencies | A- | All deps latest stable major. rand 0.8 not latest (0.9 exists). |
23 - | Type Safety | A | Proper enums, ZeroizeOnDrop wrapper, pub(crate) on wire types. |
24 - | Concurrency | A | parking_lot::RwLock, never held across .await. Send+Sync asserted. |
25 - | Resilience | A | 3-retry exponential backoff, Retry-After support. Token expiry pre-flight. SSE buffer bounded. |
26 - | Codebase Size | A | 5,945 LOC lean for feature set. Pull variants refactored to shared `pull_inner` helper (2026-05-01). |
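
As a rough illustration of the Resilience row, a delay schedule for 3 retries with exponential backoff and `Retry-After` support could look like this. The base delay, the cap, and the name `retry_delay` are assumptions; the document only states the retry count and that `Retry-After` is honored.

```rust
use std::time::Duration;

/// Retry budget from the scorecard: 3 retries, so 4 requests in total.
pub const MAX_RETRIES: u32 = 3;
/// Assumed values, chosen only for illustration.
pub const BASE_DELAY: Duration = Duration::from_millis(200);
pub const MAX_DELAY: Duration = Duration::from_secs(10);

/// Delay before the next retry, or `None` once the budget is spent.
/// `attempt` is the number of retries already made (0 for the first retry).
/// A server-provided `Retry-After` overrides the computed backoff, and the
/// result is capped at `MAX_DELAY` either way.
pub fn retry_delay(attempt: u32, retry_after: Option<Duration>) -> Option<Duration> {
    if attempt >= MAX_RETRIES {
        return None;
    }
    let backoff = BASE_DELAY * 2u32.pow(attempt);
    Some(retry_after.unwrap_or(backoff).min(MAX_DELAY))
}
```

Capping a server-supplied `Retry-After` is a design choice in this sketch; a client could equally decide to fail fast when the server asks for a long wait.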
27 -
28 - ## Module Heatmap
29 -
30 - | Module | File | LOC | Tests | Grade | Notes |
31 - |--------|------|-----|-------|-------|-------|
32 - | crypto | `src/crypto.rs` | 1,049 | 54 | A+ | Excellent coverage. Roundtrip, wrong-key, version, truncation, uniqueness, zeroize. Binary blob encrypt/decrypt. Random salt wrapping. |
33 - | client/mod | `src/client/mod.rs` | 560 | 28 | A | Auth state, encrypt/decrypt roundtrip, type serialization, error classification, token expiry, OAuth URL construction. Send+Sync assert. |
34 - | client/helpers | `src/client/helpers.rs` | 776 | 33 | A | Pre-built endpoints, token management, session helpers. |
35 - | client/auth | `src/client/auth.rs` | 428 | 16 | A | OAuth flows, token refresh, session management. |
36 - | client/sync | `src/client/sync.rs` | 414 | 18 | A | Push/pull operations. Pull duplication resolved via `pull_inner` helper. |
37 - | client/encryption | `src/client/encryption.rs` | 233 | 4 | A | Encryption setup, key wrapping, password change. |
38 - | client/blob | `src/client/blob.rs` | 181 | 3 | A- | Blob upload/download with E2E encryption. |
39 - | client/subscribe | `src/client/subscribe.rs` | 166 | 4 | A- | SSE subscription. unwrap_or_default at line 107 masks server errors. |
40 - | conflict | `src/conflict.rs` | 950 | 36 | A | LWW conflict resolution. Semantic asymmetry: DELETE beats non-DELETE regardless of timestamp. |
41 - | types | `src/types.rs` | 500 | 17 | A | Serde roundtrip, Display/serde consistency, from_str_opt rejection, Copy/Hash traits, skip_serializing_if, extra field tolerance, ISO timestamp parsing. |
42 - | error | `src/error.rs` | 193 | 10 | A | Send+Sync assert, Display for all variants, Debug no-panic, source() chain. |
43 - | keystore | `src/keystore.rs` | 318 | 18 | A- | Service name construction, base64 roundtrips, length validation, error handling. Platform behavior documented. |
44 -
45 - ## Cold Spots
46 -
47 - 1. ~~**sync.rs pull duplication**~~ -- Fixed 2026-05-01. Extracted `pull_inner` generic helper. 4 public methods now delegate to shared implementation.
48 - 2. **fake_jwt test helper duplication** -- Defined identically in helpers.rs and auth.rs tests.
49 - 3. **rand 0.8** -- 0.9.x available. Not urgent (chacha20poly1305 0.10 compatible).
50 -
51 - ### Carried forward
52 -
53 - 4. **subscribe.rs:107 unwrap_or_default** -- Masks server errors during SSE event parsing. If the server sends a malformed event, the client silently produces a default value instead of propagating the error.
54 - 5. **rustls-webpki vulns** -- Transitive dependency advisories. Blocked on upstream.
55 - 6. **No key rotation mechanism** -- No way to rotate the master key without re-encrypting all data.
56 -
57 - ## Strengths
58 -
59 - - **Crypto module is excellent** -- XChaCha20-Poly1305 with 192-bit random nonces makes nonce collisions negligible in practice. Argon2id with OWASP-minimum parameters. Random salt per operation.
60 - - **Minimal API surface** -- 8 types re-exported from lib.rs. Wire-only types are `pub(crate)`. Internal helpers are private.
61 - - **Clean key hierarchy** -- Three layers (password -> wrapping key -> master key -> per-entry encryption) well-documented and correctly implemented.
62 - - **No consumer-specific logic** -- Fully generic. No references to GO, BB, or AF. Table names and data shapes are opaque to the SDK.
63 - - **ZeroizeOnDrop** -- In-memory keys are zeroed on drop via volatile writes.
64 - - **Comprehensive test suite** -- 340 tests at ~57 tests/KLOC. Coverage across all modules including adversarial inputs.
65 -
66 - ## Weaknesses
67 -
68 - - **No key rotation mechanism** -- No way to rotate the master key without re-encrypting all data.
69 - - **subscribe.rs:107 unwrap_or_default** -- Masks server errors during SSE event parsing.
70 - - **rustls-webpki transitive vulns** -- Blocked on upstream.
71 -
72 - ## Action Items
73 -
74 - ### Previous (carried forward)
75 - - subscribe.rs:107 unwrap_or_default: Still present (unchanged)
76 - - rustls-webpki vulns: Still upstream-blocked
77 - - No key rotation mechanism: Still open (architectural decision)
78 -
79 - ### New (Run 17)
80 - - ~~[LOW] Refactor pull variants in sync.rs to share a private `pull_inner` helper~~ -- Done 2026-05-01
81 - - [LOW] Extract `fake_jwt` test helper to shared `#[cfg(test)]` module
82 - - [LOW] Track rand 0.9 upgrade (currently compatible with chacha20poly1305 0.10)
83 -
84 - ### Resolved (previous audits)
85 - - ~~**CRITICAL: Blob data NOT encrypted**~~ -- RESOLVED
86 - - ~~**Deterministic Argon2 salt**~~ -- RESOLVED
87 - - ~~**13 Mutex `.unwrap()` calls**~~ -- RESOLVED
88 - - ~~**Master key copies not zeroized**~~ -- RESOLVED
89 - - ~~**Several public types that should be `pub(crate)`**~~ -- RESOLVED
90 - - ~~**client.rs untested**~~ -- RESOLVED
91 - - ~~**No resilience**~~ -- RESOLVED
92 - - ~~**change_password concurrency bug**~~ -- RESOLVED
93 - - ~~**op field raw String**~~ -- RESOLVED
94 -
95 - ## Mandatory Surprise
96 -
97 - **The `resolve_lww` function has a deliberate semantic asymmetry: DELETE always beats non-DELETE regardless of timestamp.** A remote Delete can kill a local Insert even if the Insert is newer. This is convergent and safe (both peers agree on the outcome), but potentially surprising to users who expect timestamp to be the sole tiebreaker. Documented in test at line 847 (`lww_remote_delete_beats_local_insert`).
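
The asymmetry can be sketched as a small resolver. This is an illustrative reconstruction, not the code in `conflict.rs`: the `Change` and `Winner` types, the `device_id` tiebreak, and the name `resolve_lww_sketch` are assumptions made so the example stays convergent and self-contained.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ChangeOp {
    Insert,
    Update,
    Delete,
}

/// Hypothetical change record; `device_id` exists only to break timestamp ties.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Change {
    pub op: ChangeOp,
    pub timestamp_ms: u64,
    pub device_id: u32,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Winner {
    Local,
    Remote,
}

/// Plain last-writer-wins on (timestamp, device_id).
fn newer(local: &Change, remote: &Change) -> Winner {
    if (remote.timestamp_ms, remote.device_id) > (local.timestamp_ms, local.device_id) {
        Winner::Remote
    } else {
        Winner::Local
    }
}

/// DELETE beats any non-DELETE regardless of timestamp; otherwise fall back
/// to last-writer-wins.
pub fn resolve_lww_sketch(local: &Change, remote: &Change) -> Winner {
    match (local.op, remote.op) {
        (ChangeOp::Delete, ChangeOp::Delete) => newer(local, remote),
        (ChangeOp::Delete, _) => Winner::Local,
        (_, ChangeOp::Delete) => Winner::Remote,
        _ => newer(local, remote),
    }
}
```

Because the delete rule ignores which side is "local", both peers pick the same change when they evaluate the same pair from opposite sides, which is what makes the outcome convergent.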
98 -
99 - ### Previous Surprises
100 -
101 - - **subscribe.rs:107 unwrap_or_default masks server errors** -- Still present. SSE event handler silently produces default on malformed JSON. Low severity (E2E encryption catches tampering).
102 - - **change_password race + deterministic salt** -- Both resolved. Lock guard dropped before await. Random salt per operation.
103 -
104 - ## Metrics Over Time
105 -
106 - | Date | LOC | Files | Tests | Tests/KLOC | Clippy | Expects | Grade |
107 - |------|-----|-------|-------|------------|--------|---------|-------|
108 - | 2026-03-11 | 1,416 | 6 | 13 | 9.2 | -- | -- | -- |
109 - | 2026-03-13 | ~1.4K | 6 | 109 | ~77 | -- | -- | -- |
110 - | 2026-03-13 (post-fix) | ~1.5K | 6 | 118 | ~79 | -- | -- | -- |
111 - | 2026-03-13 (adversarial) | ~2.5K | 6 | 234 | ~94 | -- | -- | -- |
112 - | 2026-03-13 (perf+resilience) | ~2.6K | 6 | 243 | ~94 | -- | -- | -- |
113 - | 2026-03-13 (testing push) | ~2.8K | 6 | 297 | ~106 | -- | -- | -- |
114 - | 2026-03-16 (Run 6) | 4,327 | 6 | 297 | ~69 | -- | -- | A |
115 - | 2026-03-18 (Run 9) | 4,327 | 6 | 298 | ~69 | -- | -- | A |
116 - | 2026-03-28 (Run 12) | 4,327 | 6 | 297 | ~69 | -- | -- | A |
117 - | 2026-04-15 (Run 14) | ~5,426 | -- | 327 | ~60 | -- | -- | A |
118 - | 2026-04-18 (Run 15) | ~5,426 | -- | 327 | ~60 | -- | -- | A |
119 - | 2026-04-30 (Run 17) | ~5,945 | -- | 340 | ~57 | 0 | 1 | A |
120 -
121 - ---
122 -
123 - See [audit_history.md](./audit_history.md) for full chronological audit log.
@@ -1,142 +0,0 @@
1 - # SyncKit Client SDK -- Competitive Analysis
2 -
3 - Last updated: 2026-04-10
4 -
5 - ## Positioning
6 -
7 - SyncKit is an E2E encrypted, Rust-native, offline-first sync SDK for indie desktop and mobile apps. The server (hosted on MNW) stores only encrypted blobs -- zero-knowledge by design. Bundled with OTA updates and device management. Consumers: GoingsOn, Balanced Breakfast, audiofiles (all Tauri apps).
8 -
9 - The key differentiators are server-zero-knowledge encryption (XChaCha20-Poly1305 + Argon2id, keys never leave the device), opaque-blob storage (bring-your-own-schema, no server-side migrations), and the bundled OTA + device management layer that no sync competitor offers. Pricing is bundled with MNW creator tiers ($10-60/mo), not per-read/write metered.
10 -
11 - ## Pricing Comparison
12 -
13 - | Tool | Price | Model |
14 - |------|-------|-------|
15 - | **SyncKit** | $10-60/mo (bundled) | Included in MNW creator tier |
16 - | Firebase Firestore | Pay-per-use | $0.18/100K reads+writes, $0.26/GB |
17 - | Supabase | $0-$599/mo | Freemium + usage overages |
18 - | PowerSync | $0-$599/mo | Usage-based (GB synced) |
19 - | ElectricSQL | Pay-per-write | $1/M writes, reads free |
20 - | Turso | $0-$417/mo | Storage-based tiers |
21 - | Convex | $0-$25/member/mo | Freemium + usage overages |
22 - | Ditto | Enterprise (custom) | Sales-driven |
23 - | Couchbase Mobile | Enterprise (~25K+ EUR/yr) | License-based |
24 - | Etebase | Free (self-host) | Source-available, hosted beta |
25 -
26 - ## Feature Matrix
27 -
28 - | Feature | SyncKit | Firebase | Supabase | PowerSync | ElectricSQL | Ditto | Etebase |
29 - |---------|:-------:|:--------:|:--------:|:---------:|:-----------:|:-----:|:-------:|
30 - | E2E encrypted | Y | N | N | N | N | N | Y |
31 - | Server-zero-knowledge | Y | N | N | N | N | N | Y |
32 - | Rust SDK (native) | Y | N | N | Alpha | Y | Y | Y |
33 - | Tauri integration | Y | N | N | Alpha | N | N | N |
34 - | Offline-first | Y | Partial | N | Y | Partial | Y | Y |
35 - | Bring-your-own-schema | Y | N | N | N | N | N | Partial |
36 - | OTA updates | Y | N | N | N | N | N | N |
37 - | Device management | Y | N | N | N | N | N | N |
38 - | OS keychain storage | Y | N | N | N | N | N | N |
39 - | Blob/file sync | Y | Y | Y | N | N | N | Y |
40 - | Self-hostable | Y | N | Y | Y | Y | Y | Y |
41 - | Real-time push | N | Y | Y | Y | Y | Y | N |
42 - | P2P sync (no server) | N | N | N | N | N | Y | N |
43 - | CRDT conflict resolution | N | N | N | N | N | Y | N |
44 - | Rich query engine | N | Y | Y | Y | Y | Y | N |
45 -
46 - ## Competitor Deep Dives
47 -
48 - ### 1. Firebase (Google)
49 -
50 - Managed BaaS with Realtime Database and Firestore. Massive ecosystem (Auth, Functions, Hosting, Analytics). Generous free tier. Near-instant real-time push via persistent connections. No native Rust SDK (community crates are server-side only, not offline-capable).
51 -
52 - **What SyncKit lacks:** real-time push subscriptions, multi-platform mobile SDKs, hosted auth, serverless functions, web dashboard. **What Firebase lacks:** E2E encryption, Rust SDK, Tauri support, offline desktop sync, OTA updates, device management, data portability (complete vendor lock-in, no self-hosting).
53 -
54 - ### 2. Supabase
55 -
56 - Open-source Firebase alternative on PostgreSQL. Full SQL power, RLS for access control, self-hostable. Growing ecosystem. Realtime via Postgres CDC. No offline-first without PowerSync add-on.
57 -
58 - **What SyncKit lacks:** SQL query engine, built-in auth, edge functions, web dashboard, large community. **What Supabase lacks:** E2E encryption, offline-first (requires PowerSync add-on), Rust SDK, Tauri support, OTA updates, device management.
59 -
60 - ### 3. PowerSync -- Primary Threat
61 -
62 - Offline-first sync layer between your existing database and client-side SQLite. **Released a Tauri SDK (alpha, March 2026)** built on a Rust SDK. Works with Postgres, MongoDB, MySQL, SQL Server. Self-hostable Open Edition.
63 -
64 - **What SyncKit lacks:** multi-database source support, client-side SQL queries, partial replication (sync rules), larger team and community. **What PowerSync lacks:** E2E encryption (sync service sees all data), OTA updates, device management, blob/file sync, OS keychain integration. Write-path goes directly to your backend -- PowerSync does not handle write conflicts.
65 -
66 - PowerSync is the most direct competitor. If they add encryption, they become serious competition. Their Tauri SDK being alpha-quality is a window.
67 -
68 - ### 4. ElectricSQL
69 -
70 - Postgres CDC engine streaming "shapes" (filtered table subsets) to clients. Read-path only -- writes go through your own API. Open-source (Apache 2.0). Innovative pricing: writes cost money, reads/fan-out are free and unlimited. Rust client available.
71 -
72 - **What SyncKit lacks:** read-path fan-out, per-shape subscriptions, 10-language client support. **What ElectricSQL lacks:** E2E encryption, offline-first writes (no local write queue built in), OTA updates, device management, conflict resolution (your problem), blob sync.
73 -
74 - ### 5. Ditto
75 -
76 - Enterprise P2P sync with Bluetooth/WiFi Direct mesh networking. Rust core. CRDT-based automatic conflict resolution. $82M raised (March 2025). Targets airlines, military, retail.
77 -
78 - **What SyncKit lacks:** P2P mesh sync, CRDT conflict resolution, enterprise support. **What Ditto lacks:** E2E application-layer encryption, indie pricing (enterprise sales only), OTA updates, bring-your-own-schema (CRDTs need structure).
79 -
80 - ### 6. Couchbase Lite + Sync Gateway
81 -
82 - Enterprise mobile database with bidirectional sync. Battle-tested in large deployments. Gained momentum from MongoDB Realm shutdown (Sept 2025). Configurable conflict handlers. P2P sync between Couchbase Lite instances.
83 -
84 - **What SyncKit lacks:** P2P sync, rich on-device query engine, enterprise track record. **What Couchbase lacks:** E2E encryption, indie pricing (~25K EUR/yr), Rust SDK (experimental C bindings only), simplicity (multi-component architecture), OTA updates.
85 -
86 - ### 7. Etebase -- Philosophical Peer
87 -
88 - The only other E2E encrypted sync SDK with a Rust library. Open-source server, self-hostable. SDKs for Rust, JS, Java/Kotlin, Python, C, C#. Used by EteSync (contacts/calendar sync).
89 -
90 - **What SyncKit lacks:** broader language coverage (6 languages vs 1). **What Etebase lacks:** Tauri integration, OTA updates, device management, OS keychain, blob support via presigned URLs, commercial backing, community momentum (very small team, unclear trajectory).
91 -
92 - ### 8. Realm / Atlas Device Sync (MongoDB) -- Shut Down
93 -
94 - End-of-life as of September 30, 2025. MongoDB deprecated all Atlas Device SDKs. Developers displaced into Couchbase, Ditto, PowerSync, and ObjectBox. The shutdown created a significant gap in the offline-first sync market.
95 -
96 - ### 9. Others
97 -
98 - **Turso:** Edge SQLite replication. Read replicas only, writes go to primary. Cheap ($5/mo) but not a multi-device sync solution -- no bidirectional sync, no offline writes.
99 -
100 - **Convex:** Reactive backend with automatic query subscriptions. No offline support (requires internet). Recently open-sourced (BSL, converts to Apache 2.0 after 3 years). Rust client available but secondary to TypeScript.
101 -
102 - **CouchDB/PouchDB:** Document-oriented database with built-in sync protocol. Offline-first, conflict handling via revision trees. No E2E encryption. Mature but aging. JavaScript-focused.
103 -
104 - **Syncthing:** P2P file sync. E2E encrypted, no central server. Designed for folder/file sync, not structured app data. No changelog-based sync, no SDK API, no conflict resolution for structured data.
105 -
106 - **CRDT libraries (Automerge, Yjs, Loro):** Building blocks for conflict-free merge, not sync services. Handle data structure merging; bring-your-own transport/storage/auth. Incompatible with SyncKit's zero-knowledge model (server cannot merge what it cannot read).
107 -
108 - ## What We Offer That Competitors Don't
109 -
110 - - **Server-zero-knowledge** -- the server stores only encrypted blobs, so a server-side breach exposes ciphertext rather than user data. Compliance-friendly (GDPR, NIS2).
111 - - **Bring-your-own-schema** -- table names, row IDs, and data shapes are opaque to the server. No server-side migrations when your app schema changes.
112 - - **Bundled OTA updates** -- Tauri-compatible auto-update protocol. No competitor offers sync + OTA in one SDK.
113 - - **Bundled device management** -- register, list, deregister devices. Track sync state per device.
114 - - **OS keychain integration** -- encryption keys stored in macOS Keychain, Linux Secret Service, or Windows Credential Manager. Key material never touches disk.
115 - - **Minimal blob overhead** -- binary files are encrypted with only 40 bytes of overhead per blob (24-byte nonce + 16-byte auth tag). No base64 expansion.
116 - - **Key zeroization** -- `ZeroizeOnDrop` on all key material. No key residue in memory after use.
117 - - **Flat pricing** -- included in MNW creator tier. No per-read/write metering, no surprise bills.
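
The 40-byte overhead figure in the list above can be checked with a little arithmetic. The following is a minimal sketch, assuming a `nonce || ciphertext || tag` layout where the ciphertext is the same length as the plaintext (true of stream-cipher AEADs; the source does not name the cipher or the wire layout). The function and constant names are hypothetical and are not part of the SyncKit API.

```rust
/// Assumed per-blob framing, taken from the figures quoted above.
const NONCE_LEN: usize = 24;
const TAG_LEN: usize = 16;

/// Size of an encrypted blob under the assumed `nonce || ciphertext || tag` layout.
/// Hypothetical helper, not a SyncKit function.
pub fn encrypted_len(plaintext_len: usize) -> usize {
    NONCE_LEN + plaintext_len + TAG_LEN
}

/// Size the same bytes would occupy if they were base64-encoded with padding:
/// every 3 input bytes (rounded up) become 4 output characters.
pub fn base64_len(raw_len: usize) -> usize {
    raw_len.div_ceil(3) * 4
}

/// Fixed overhead added by encryption, independent of blob size.
pub fn overhead(plaintext_len: usize) -> usize {
    encrypted_len(plaintext_len) - plaintext_len
}
```

For a 1,000,000-byte file this gives 1,000,040 bytes encrypted, versus roughly 1.33 MB if the same bytes were shipped as base64, which is the expansion the bullet says is avoided.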
118 -
119 - ## Market Tailwinds
120 -
121 - - **MongoDB Realm shutdown (Sept 2025)** displaced developers seeking offline-first sync alternatives
122 - - **Tauri adoption growing ~55% YoY**, creating demand for Rust-native backends
123 - - **Regulatory pressure (GDPR, NIS2)** pushing toward E2E encryption and data minimization
124 - - **Local-first movement** gaining mainstream traction (Notion, Linear, Figma adopting offline-first)
125 - - **PowerSync Tauri SDK is alpha** -- their Rust/Tauri story is immature, giving SyncKit a window
126 -
127 - ## Target Users
128 -
129 - - Indie developers building Tauri desktop apps who need cloud sync without running a backend
130 - - Developers who prioritize user privacy and want zero-knowledge sync by default
131 - - Small teams shipping cross-platform apps (macOS/Windows/Linux) that need offline-first data
132 - - Anyone displaced from MongoDB Realm looking for a simpler, encrypted alternative
133 -
134 - ## Gaps and Potential Roadmap Items
135 -
136 - Based on what competitors offer that SyncKit does not:
137 -
138 - - **Real-time push notifications** -- Firebase/Supabase/Convex push changes instantly. SyncKit is pull-based (clients poll). A lightweight SSE channel for "something changed, pull now" would close this gap without compromising E2E encryption (the notification carries no data, just a signal).
139 - - **Selective sync / sync rules** -- PowerSync and ElectricSQL let clients sync subsets of data. SyncKit syncs the full changelog. For apps with large datasets, filtered sync (by device, by date range, by collection) would reduce bandwidth and latency.
140 - - **Conflict resolution helpers** -- Ditto and Couchbase offer configurable merge strategies. SyncKit leaves conflict resolution to the client. A toolkit of common strategies (LWW, field-level merge, custom resolver callback) in the SDK would reduce boilerplate.
141 - - **Web client (WASM)** -- every major competitor has a JavaScript/TypeScript SDK. A WASM-compiled SyncKit client would open the web platform. Low priority (current consumers are all desktop), but relevant if any consumer app ships a web companion.
142 - - **Multi-language SDKs** -- Etebase covers 6 languages, PowerSync covers 10+. SyncKit is Rust-only. A C FFI layer would enable bindings for Swift, Kotlin, Python, and JS. Only worth doing if non-Tauri consumers appear.
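
The conflict resolution helpers named in the list above (LWW, field-level merge, custom resolver callback) could look roughly like the sketch below. Everything here is hypothetical: the `Versioned` type, the per-field timestamps, and the function names are illustrative assumptions, not SyncKit types. The helpers operate on already-decrypted rows on the client, so they would not change what the server can see.

```rust
use std::collections::BTreeMap;

/// Hypothetical decrypted row version; not a SyncKit type.
#[derive(Clone, Debug, PartialEq)]
pub struct Versioned {
    /// Field name -> (value, last-modified timestamp in milliseconds).
    pub fields: BTreeMap<String, (String, u64)>,
    /// Row-level last-modified timestamp in milliseconds.
    pub updated_at: u64,
}

/// Row-level last-writer-wins: the strictly newer row is kept whole.
/// On a timestamp tie the local version is kept.
pub fn lww(local: &Versioned, remote: &Versioned) -> Versioned {
    if remote.updated_at > local.updated_at {
        remote.clone()
    } else {
        local.clone()
    }
}

/// Field-level merge: each field is resolved independently by its own timestamp,
/// so concurrent edits to different fields both survive.
pub fn merge_fields(local: &Versioned, remote: &Versioned) -> Versioned {
    let mut fields = local.fields.clone();
    for (name, (value, ts)) in &remote.fields {
        let remote_is_newer = match fields.get(name) {
            Some((_, local_ts)) => ts > local_ts,
            None => true, // field only exists remotely
        };
        if remote_is_newer {
            fields.insert(name.clone(), (value.clone(), *ts));
        }
    }
    Versioned {
        fields,
        updated_at: local.updated_at.max(remote.updated_at),
    }
}

/// Custom resolver: the application supplies its own merge logic as a closure.
pub fn resolve_with<F>(local: &Versioned, remote: &Versioned, resolver: F) -> Versioned
where
    F: Fn(&Versioned, &Versioned) -> Versioned,
{
    resolver(local, remote)
}
```

A caller would pick one strategy per table, for example `merge_fields` for notes-style records and `lww` for settings. Timestamp-based strategies depend on reasonably synchronized device clocks, which is an assumption an SDK would need to document.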
@@ -1,39 +0,0 @@
1 - # SyncKit Client SDK — Todo
2 -
3 - Done: All phases (S1-S5), key rotation, fuzz remediations (Runs 1-2). Active: None. Next: Post-beta items below.
4 -
5 - v0.3.1. Audit grade A. 340 tests (249 unit + integration). Rust 2024 edition (2026-05-06). rand 0.9.
6 -
7 - ---
8 -
9 - ## Business Sustainability — Remaining
10 -
11 - ### Pricing & Revenue
12 - - [ ] Implement developer application flow — short form (app description, expected usage), manual approval, 14-day unbilled trial starts on approval.
13 - - [ ] Cancellation grace period — decide whether `past_due` subscribers can still sync (probably yes, for a few days). After cancellation, sync stops at billing period end, data retained 30 days before cleanup.
14 - - [ ] Subscription management UI on makenot.work dashboard — show active app sync subscriptions with cancel/change-tier options.
15 - - [ ] Test end-to-end against live Stripe — subscribe via each app, verify webhook, verify gate.
16 -
17 - ### Operational Prerequisites (before first external customer)
18 - - [ ] Separate VPS for SyncKit — $5-30/mo depending on tier. Prevents resource contention. Technical spec at `server/docs/internal/strategy/synckit_vps_separation.md`. Trigger: first external developer accepted.
19 -
20 - ### Documentation & Accuracy
21 - - [ ] Monitor Turso and PowerSync — track whether either ships E2E encryption or flat pricing (SyncKit's main differentiators).
22 -
23 - ---
24 -
25 - ## Deferred (Post-Beta)
26 -
27 - - [ ] **OTA publish module** (`src/client/ota.rs`) — typed client for the server's existing OTA endpoints (`/api/sync/ota/apps/{app_id}/releases`, `/artifacts`, presigned upload, updater verify). Backs the `mnw ota publish` subcommand. Replaces `MNW/server/deploy/ota-publish.sh`. Full plan in `MNW/mnw-cli/docs/todo.md` § OTA Publish Subcommand. Estimate: ~4 hours including tests.
28 - - [ ] Conflict resolution helpers — LWW, field-level merge, custom resolver callback in the SDK. Reduces client-side boilerplate. (Gap vs Ditto, Couchbase)
29 - - [ ] WASM web client — compile SyncKit to WASM for browser use. Only if a consumer app ships a web companion.
30 - - [ ] C FFI layer — enables Swift/Kotlin/Python bindings. Only if non-Tauri consumers appear.
31 -
32 - ---
33 -
34 - ## Key Paths
35 - - Client: `MNW/shared/synckit-client/src/client/` (mod, auth, encryption, sync, rotation, subscribe, blob, helpers)
36 - - Crypto: `MNW/shared/synckit-client/src/crypto.rs`
37 - - Types: `MNW/shared/synckit-client/src/types.rs` (includes PullFilter, FilteredPullRequest)
38 - - Keystore: `MNW/shared/synckit-client/src/keystore.rs`
39 - - Tests: `MNW/shared/synckit-client/tests/integration.rs`