# Scan Pipeline Audit and Redesign

Status: Draft, 2026-05-24. Author: Max + Claude. Not yet implemented.

This document is the design predicate for a full redesign of MNW's file-scanning pipeline. It is the result of discovering that every upload on the platform since 2026-05-10 has been silently held at `held_for_review` because the MalwareBazaar layer fails closed on a missing API key, and there was no admin-visible signal that anything was wrong.

The shipped pipeline is structurally fragile in ways that are not unique to that one bug. This document re-derives what the pipeline should look like, picks the layer set, fixes the fail-closed-by-default policy, moves scanning off the upload critical path, and specifies an admin surface that would have caught the MalwareBazaar regression on day one.

No code changes accompany this document. Implementation is sequenced separately once the design is reviewed.

---

## 1. Goals and non-goals

**Goals**

- A scan pipeline a one-person platform can defend in public and point to as a trust differentiator.
- Multi-signal detection that survives the loss of any single layer.
- No silent platform-wide outage when a third-party scan endpoint changes behavior.
- Honest, transparent communication to creators and downloaders about what is scanned and what is found.
- Zero per-call vendor lock-in to a hyperscaler-owned threat-intel platform.

**Non-goals**

- Best-in-class detection of nation-state malware. That bar belongs to enterprise SOCs with seven-figure budgets. We aim for "covers ~90% of real-world malware on a creator-uploaded distribution platform", explicitly bounded.
- Real-time dynamic-analysis sandboxing for every upload. Cost-prohibitive at this stage; reserved for manual review of flagged samples.
- Detection of supply-chain compromise inside dependencies of an uploaded binary. Out of scope for a file scanner; addressed separately by reproducible-build verification and creator-attested provenance, future work.

## 2. Threat model

Adversaries, in priority order of damage to the platform:

1. **Compromised creator account pushing a poisoned update.** A long-standing creator's account is taken over. Attacker uploads a new version of an existing app with malware bundled in. Existing customers auto-update or download a "trusted" creator's release. Highest reputational damage because creator trust is the platform's load-bearing asset.
2. **Malicious-actor signup uploading malware as a free or cheap "tool".** Attacker creates a fresh account, lists a fake app, hopes a few downloads happen before takedown. Lower individual blast radius but easier to attempt at volume.
3. **Legitimate creator uploads benign software that triggers a false positive.** Crypto wallets, system utilities, low-level audio tools, and AppImages all routinely false-positive on signature AVs. Creator-facing UX must handle this gracefully: visible status, clear appeal path, no silent quarantine.
4. **Account-takeover update path** (subset of (1)): even if the new binary is signed with the creator's existing Apple Dev ID or Authenticode cert, an attacker with cert access can re-sign. Defense relies on out-of-band signals: new-device login, IP reputation, version-velocity anomalies, creator email confirmation on new releases.
5. **Novel zero-day** that no engine in our stack recognizes. Unavoidable in the general case. Mitigation: hash-reputation comparison over time (re-scan), public scan-result transparency so other downloaders can flag, and an admin review queue that surfaces anything the auto-stack can't classify.
6. **Embedded malicious URLs in otherwise benign documentation, license files, or app text.** Lower priority but cheap to defend via URL reputation lookups on extracted strings.

The dominant cases are (1) and (2). (3) is the dominant *user-experience* problem we'll inflict on ourselves if we get fail-closed policy wrong.

## 3. Current pipeline

### 3.1 Layers as implemented

`MNW/server/src/scanning/` runs six layers per upload (`scanning/mod.rs`):

| # | Layer | Implementation | Verdict on absent config |
|---|-------|----------------|--------------------------|
| 1 | `content_type` | Magic-byte sniffing of file header | Pass / Fail |
| 2 | `structural` | Format-specific parser (PE, ELF, Mach-O, etc.) | Pass / Skip |
| 3 | `archive` | ZIP / tar walk for nested malware | Pass / Skip |
| 4 | `yara` | `yara-x` rule engine | Skip if no rules loaded |
| 5 | `clamav` | `clamd` socket over `INSTREAM` | Skip if no socket configured |
| 6 | `malwarebazaar` | abuse.ch hash lookup HTTP API | **Error if API shape unexpected** |

Final disposition (`scanning/mod.rs::ScanPipeline::scan`):

- Any layer `Fail` → `Quarantined`
- Any layer `Error` → `HeldForReview` (fail closed)
- Otherwise → `Clean`

### 3.2 Gaps observed

| Gap | Impact |
|-----|--------|
| ClamAV daemon not installed on prod | Layer always `Skip`; baseline AV signal absent |
| YARA rules directory empty on prod | Layer always `Skip`; no custom signatures |
| MalwareBazaar response shape changed (now requires `Auth-Key` header) | Layer returns `Error` on every call; pipeline returns `HeldForReview` on every upload |
| No code-signing verification (Apple notarization, Authenticode, AppImage GPG) | Largest available *positive* trust signal entirely unused |
| Synchronous scan on upload request handler | Slow third-party API stalls the upload thread; one stuck third party blocks every concurrent upload |
| Fail-closed-by-default for `Error` verdicts on optional layers | Optional best-effort layers can take down the whole pipeline (this is exactly what happened) |
| No admin surface for layer-health monitoring | Two-week silent regression with no alert; only noticed when downloads broke |
| No admin queue UI beyond a single "held items" list | No bulk re-scan, no per-layer detail, no history, no audit log |
| No rescan capability | Held files can't be re-evaluated after a layer is fixed; only path is `UPDATE versions SET scan_status='clean'` |
| Scan results stored per-`s3_key` not per-`version_id` | Detail joins go through `s3_key`, breaks if a file is referenced by multiple versions or moved |

### 3.3 Gap that triggered this audit

Every upload since 2026-05-10 sits at `held_for_review`. Each `file_scan_results.scan_layers` row shows the same final entry:

```
{"layer":"malwarebazaar","verdict":"error","detail":"Unexpected query_status: unknown"}
```

The MalwareBazaar `get_info` endpoint changed: unauthenticated requests no longer return a `query_status` field. Our parser defaults the missing field to the literal string `"unknown"`, which falls through the match arm and returns `Error`. The fail-closed policy then converts that into `HeldForReview` for every upload.

No alert fired. The only signal was a user trying to download a GO build and getting a generic "Failed to get download URL" toast.

## 4. Target architecture

Three structural shifts:

1. **Async scan, sync upload.** Upload returns immediately with `Pending` status. Scan runs in a background worker. Status flips when scan completes. Downloads gate on terminal status (`Clean`, `Quarantined`).
2. **Explicit per-layer fail policy.** Each layer declares at registration whether `Error` is fail-open (Skip-equivalent) or fail-closed (`HeldForReview`). No global has-error switch.
3. **Multi-signal detection with positive trust signals.** Code-signing and notarization checks contribute *evidence of trust*, not just absence of threats. A properly notarized macOS binary from a verified Dev ID team is strong positive evidence; an unsigned `.exe` is weaker baseline. Today neither signal exists.
### 4.1 Layer set (post-audit)

| # | Layer | Source | Fail-open or fail-closed on layer error |
|---|-------|--------|------------------------------------------|
| 1 | `content_type` | Magic-byte sniffing, in-process | fail-closed (cheap, deterministic) |
| 2 | `structural` | Format parsers, in-process | fail-closed (cheap, deterministic) |
| 3 | `archive` | Nested-archive walk, in-process | fail-closed (cheap, deterministic) |
| 4 | `yara` | yara-x + Florian Roth `signature-base` ruleset | fail-closed (in-process, deterministic) |
| 5 | `clamav` | `clamd` daemon + `freshclam` cron | fail-open (network-dependent local service) |
| 6 | `signing_macos` | `codesign --verify --deep --strict` + `spctl --assess --type exec` + notarization staple check | fail-open on macOS-only files; positive evidence if pass |
| 7 | `signing_windows` | `signtool verify /pa` + cert chain inspection | fail-open on Windows-only files; positive evidence if pass |
| 8 | `signing_linux` | AppImage GPG signature + zsync URL presence; deb/rpm signatures | fail-open; positive evidence if pass |
| 9 | `abuse_malwarebazaar` | abuse.ch hash lookup with `Auth-Key` header | fail-open (third-party network) |
| 10 | `abuse_urlhaus` | URL reputation on strings extracted from binary | fail-open (third-party network) |
| 11 | `metadefender_cloud` | OPSWAT free tier (40/day), **second-opinion only** on YARA/ClamAV flags | fail-open (rate-limited, optional) |
| 12 | `hybrid_analysis` | CrowdStrike Falcon Sandbox free key (30/month), **admin-triggered only** | manual, not on the auto-path |

**Policy:** layers 1-4 are deterministic in-process work and fail closed because an internal bug producing `Error` is a code defect, not an outage. Layers 5+ are network or local-service dependencies and fail open because external regressions must never take down the platform; they degrade signal quality, not availability.
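
The policy above can be sketched as a pure function over per-layer results. This is a minimal illustration, not the shipped code: the names `Verdict`, `FailPolicy`, `LayerResult`, `Disposition`, and `disposition` are hypothetical, and the admin-policy triggers for `HeldForReview` (size cap, file type, new account) are deliberately left out.

```rust
/// Verdict a single layer reports for one file.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Pass,
    Skip,
    Fail,
    Error,
}

/// How a layer's `Error` is treated; declared once, at layer registration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FailPolicy {
    /// `Error` is treated like `Skip`: signal is lost, availability is not.
    FailOpen,
    /// `Error` holds the file for review.
    FailClosed,
}

/// Terminal outcome of one automatic scan (hypothetical type name).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Disposition {
    Clean,
    HeldForReview,
    Quarantined,
}

/// One layer's outcome together with the policy it registered with.
pub struct LayerResult {
    pub layer: &'static str,
    pub policy: FailPolicy,
    pub verdict: Verdict,
}

/// Combine per-layer results into a disposition.
pub fn disposition(results: &[LayerResult]) -> Disposition {
    // A `Fail` from any layer quarantines, regardless of that layer's policy.
    if results.iter().any(|r| r.verdict == Verdict::Fail) {
        return Disposition::Quarantined;
    }
    // An `Error` only holds the file when the erroring layer is fail-closed.
    let blocking_error = results
        .iter()
        .any(|r| r.verdict == Verdict::Error && r.policy == FailPolicy::FailClosed);
    if blocking_error {
        return Disposition::HeldForReview;
    }
    Disposition::Clean
}
```

Under this sketch, an `Error` from `abuse_malwarebazaar` (fail-open) leaves an otherwise passing file `Clean`, while the same `Error` from `yara` (fail-closed) holds it for review.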
### 4.2 Status machine

```
                     ┌──────────────────┐
upload accepted ───▶ │     Pending      │
                     └────────┬─────────┘
                              │ scan worker picks up
                     ┌────────▼─────────┐
                     │     Scanning     │
                     └────────┬─────────┘
               ┌──────────────┼──────────────┐
               ▼              ▼              ▼
       ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
       │    Clean    │ │HeldForReview│ │ Quarantined │
       └─────────────┘ └──────┬──────┘ └─────────────┘
                              │ admin Promote
                              ▼
                       ┌─────────────┐
                       │    Clean    │
                       └─────────────┘
                              │ admin Quarantine
                              ▼
                       ┌─────────────┐
                       │ Quarantined │
                       └─────────────┘
```

Transitions:

| From | To | Trigger |
|------|----|---------|
| (no row) | Pending | Upload confirmed (S3 object created, row inserted) |
| Pending | Scanning | Worker dequeues |
| Scanning | Clean | All deterministic layers pass, no `Fail` in any layer |
| Scanning | Quarantined | Any layer returns `Fail` |
| Scanning | HeldForReview | Any fail-closed layer returns `Error`, or admin policy triggers (size cap, file type, creator new-account, etc.) |
| HeldForReview | Clean | Admin promotes; audit-logged |
| HeldForReview | Quarantined | Admin quarantines; audit-logged with note |
| HeldForReview | Scanning | Admin-triggered re-scan (the Re-scan action on held rows, section 5.2) |
| Clean | Scanning | Admin-triggered re-scan |
| Quarantined | (no transition without DB-level intervention) | Quarantine is sticky by design |

### 4.3 Re-scan cadence

- **Trigger re-scan automatically** when:
  - YARA ruleset is updated (operator-controlled).
  - ClamAV `freshclam` rolls a new sig DB version.
  - An admin explicitly clicks Re-scan.
- **Background sweep**: every 30 days, re-run hash-lookup layers (abuse.ch, optionally MetaDefender) across the full `Clean` corpus. Detects sigs that have *become* known-bad over time. Quarantine on `Fail`, log to audit.
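
The transition table in section 4.2 can be read as a single predicate that every status write goes through. The sketch below is one hypothetical way to express it; `ScanStatus` and `can_transition` are illustrative names rather than the existing `FileScanStatus` API, and the initial "(no row) → Pending" insert is not modeled.

```rust
/// Scan status of one uploaded file (hypothetical mirror of `FileScanStatus`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ScanStatus {
    Pending,
    Scanning,
    Clean,
    HeldForReview,
    Quarantined,
}

/// Returns true when `from -> to` is allowed by the status machine.
pub fn can_transition(from: ScanStatus, to: ScanStatus) -> bool {
    use ScanStatus::*;
    matches!(
        (from, to),
        // Worker dequeues the job.
        (Pending, Scanning)
            // Automatic outcomes of a completed scan.
            | (Scanning, Clean)
            | (Scanning, HeldForReview)
            | (Scanning, Quarantined)
            // Admin decisions on a held file (both audit-logged).
            | (HeldForReview, Clean)
            | (HeldForReview, Quarantined)
            // Re-scan: admin-triggered or sweep-triggered for Clean files;
            // for held files this follows the Re-scan action in section 5.2.
            | (Clean, Scanning)
            | (HeldForReview, Scanning)
        // Nothing leaves Quarantined: quarantine is sticky by design.
    )
}
```

Keeping the rule in one function means the worker, the admin routes, and any re-scan sweep reject the same illegal writes, for example an attempt to move a `Quarantined` file back to `Clean`.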
### 4.4 Async architecture

Move scan off the upload critical path:

```
POST /api/versions/{id}/upload/confirm
  → create version row (scan_status=Pending)
  → enqueue scan_job
  → return 200 with status="pending"
  → client polls or subscribes via SSE

scan_worker (tokio task in same process)
  → SELECT FOR UPDATE SKIP LOCKED jobs WHERE status='queued'
  → run pipeline
  → INSERT INTO file_scan_results (per-layer JSON)
  → UPDATE versions.scan_status
  → publish status-changed event (SSE channel for admin + creator)
```

Job table:

```sql
CREATE TABLE scan_jobs (
    id           UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    target_kind  TEXT NOT NULL CHECK (target_kind IN ('version', 'item_cover', 'item_attachment')),
    target_id    UUID NOT NULL,
    s3_key       TEXT NOT NULL,
    status       TEXT NOT NULL CHECK (status IN ('queued', 'running', 'done', 'failed')),
    attempts     INT NOT NULL DEFAULT 0,
    enqueued_at  TIMESTAMPTZ NOT NULL DEFAULT now(),
    started_at   TIMESTAMPTZ,
    completed_at TIMESTAMPTZ,
    last_error   TEXT
);

CREATE INDEX scan_jobs_status_enqueued ON scan_jobs (status, enqueued_at);
```

Worker pool: N workers (configurable, default 2), each pulling with `SELECT ... FOR UPDATE SKIP LOCKED LIMIT 1`. Job retry policy: 3 attempts on transient failure, then `status='failed'`, surface in admin dashboard.

### 4.5 Creator-visible UX

In the creator dashboard, each version row gains a scan-status badge:

- **Pending**: neutral, "Scanning…"
- **Clean**: positive, "Cleared"
- **HeldForReview**: warning, "Awaiting review (typically under 24h)"
- **Quarantined**: negative, "Quarantined: [contact support] to appeal"

Held / quarantined rows expand to show per-layer detail: which layer flagged, what it said, what the creator can do. Honest, transparent, brand-aligned.

Public-facing download buttons:

- Hidden entirely while `Pending` or `Scanning`.
- Visible on `Clean`.
- Hidden on `HeldForReview` / `Quarantined` (creator sees a note in their dashboard; public sees nothing; graceful degradation).

## 5. /admin/uploads dashboard

The audit surface. Existing route at `routes/admin/uploads.rs` is the seed; we extend it.

### 5.1 Page layout

```
┌─────────────────────────────────────────────────────────────────────┐
│ Pipeline Health (last 24h / 7d toggle)                              │
│ ┌─────────────────┬──────────────┬──────────────┬─────────────────┐ │
│ │ content_type    │ 100%  1ms    │ 100%  1ms    │ ✓ last: now     │ │
│ │ structural      │ 100%  2ms    │ 100%  2ms    │ ✓ last: now     │ │
│ │ yara            │  99%  45ms   │  99%  43ms   │ ✓ last: 12s ago │ │
│ │ clamav          │   0%  n/a    │   0%  n/a    │ ✗ not running   │ │
│ │ malwarebazaar   │   0%  n/a    │   0%  n/a    │ ✗ 14 days ago   │ │
│ │ signing_macos   │  98%  320ms  │  97%  350ms  │ ✓ last: 1m ago  │ │
│ └─────────────────┴──────────────┴──────────────┴─────────────────┘ │
└─────────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────────┐
│ Active Queue (Pending + Scanning)                 [auto-refresh on] │
│                                                                     │
│ 3 files scanning, 0 stuck                                           │
│   • GoingsOn_0.4.0_aarch64.dmg: scanning (4s)                       │
│   • SamplePack.zip: pending (queued 1s ago)                         │
└─────────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────────┐
│ Held for Review                                             9 items │
│                                                                     │
│ [+] GoingsOn_0.3.1_x64-setup.exe                                    │
│     creator: max • item: GoingsOn Desktop • 14 days held            │
│     layers: ✓ct ✓struct -arch -yara -clam ⚠mb                       │
│     [Promote] [Quarantine] [Re-scan]                                │
│                                                                     │
│ [+] GoingsOn_0.3.1_amd64.AppImage ...                               │
└─────────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────────┐
│ Recent History (last 30d)                     [▶ expand] [filter ▾] │
└─────────────────────────────────────────────────────────────────────┘
```

### 5.2 Section spec

**Pipeline Health (top panel)**

- Per-layer rolling stats over the last 24h and 7d.
- Columns: layer name, success rate, error rate, p50 latency, p95 latency, health badge, last successful response timestamp.
- Health badge logic: `✓` if last successful response < 1h ago AND error rate < 10%; `⚠` if either degraded; `✗` if no successful response in 24h or error rate > 50%.
- Click any row → drill-down to last 100 layer invocations with full per-call detail. Useful for diagnosing intermittent regressions.

**Active Queue (Pending + Scanning)**

- Auto-refreshes via HTMX SSE.
- Shows count of files Pending vs Scanning.
- "Stuck" detection: anything in `Scanning` for > 5 minutes is flagged red.
- One-line entry per file: filename, current state, elapsed time. No per-layer detail at this stage (scan not done yet).

**Held for Review**

- Default expanded; these need decisions.
- One row per held version. Per row:
  - Creator handle + item title + version filename + size + age-of-hold.
  - Layer chips: small colored squares, one per layer, showing verdict (`pass`, `skip`, `fail`, `error`, `pending`).
  - Click any chip → in-place expand to the layer's `detail` JSON.
  - Three actions: **Promote** (with optional note), **Quarantine** (note required), **Re-scan** (re-runs pipeline now).
- Bulk operations:
  - Select N rows → **Bulk Re-scan** (no note required; common after a layer fix lands).
  - Select N rows → **Bulk Promote** (single shared note required, hard audit-logged).
  - No bulk Quarantine; every quarantine is an individual decision.
- Sort: default by held-at ascending (oldest first); switchable.
- Filter: by app, by creator, by which layer flagged, by file type.

**Recent History (collapsible)**

- Default collapsed. When expanded: dense grid of last 30d, all statuses.
- Columns: creator, item, version, status, scanned-at, layer-summary chip strip, action.
- Filter: by status, app, creator, date range.
- Pagination at 100 rows; further history at `/admin/uploads/archive`.
- Each row click → same expandable per-layer detail panel.
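
The health-badge rule stated under Pipeline Health can be written as a small pure function, which also makes the boundary cases explicit. This is a sketch under assumptions: `HealthBadge` and `health_badge` are hypothetical names, the error rate is a fraction in `0.0..=1.0` over whichever window the caller selects, and a layer with no recorded success is treated the same as "no successful response in 24h".

```rust
use std::time::Duration;

/// Badge shown per layer in the Pipeline Health panel.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HealthBadge {
    /// Rendered as a check mark.
    Ok,
    /// Rendered as a warning sign.
    Degraded,
    /// Rendered as a cross.
    Down,
}

/// Classify a layer from the age of its last successful response and its
/// error rate. `since_last_success` is `None` when no success was ever recorded.
pub fn health_badge(since_last_success: Option<Duration>, error_rate: f64) -> HealthBadge {
    const ONE_HOUR: Duration = Duration::from_secs(60 * 60);
    const ONE_DAY: Duration = Duration::from_secs(24 * 60 * 60);

    // Never succeeded: equivalent to no successful response in 24h.
    let age = match since_last_success {
        Some(age) => age,
        None => return HealthBadge::Down,
    };

    // Down takes precedence over every other state.
    if age >= ONE_DAY || error_rate > 0.50 {
        return HealthBadge::Down;
    }
    // Healthy requires both conditions at once.
    if age < ONE_HOUR && error_rate < 0.10 {
        return HealthBadge::Ok;
    }
    // Anything in between is degraded.
    HealthBadge::Degraded
}
```

With these thresholds, the `malwarebazaar` row from the mock-up in 5.1 (last success 14 days ago) classifies as `Down`, which is the state the dashboard is meant to make impossible to miss.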
**Audit Trail**

- New table:

  ```sql
  CREATE TABLE scan_admin_actions (
      id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
      version_id  UUID,
      item_id     UUID,
      admin_id    UUID NOT NULL,
      action      TEXT NOT NULL CHECK (action IN ('promote', 'quarantine', 'rescan', 'bulk_promote', 'bulk_rescan')),
      prev_status TEXT,
      new_status  TEXT,
      note        TEXT,
      created_at  TIMESTAMPTZ NOT NULL DEFAULT now()
  );
  ```

- Inline tooltip on each row: "Last action: promoted by max, 2 days ago".
- Full log at `/admin/uploads/audit` with filters.

### 5.3 Access control

- Gated on existing `AdminUser` extractor (PLATFORM_ADMIN_ID single-user model). No per-forum-style role splitting.
- All POST routes CSRF-protected (existing middleware).
- All admin actions audit-logged (table above).

### 5.4 Live updates

- HTMX SSE channel `/admin/uploads/events` pushing:
  - `scan-started`, `scan-completed`, `scan-stuck` events.
  - Active Queue + Pipeline Health update without page reload.
- History grid stays static; reloads only on filter change.

## 6. Monitoring and alerting

Add PoM checks for the scan pipeline:

| Check | Threshold | Action on fire |
|-------|-----------|----------------|
| Per-layer error rate (1h window) | > 10% | Notify admin (email + dashboard banner) |
| Per-layer success count (24h) | == 0 | Page admin: layer fully down |
| Queue depth | > 50 pending | Notify: workers falling behind |
| Stuck-scan count (Scanning > 5min) | > 5 | Notify: stuck workers |
| Held-for-review count | > 100 | Notify: review backlog growing |

PoM module: `pom/src/checks/scan_pipeline.rs`, queries the same `/admin/uploads` data endpoints (or hits the DB directly via the existing PoM SSH pattern).

## 7. Public transparency

Public-facing page at `/about/scanning` (DocEngine markdown). Contents:

- What we scan with: each layer named, linked to its docs.
- What status each can produce.
- What "Clean" actually means and what it doesn't.
- Aggregate stats (auto-substituted via assumptions/derived values):
  - "X% of uploads cleared automatically in <2min, last 30 days."
  - "Y files manually reviewed, last 30 days."
  - "Z files quarantined, last 30 days."
- Creator appeals process for false positives.

Per-version public scan-result panel on the public item page:

- "This version was scanned on [date]. Cleared by N layers."
- No per-layer detail surfaced publicly (avoids handing attackers a layer-by-layer evasion roadmap), but the existence of multi-layer scanning is visible.

Brand alignment: this is the kind of unforced transparency that almost no distribution platform does. Differentiation surface, not just an implementation detail.

## 8. Sequencing

Implementation order, sized for sequential landing:

### Phase 1: Architectural floor

1. `scan_jobs` table + worker pool + status machine.
2. Move scan off the upload request handler.
3. Add `Pending`, `Scanning` to `FileScanStatus` enum.
4. Per-layer fail policy declared at registration; remove global has-error switch.
5. Tests: pipeline runs async, status flips correctly, fail-open vs fail-closed honored per layer.

### Phase 2: Admin surface

6. Extend `/admin/uploads` to the three-section layout + health panel.
7. `scan_admin_actions` table + audit logging on every admin action.
8. HTMX SSE for live queue + health updates.
9. Per-layer detail expansion, bulk re-scan, bulk promote.

### Phase 3: Layer set

10. Fix MalwareBazaar: register `Auth-Key`, add header, parse new response shape, tests against captured fixtures.
11. Install ClamAV daemon on prod + `freshclam` cron in deploy script. Wire `CLAMAV_SOCKET` env.
12. Pull Florian Roth `signature-base` YARA rules into prod; wire `YARA_RULES_DIR`. Add ruleset-version field to scan results so we can correlate.
13. URLhaus layer (string extraction + URL reputation).
14. Signing-trust layers (macOS, Windows, AppImage).
    These need helper binaries on prod (`codesign`, `spctl`, `signtool`, `gpg`); vendor or install. Plan vendoring carefully: `codesign` is macOS-only, so the Hetzner-Linux server can't run it. **Open question**: cross-platform Mach-O signature verification. Candidates: `apple-codesign` Rust crate (Gregory Szorc's `rcodesign`), which can verify Apple signatures on Linux. Verify before committing.
15. MetaDefender Cloud free tier: second-opinion layer, triggered only when YARA or ClamAV flag a suspicion.

### Phase 4: Operations

16. PoM scan-pipeline checks + alerting.
17. Re-scan sweeps (admin-triggered + monthly background).
18. Public `/about/scanning` page + per-version public scan panel.
19. Held-file backlog: re-scan the existing 9 held versions under the new pipeline. If they come out Clean (expected; they're our own builds), they flip automatically. If anything flags, we investigate manually.

### Phase 5: Reserved for future

- Hybrid Analysis sandbox detonation on admin-flagged samples.
- Hash-reputation pass: same SHA shipped Clean by trusted creator = fast-pass on a re-upload by the same creator. Cross-creator fast-pass is not safe (a malicious creator could pre-clear a hash) and is out of scope.
- Creator-attested provenance: SLSA-style supply-chain attestations, reproducible-build verification. Long-term, separate document.

## 9. What we explicitly chose not to do

- **VirusTotal / Google Threat Intelligence**. Free tier ToS forbids commercial workflow use. Paid tier ($20–50K/yr floor) is GCP-locked vendor with hostile migration behavior (prepaid credits voided in GTI migration). Misaligned with platform brand. Future revisit only if upload volume passes ~1K/day **and** the current stack misses a real incident.
- **Synchronous-scan retention**. Even with all layers fixed, holding the upload thread on third-party calls is structurally fragile.
- **Per-layer-error fail-closed by default**. The single decision that caused this audit. Reversed.
- **Global YARA-ruleset auto-update from upstream**. Roth's `signature-base` is curated but YARA rules can have false positives; we pin a ruleset version and bump deliberately, with re-scan of recent uploads on bump.

## 10. Open questions

- **macOS signature verification from Linux**. `rcodesign` looks viable but unverified. If not, a separate scan worker on a macOS host (or accepting signing-status only when the creator's upload tool self-reports it, cross-checked against the embedded signature blob) is the fallback.
- **Where does `scan_jobs` live?** Same Postgres or a dedicated queue (Redis, RabbitMQ)? Default: Postgres + `SKIP LOCKED`, no new infra. Revisit if queue depth + worker latency demand it. Probably never.
- **Bulk-promote audit threshold.** Should bulk-promote require dual-control (a second admin's approval) above N rows? Today single-operator, so the question is partly theoretical, but it shapes the schema.
- **Public scan-result panel detail level.** Per-layer breakdown helps honest creators see what we evaluated; helps attackers fingerprint our pipeline. Default: aggregate verdict only, with the layer list named but not per-file verdicts. Decide on first iteration.

## 11. Cost summary

| Item | Cost |
|------|------|
| abuse.ch (MalwareBazaar / URLhaus / ThreatFox) | $0 with free Auth-Key |
| ClamAV + `freshclam` | $0 |
| YARA + Roth `signature-base` ruleset | $0 |
| Apple notarization staple verify (`rcodesign`) | $0 |
| Authenticode signature verify | $0 |
| AppImage GPG signature verify | $0 |
| MetaDefender Cloud free tier (40/day) | $0 |
| Hybrid Analysis free key (30/month) | $0 |
| PoM monitoring | $0 (existing infra) |
| **Total recurring cost** | **$0** |

If upload volume + incident pressure justify it later: MetaDefender paid (~$5–15K/yr estimated, commercial-use-licensed, no GCP lock-in) before any consideration of VT/GTI.
---

## Appendix A: File and module touchpoints

| Concern | File(s) |
|---------|---------|
| Pipeline orchestration | `MNW/server/src/scanning/mod.rs` |
| Per-layer impls | `MNW/server/src/scanning/{yara,clamav,hash_lookup,...}.rs` |
| New: signing layers | `MNW/server/src/scanning/signing/{macos,windows,linux}.rs` |
| New: URLhaus layer | `MNW/server/src/scanning/urlhaus.rs` |
| Status enum | `MNW/server/src/db/enums.rs` (`FileScanStatus`) |
| Versions / scan-status columns | `MNW/server/src/db/scanning.rs`, `migrations/004_file_scan_status.sql` |
| New: `scan_jobs` worker | `MNW/server/src/scanning/worker.rs` |
| New: `scan_admin_actions` audit log | `MNW/server/src/db/scan_admin_actions.rs` |
| Admin dashboard route | `MNW/server/src/routes/admin/uploads.rs` |
| Admin dashboard templates | `MNW/server/templates/pages/admin/uploads*.html` |
| Download gate | `MNW/server/src/routes/storage/downloads.rs` |
| PoM checks | `MNW/pom/src/checks/scan_pipeline.rs` |
| Public transparency page | `MNW/server/site-docs/public/about/scanning.md` |

## Appendix B: Operator rollout procedure

One-time prod setup for Phases 3a / 3d. Each step is independent; do them in any order. After each, the corresponding Pipeline Health card on the admin dashboard flips from down to ok within one upload cycle.

### abuse.ch Auth-Key (Phase 3a)

1. Register at . Free, single email confirmation.
2. Add to `/opt/makenotwork/.env`:

   ```
   ABUSE_CH_AUTH_KEY=
   ```

3. `systemctl restart makenotwork`.
4. Watch the `malwarebazaar` and `urlhaus` cards flip after the next upload.

### ClamAV daemon (Phase 3d)

Run as root on the Hetzner prod host (one-time):

```
/opt/makenotwork/deploy/setup-clamav.sh
echo 'CLAMAV_SOCKET=/var/run/clamav/clamd.ctl' >> /opt/makenotwork/.env
systemctl restart makenotwork
```

The script installs `clamav-daemon` + `clamav-freshclam`, waits for the initial signature DB pull (up to 5 minutes), and verifies clamd is reachable over its Unix socket.
Signatures auto-update via `freshclam.service`.

### YARA rules (Phase 3d)

```
/opt/makenotwork/deploy/setup-yara-rules.sh
echo 'YARA_RULES_DIR=/opt/makenotwork/yara-rules' >> /opt/makenotwork/.env
systemctl restart makenotwork
```

Pulls Florian Roth's `signature-base` (CC-BY-NC 4.0) shallow clone, flattens the `.yar` files into `/opt/makenotwork/yara-rules`, installs a weekly cron to refresh upstream. Stamps the active commit SHA at `/opt/makenotwork/yara-rules/RULESET_VERSION` for audit-trail correlation.

A `systemctl restart makenotwork` is required to recompile rules after every ruleset bump: the cron only refreshes the files on disk, not the in-memory compiled rules.

---

## Appendix C: Migration plan for current held backlog

9 versions in `held_for_review` since 2026-05-10. Plan:

1. Implement Phase 1 + Phase 3.10 (MalwareBazaar fix with Auth-Key).
2. Trigger `Bulk Re-scan` on all 9 from the admin dashboard once the new pipeline is running.
3. Expected outcome: 9 → Clean. They're our own builds and would have passed the original pipeline if MB hadn't errored.
4. If any flag under the new (richer) pipeline, investigate as a normal held-review case.

No `UPDATE versions SET scan_status='clean'` shortcut. The whole point of the redesign is that the system promotes a file when it has reason to, not because an admin reached around it.