# Troubleshooting: MNW Server

## Service Won't Start

Check logs:

```bash
journalctl -u makenotwork -n 50 --no-pager
```

| Symptom | Cause | Fix |
|---------|-------|-----|
| "DATABASE_URL environment variable is required" | Missing env var | Check `/opt/makenotwork/.env` has `DATABASE_URL` |
| "SIGNING_SECRET is required in production" | HOST=0.0.0.0 or HTTPS HOST_URL without secret | Set `SIGNING_SECRET` to a random string in `.env` |
| "Invalid HOST address" / "Invalid PORT number" | Malformed HOST or PORT | HOST must be a valid IP (default 127.0.0.1), PORT must be an integer (default 3000) |
| Startup hangs then fails | PostgreSQL unreachable | `systemctl status postgresql`, verify `DATABASE_URL` connection string |
| "Failed to run migrations" | Migration error or DB permissions | Connect manually: `psql "$DATABASE_URL" -c "SELECT 1"`, check migration SQL |
| "Failed to migrate session store" | tower_sessions table issue | Usually resolves on retry. If persistent, check DB user has CREATE TABLE permission |
| Startup succeeds but features missing | Optional services not configured | Stripe, Postmark, S3, Git browser, SyncKit all degrade gracefully if env vars missing |

## 502 Errors

Caddy serves `/opt/makenotwork/error-pages/502.html` when the app is unreachable.

1. **Is the process running?**
   ```bash
   systemctl status makenotwork --no-pager
   ```
   - Not running → `systemctl restart makenotwork`, check logs for crash cause
   - Running but not responding → check port: `curl -s http://127.0.0.1:3000`

2. **Is Caddy running?**
   ```bash
   systemctl status caddy --no-pager
   ```
   - Not running → `systemctl restart caddy`

3. **Is PostgreSQL running?**
   ```bash
   systemctl status postgresql --no-pager
   sudo -u makenotwork psql makenotwork -c "SELECT 1"
   ```
   - Not running → `systemctl restart postgresql`, then `systemctl restart makenotwork`

4. **Port conflict?**
   ```bash
   lsof -i :3000
   ```
   - Another process → kill it or change `PORT` in `.env`

## Slow Queries

**Symptoms:** Pages load slowly, "timeout acquiring connection" in logs, high PostgreSQL CPU.

**Diagnostics:**

```bash
# Enable slow query logging (statements taking longer than 1000 ms)
sudo -u postgres psql -c "ALTER SYSTEM SET log_min_duration_statement = 1000;"
sudo -u postgres psql -c "SELECT pg_reload_conf();"

# Check active queries
sudo -u postgres psql -c "SELECT pid, now() - pg_stat_activity.query_start AS duration, query FROM pg_stat_activity WHERE state = 'active' ORDER BY duration DESC LIMIT 5;"
```

**Known patterns:**

- Discover search with very short terms → triggers trigram scan. The `pg_trgm` extension + GIN index mitigate this.
- Tag hierarchy queries → EXISTS subqueries on items with many tags.
- Connection pool exhaustion → default is 25 connections, 3s acquire timeout. If all busy, new requests fail after 3s.

## Stripe Webhook Failures

**Symptoms:** Purchases not completing, subscriptions not updating.

1. **Check Stripe Dashboard → Webhooks** for failed deliveries

2. **Check server logs:**
   ```bash
   journalctl -u makenotwork --since "1 hour ago" | grep -i stripe
   ```

   | Log Message | Cause | Fix |
   |-------------|-------|-----|
   | "Missing Stripe signature" | Request missing `Stripe-Signature` header | Webhook URL misconfigured in Stripe Dashboard |
   | "Invalid payload encoding" | Non-UTF8 body | Stripe endpoint URL wrong (hitting wrong service) |
   | Signature verification error | `STRIPE_WEBHOOK_SECRET` mismatch | Copy exact secret from Stripe Dashboard → Webhooks, update `.env`, restart |
   | Event type not handled (debug log) | Unhandled event type | Expected: only specific events are processed |
   | "Stripe not configured" | Missing `STRIPE_SECRET_KEY` | Set env var in `.env`, restart |

**Test locally:**

```bash
stripe listen --forward-to localhost:3000/stripe/webhook
stripe trigger checkout.session.completed
```

## Email Not Sending

**Symptoms:** Password resets, purchase receipts, or verification emails not arriving.

1. **Is Postmark configured?**
   - Check `.env` for `POSTMARK_TOKEN`. If missing, emails log to stdout (dev mode).

2. **Is the recipient suppressed?**
   ```sql
   SELECT * FROM email_suppressions WHERE email = 'user@example.com';
   ```
   - If found, remove: `DELETE FROM email_suppressions WHERE email = 'user@example.com';`

3. **Check Postmark Dashboard → Activity** for delivery status

4. **Check server logs:**
   ```bash
   journalctl -u makenotwork --since "1 hour ago" | grep -i email
   ```

   | Log Message | Cause | Fix |
   |-------------|-------|-----|
   | "email skipped (suppressed)" | Recipient on suppression list | Remove from `email_suppressions` table |
   | "Failed to send email" | Postmark API error (timeout, auth, invalid address) | Check Postmark Dashboard for details, verify token |
   | Emails logged to console | `POSTMARK_TOKEN` not set | Set env var, restart |

## Sync Failures (SyncKit)

**Symptoms:** Desktop apps can't push/pull data.

1. **Is SyncKit configured?**
   - Check `.env` for `SYNCKIT_JWT_SECRET`. If missing, endpoints return 503.

2. **JWT issues:**

   | Error | Cause | Fix |
   |-------|-------|-----|
   | 401 Unauthorized | Token expired (7-day max) or bad signature | Client should re-authenticate via `/api/synckit/auth` |
   | "Unknown app" | API key invalid or app inactive | Check `sync_apps` table: `SELECT * FROM sync_apps WHERE api_key = '...'` |
   | "Unknown device" | Device not registered | Client should call `POST /api/sync/devices` first |

3. **Push failures:**

   | Error | Cause | Fix |
   |-------|-------|-----|
   | "Maximum 500 changes per push" | Batch too large | Client should split into ≤500-change batches |
   | "Table name validation failed" | Invalid chars in table name | Use alphanumeric + underscores only, max 100 chars |
   | "DELETE operations should not include data" | Data payload on DELETE op | Client bug: set `data: null` for DELETEs |

4. **Blob storage:**
   - Check `SYNCKIT_S3_*` env vars for the separate SyncKit bucket
   - If S3 is unreachable, blob upload/download fails but changelog sync still works

## Git Browser Errors

**Symptoms:** Source browser pages return 404 or 500.

1. **Is the git browser configured?**
   - Check `.env` for `GIT_REPOS_PATH`. If missing, all git routes return 404.

2. **Repository not found:**
   ```bash
   ls /opt/git/  # Check bare repos exist
   ```
   - Repos must be bare (`git init --bare`)
   - Path structure: `$GIT_REPOS_PATH/{owner}/{repo}.git/`

3. **File too large (>1 MB):** Intentional limit. Large files show a truncation message.

4. **Repo corruption:**
   ```bash
   cd /opt/git/owner/repo.git && git fsck --full
   ```

## Resource Limits

| Resource | Limit | What Happens |
|----------|-------|-------------|
| DB connections | 25 max | "timeout acquiring connection" after 3s wait |
| Memory | 512M (systemd MemoryMax) | Process killed by OOM, auto-restarts |
| File descriptors | 65535 (LimitNOFILE) | "too many open files" |
| File upload: audio | 500 MB | 413 Payload Too Large |
| File upload: image | 10 MB | 413 Payload Too Large |
| File upload: video | 20 GB | 413 Payload Too Large |
| Login rate limit | 2/sec, burst 5 | 429 Too Many Requests |
| API rate limit | 2/sec, burst 10 | 429 Too Many Requests |
| SyncKit rate limit | 10/sec, burst 30 | 429 Too Many Requests |
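
To spot connection pool pressure before requests start timing out, a small helper can compare the current connection count against the 25-connection limit listed above. This is a minimal sketch, not part of MNW: the function name `pool_status` and the 80% warning threshold are assumptions chosen for illustration.

```bash
#!/bin/sh
# Hypothetical helper (not part of MNW): classify connection usage against
# the pool limit. The default of 25 comes from the Resource Limits table;
# the 80% warning threshold is an assumption for illustration.
pool_status() {
    used=$1
    max=${2:-25}
    if [ "$used" -ge "$max" ]; then
        echo "exhausted"
    elif [ $((used * 100 / max)) -ge 80 ]; then
        echo "warning"
    else
        echo "ok"
    fi
}

# Example usage (needs a reachable PostgreSQL; assumes the database is named
# "makenotwork", as in the 502 checks):
#   count=$(sudo -u postgres psql -Atc \
#     "SELECT count(*) FROM pg_stat_activity WHERE datname = 'makenotwork';")
#   pool_status "$count"
```

The count from `pg_stat_activity` includes every session connected to the database (for example, an open `psql` shell), so treat the result as an approximation of pool usage rather than an exact figure.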