# Troubleshooting — MNW Server

## Service Won't Start

```bash
# Check the most recent service logs
journalctl -u makenotwork -n 50 --no-pager
```

| Error / Symptom | Cause | Fix |
|---|---|---|
| "DATABASE_URL environment variable is required" | Missing env var | Check that `/opt/makenotwork/.env` sets `DATABASE_URL` |
| "SIGNING_SECRET is required in production" | `HOST=0.0.0.0` or an HTTPS `HOST_URL` without a secret | Set `SIGNING_SECRET` to a random string in `.env` |
| "Invalid HOST address" / "Invalid PORT number" | Malformed `HOST` or `PORT` | `HOST` must be a valid IP (default `127.0.0.1`); `PORT` must be an integer (default `3000`) |
| Startup hangs, then fails | PostgreSQL unreachable | Run `systemctl status postgresql` and verify the `DATABASE_URL` connection string |
| "Failed to run migrations" | Migration error or DB permissions | Connect manually with `psql "$DATABASE_URL" -c "SELECT 1"`, then check the migration SQL |
| "Failed to migrate session store" | `tower_sessions` table issue | Usually resolves on retry. If it persists, check that the DB user has CREATE TABLE permission |
| Startup succeeds but features are missing | Optional services not configured | Expected: Stripe, Postmark, S3, Git browser, and SyncKit all degrade gracefully when their env vars are missing |

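The configuration errors in the startup table can be caught before a restart. The following is a minimal sketch, not part of MNW Server: `env_get` and `check_env` are hypothetical helpers, and the sketch assumes `.env` holds plain `KEY=value` lines.

```bash
# Hypothetical pre-flight check for the documented startup errors.
# ASSUMPTION: .env contains plain KEY=value lines (no "export", no quoting).

# Print the value of KEY from an env file (last assignment wins).
env_get() {
  local key="$1" file="$2"
  sed -n "s/^${key}=//p" "$file" | tail -n 1
}

check_env() {
  local file="$1" status=0 port host host_url needs_secret=no

  if [ -z "$(env_get DATABASE_URL "$file")" ]; then
    echo "DATABASE_URL is missing"
    status=1
  fi

  # PORT is optional (default 3000) but must be an integer when set.
  port=$(env_get PORT "$file")
  case "$port" in
    '') ;;
    *[!0-9]*) echo "PORT is not an integer: $port"; status=1 ;;
  esac

  # Per the table: a secret is required when HOST=0.0.0.0
  # or when HOST_URL is an HTTPS URL.
  host=$(env_get HOST "$file")
  host_url=$(env_get HOST_URL "$file")
  if [ "$host" = "0.0.0.0" ]; then needs_secret=yes; fi
  case "$host_url" in https://*) needs_secret=yes ;; esac
  if [ "$needs_secret" = yes ] && [ -z "$(env_get SIGNING_SECRET "$file")" ]; then
    echo "SIGNING_SECRET is missing"
    status=1
  fi

  return "$status"
}
```

Usage on the server would be `check_env /opt/makenotwork/.env`; a non-zero exit status means at least one of the three problems was found.
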
## 502 Errors

Caddy serves `/opt/makenotwork/error-pages/502.html` when the app is unreachable.

1. **Is the process running?**
   ```bash
   systemctl status makenotwork --no-pager
   ```
   - Not running → `systemctl restart makenotwork`, check logs for crash cause
   - Running but not responding → check port: `curl -s http://127.0.0.1:3000`

2. **Is Caddy running?**
   ```bash
   systemctl status caddy --no-pager
   ```
   - Not running → `systemctl restart caddy`

3. **Is PostgreSQL running?**
   ```bash
   systemctl status postgresql --no-pager
   sudo -u makenotwork psql makenotwork -c "SELECT 1"
   ```
   - Not running → `systemctl restart postgresql`, then `systemctl restart makenotwork`

4. **Port conflict?**
   ```bash
   lsof -i :3000
   ```
   - Another process → kill it or change `PORT` in `.env`

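The 502 checklist can be condensed into one decision function. This is a sketch only: `next_step` is a hypothetical helper whose arguments are the strings printed by `systemctl is-active <unit>`. It checks PostgreSQL first, which is a design choice of the sketch (the app cannot start without its database), not the order given in the checklist.

```bash
# Hypothetical helper: map unit states to the next action from the checklist.
# Arguments: states of makenotwork, caddy, postgresql as printed by
# `systemctl is-active` ("active", "inactive", "failed", ...).
next_step() {
  local app="$1" caddy="$2" postgres="$3"
  if [ "$postgres" != active ]; then
    echo "systemctl restart postgresql && systemctl restart makenotwork"
  elif [ "$caddy" != active ]; then
    echo "systemctl restart caddy"
  elif [ "$app" != active ]; then
    echo "systemctl restart makenotwork"
  else
    echo "all units active: curl -s http://127.0.0.1:3000 and lsof -i :3000"
  fi
}

# On the server (not run here):
#   next_step "$(systemctl is-active makenotwork)" \
#             "$(systemctl is-active caddy)" \
#             "$(systemctl is-active postgresql)"
```

The function only prints the suggested command; it deliberately restarts nothing itself.
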
## Slow Queries

**Symptoms:** Pages load slowly, "timeout acquiring connection" appears in the logs, PostgreSQL CPU is high.

**Diagnostics:**
```bash
# Enable slow query logging (statements slower than 1000 ms)
sudo -u postgres psql -c "ALTER SYSTEM SET log_min_duration_statement = 1000;"
sudo -u postgres psql -c "SELECT pg_reload_conf();"

# Show the five longest-running active queries
sudo -u postgres psql -c "SELECT pid, now() - query_start AS duration, query FROM pg_stat_activity WHERE state = 'active' ORDER BY duration DESC LIMIT 5;"
```

**Known patterns:**
- Discover search with very short terms → triggers a trigram scan. The `pg_trgm` extension and a GIN index mitigate this.
- Tag hierarchy queries → EXISTS subqueries on items with many tags.
- Connection pool exhaustion → the default is 25 connections with a 3s acquire timeout. If all connections are busy, new requests fail after 3s.

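To judge how close the pool is to exhaustion, the live connection count can be compared with the 25-connection default. A minimal sketch: `pool_status` is a hypothetical helper, the 80% warning threshold is an arbitrary assumption, and the count covers every client of the `makenotwork` database, not only the app's pool.

```bash
# Hypothetical helper: classify pool pressure from a connection count.
# Second argument defaults to the documented pool size of 25.
pool_status() {
  local in_use="$1" max="${2:-25}"
  local pct=$(( in_use * 100 / max ))
  if [ "$in_use" -ge "$max" ]; then
    echo "exhausted: $in_use/$max, new requests will time out after 3s"
  elif [ "$pct" -ge 80 ]; then   # ASSUMPTION: 80% chosen for illustration
    echo "warning: $in_use/$max (${pct}%)"
  else
    echo "ok: $in_use/$max (${pct}%)"
  fi
}

# On the server (not run here), feed it the live count:
#   pool_status "$(sudo -u postgres psql -At -c \
#     "SELECT count(*) FROM pg_stat_activity WHERE datname = 'makenotwork';")"
```
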
## Stripe Webhook Failures

**Symptoms:** Purchases not completing, subscriptions not updating.

1. **Check Stripe Dashboard → Webhooks** for failed deliveries
2. **Check server logs:**
   ```bash
   journalctl -u makenotwork --since "1 hour ago" | grep -i stripe
   ```

| Log message | Cause | Fix |
|---|---|---|
| "Missing Stripe signature" | Request has no `Stripe-Signature` header | Webhook URL is misconfigured; correct it in Stripe Dashboard |
| "Invalid payload encoding" | Non-UTF-8 body | Check the endpoint URL in Stripe Dashboard (requests may be hitting the wrong service) |
| Signature verification error | `STRIPE_WEBHOOK_SECRET` mismatch | Copy the exact secret from Stripe Dashboard → Webhooks, update `.env`, restart |
| Event type not handled (debug log) | Unhandled event type | Expected: only specific events are processed |
| "Stripe not configured" | Missing `STRIPE_SECRET_KEY` | Set the env var in `.env`, restart |

**Test locally:**
```bash
stripe listen --forward-to localhost:3000/stripe/webhook
stripe trigger checkout.session.completed
```

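When chasing a signature verification error, it can help to recompute the signature by hand. Stripe signs the string `<timestamp>.<raw body>` with HMAC-SHA256 keyed by the endpoint secret and sends the result as `Stripe-Signature: t=<timestamp>,v1=<hex>`. The sketch below assumes `openssl` is installed; `stripe_sig` and `stripe_verify` are hypothetical helper names, and how MNW Server performs the check internally is not shown here.

```bash
# Hypothetical debugging helpers for Stripe webhook signatures.

# Compute the v1 signature for a timestamp and raw request body.
stripe_sig() {
  local secret="$1" ts="$2" body="$3"
  printf '%s.%s' "$ts" "$body" \
    | openssl dgst -sha256 -hmac "$secret" \
    | awk '{print $NF}'   # keep only the hex digest
}

# Succeed if the header's v1 value matches the recomputed signature.
stripe_verify() {
  local secret="$1" header="$2" body="$3" ts sig
  ts=$(printf '%s' "$header" | tr ',' '\n' | sed -n 's/^t=//p' | head -n 1)
  sig=$(printf '%s' "$header" | tr ',' '\n' | sed -n 's/^v1=//p' | head -n 1)
  [ -n "$ts" ] && [ "$(stripe_sig "$secret" "$ts" "$body")" = "$sig" ]
}
```

This is for debugging only: the comparison is not constant-time and the timestamp is not checked for age, both of which a production verifier should handle. A mismatch with the correct body almost always means the secret in `.env` belongs to a different endpoint.
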
## Email Not Sending

**Symptoms:** Password resets, purchase receipts, or verification emails not arriving.

1. **Is Postmark configured?**
   - Check `.env` for `POSTMARK_TOKEN`. If missing, emails log to stdout (dev mode).

2. **Is the recipient suppressed?**
   ```sql
   SELECT * FROM email_suppressions WHERE email = 'user@example.com';
   ```
   - If found, remove: `DELETE FROM email_suppressions WHERE email = 'user@example.com';`

3. **Check Postmark Dashboard → Activity** for delivery status

4. **Check server logs:**
   ```bash
   journalctl -u makenotwork --since "1 hour ago" | grep -i email
   ```

| Log message | Cause | Fix |
|---|---|---|
| "email skipped (suppressed)" | Recipient on suppression list | Remove from `email_suppressions` table |
| "Failed to send email" | Postmark API error (timeout, auth, invalid address) | Check Postmark Dashboard for details, verify token |
| Emails logged to console | `POSTMARK_TOKEN` not set | Set env var, restart |

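To see at a glance which of the documented email messages dominates the logs, the stream can be counted rather than read. A rough sketch: `email_triage` is a hypothetical helper, and it matches the quoted messages case-sensitively, so it assumes the server logs them exactly as written in the table.

```bash
# Hypothetical helper: count the documented email log messages on stdin.
email_triage() {
  awk '
    /email skipped \(suppressed\)/ { suppressed++ }
    /Failed to send email/         { failed++ }
    END { printf "suppressed=%d failed=%d\n", suppressed, failed }
  '
}

# On the server (not run here):
#   journalctl -u makenotwork --since "1 hour ago" | email_triage
```

A high `suppressed` count points at the `email_suppressions` table; a high `failed` count points at Postmark itself.
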
## Sync Failures (SyncKit)

**Symptoms:** Desktop apps can't push/pull data.

1. **Is SyncKit configured?**
   - Check `.env` for `SYNCKIT_JWT_SECRET`. If it is missing, endpoints return 503.

2. **JWT issues:**

   | Error | Cause | Fix |
   |---|---|---|
   | 401 Unauthorized | Token expired (7-day max) or bad signature | Client should re-authenticate via `/api/synckit/auth` |
   | "Unknown app" | API key invalid or app inactive | Check `sync_apps` table: `SELECT * FROM sync_apps WHERE api_key = '...'` |
   | "Unknown device" | Device not registered | Client should call `POST /api/sync/devices` first |

3. **Push failures:**

   | Error | Cause | Fix |
   |---|---|---|
   | "Maximum 500 changes per push" | Batch too large | Client should split into ≤500-change batches |
   | "Table name validation failed" | Invalid chars in table name | Use alphanumeric characters and underscores only, max 100 chars |
   | "DELETE operations should not include data" | Data payload on DELETE op | Client bug: set `data: null` for DELETEs |

4. **Blob storage:**
   - Check the `SYNCKIT_S3_*` env vars for the separate SyncKit bucket
   - If S3 is unreachable, blob upload/download fails but changelog sync still works

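The push limits listed under "Push failures" are easy to check on the client before sending. A minimal sketch with hypothetical helper names (`valid_table_name`, `batch_count`); it mirrors the documented rules and is not the server's own validation code.

```bash
# Hypothetical client-side checks mirroring the documented push limits.

# Alphanumeric characters and underscores only, 1 to 100 characters.
valid_table_name() {
  case "$1" in
    ''|*[!A-Za-z0-9_]*) return 1 ;;   # empty or contains a disallowed char
  esac
  [ "${#1}" -le 100 ]
}

# Number of pushes needed for N changes at 500 changes per push.
batch_count() {
  local n="$1" max=500
  echo $(( (n + max - 1) / max ))     # ceiling division
}
```

For example, `batch_count 1200` prints `3`, so a client holding 1200 pending changes would need three pushes.
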
## Git Browser Errors

**Symptoms:** Source browser pages return 404 or 500.

1. **Is the git browser configured?**
   - Check `.env` for `GIT_REPOS_PATH`. If it is missing, all git routes return 404.

2. **Repository not found:**
   ```bash
   ls /opt/git/  # Check that the bare repos exist
   ```
   - Repos must be bare (`git init --bare`)
   - Path structure: `$GIT_REPOS_PATH/{owner}/{repo}.git/`

3. **File too large (>1 MB):** Intentional limit. Large files show a truncation message.

4. **Repo corruption:**
   ```bash
   cd /opt/git/owner/repo.git && git fsck --full
   ```

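The path layout and the bare-repo requirement can be checked together for a single repository. A sketch under stated assumptions: `check_repo` is a hypothetical helper, and its first argument stands in for the value of `GIT_REPOS_PATH`.

```bash
# Hypothetical helper: verify that {root}/{owner}/{repo}.git exists and is bare.
check_repo() {
  local root="$1" owner="$2" repo="$3"
  local path="$root/$owner/$repo.git"
  if [ ! -d "$path" ]; then
    echo "missing: $path"
    return 1
  fi
  # rev-parse prints "true" only for a bare repository.
  if [ "$(git -C "$path" rev-parse --is-bare-repository 2>/dev/null)" != true ]; then
    echo "not a bare repo: $path"
    return 1
  fi
  echo "ok: $path"
}
```

Usage would look like `check_repo /opt/git owner repo`. Run it as the same user the service runs as; otherwise git may refuse to read a repo owned by someone else, and the helper will report it as not bare.
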
## Resource Limits

| Resource | Limit | Symptom when exceeded |
|---|---|---|
| DB connections | 25 max | "timeout acquiring connection" after 3s wait |
| Memory | 512M (systemd MemoryMax) | Process killed by OOM, auto-restarts |
| File descriptors | 65535 (LimitNOFILE) | "too many open files" |
| File upload: audio | 500 MB | 413 Payload Too Large |
| File upload: image | 10 MB | 413 Payload Too Large |
| File upload: video | 20 GB | 413 Payload Too Large |
| Login rate limit | 2/sec, burst 5 | 429 Too Many Requests |
| API rate limit | 2/sec, burst 10 | 429 Too Many Requests |
| SyncKit rate limit | 10/sec, burst 30 | 429 Too Many Requests |
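
A limit such as "2/sec, burst 5" is easiest to read through a token-bucket model: the bucket holds up to 5 tokens, refills at 2 tokens per second, and each request spends one. The simulation below is a sketch under that assumption; which algorithm the server actually uses is not stated here, and `simulate` is a hypothetical name.

```bash
# Token-bucket simulation. ASSUMPTION: the server's limiter behaves like
# a token bucket; this is an illustration, not the server's code.
# Arguments: rate per second, burst size, then request arrival times
# in milliseconds (ascending).
simulate() {
  local rate="$1" burst="$2"
  shift 2
  local cap=$(( burst * 1000 ))   # bucket size in milli-tokens
  local tokens=$cap last=0 out="" t
  for t in "$@"; do
    # Refill: rate tokens/sec equals rate milli-tokens per millisecond.
    tokens=$(( tokens + (t - last) * rate ))
    if [ "$tokens" -gt "$cap" ]; then tokens=$cap; fi
    last=$t
    if [ "$tokens" -ge 1000 ]; then
      tokens=$(( tokens - 1000 ))
      out="$out 200"
    else
      out="$out 429"
    fi
  done
  echo "${out# }"
}

# Login limit: seven requests at once, then one more 500 ms later.
simulate 2 5 0 0 0 0 0 0 0 500
```

This prints `200 200 200 200 200 429 429 200`: the burst of five is served, the next two are rejected, and half a second later one token has refilled.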