# Monitoring

## Status Page

[makenot.work/health](https://makenot.work/health) is a live status dashboard showing:

- Overall status (Operational / Degraded / Issues Detected)
- 24-hour and 7-day uptime percentages
- Per-service status: database, sessions, S3 storage, Stripe payments, email, SyncKit
- External monitoring data from PoM (response times, route availability, incidents)
- Recent check history and incident log
- Live endpoint tests (public URLs and database queries)

The page is public. A JSON API is also available at `/api/health` for programmatic monitoring.

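A minimal sketch of how a script might consume the JSON API. The response schema is not documented here, so the `services` field and the `"operational"` value are assumptions for illustration, as is the full URL (the `/api/health` path joined to the status page host). Adjust both to match the real payload.

```python
import json
from urllib.request import urlopen

# ASSUMPTION: the API lives on the same host as the status page.
HEALTH_URL = "https://makenot.work/api/health"


def fetch_health(url: str = HEALTH_URL, timeout: float = 10.0) -> dict:
    """Fetch and decode the health JSON (performs a real network request)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def unhealthy_services(payload: dict) -> list[str]:
    """Return the sorted names of services that are not "operational".

    ASSUMPTION: the payload looks like
    {"status": "...", "services": {"database": "operational", ...}}.
    A payload without a "services" key yields an empty list.
    """
    services = payload.get("services", {})
    return sorted(
        name
        for name, state in services.items()
        if str(state).lower() != "operational"
    )
```

A cron job could call `unhealthy_services(fetch_health())` and exit non-zero when the list is non-empty.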
## PoM (Production Operations Monitor)

PoM is a self-hosted monitoring tool we built. It checks two targets continuously:

- **makenot.work**: the main platform
- **forums.makenot.work**: community forums (Multithreaded)

For each target, PoM monitors:

- **Health checks**: HTTP response codes and response times
- **TLS certificates**: expiry tracking and chain validation
- **Route availability**: key pages return expected status codes
- **DNS records**: A, AAAA, MX, TXT, CNAME correctness
- **WHOIS**: domain registration and expiry

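To make the first two check types concrete, here is a rough sketch of how a single probe result could be evaluated. It is not PoM's actual code. The function names, the 1000 ms slowness threshold, and the three-way ok/degraded/failed outcome are assumptions. The `not_after` string is in the format that Python's `ssl.SSLSocket.getpeercert()` returns under the `"notAfter"` key.

```python
import ssl
from datetime import datetime, timezone


def classify_http(status: int, elapsed_ms: float,
                  expected_status: int = 200, slow_ms: float = 1000.0) -> str:
    """Map one HTTP probe to "ok", "degraded", or "failed".

    ASSUMPTION: an unexpected status code is a failure, and a correct
    but slow response (over `slow_ms`) counts as degraded.
    """
    if status != expected_status:
        return "failed"
    return "degraded" if elapsed_ms > slow_ms else "ok"


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Days between `now` and a certificate's notAfter timestamp.

    `not_after` looks like "Jan  5 09:34:43 2018 GMT"; `now` must be
    timezone-aware so that its timestamp is unambiguous.
    """
    expires = ssl.cert_time_to_seconds(not_after)
    return (expires - now.timestamp()) / 86400


# Example: a certificate that expires exactly one day after `now`.
now = datetime(2018, 1, 4, 9, 34, 43, tzinfo=timezone.utc)
remaining = days_until_expiry("Jan  5 09:34:43 2018 GMT", now)
```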
PoM sends email alerts when any check fails or degrades, and it tracks per-test history to detect regressions.

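The text does not say how regressions are detected. One simple heuristic, shown purely as an illustration, compares the latest response time against the median of recent history. The function name, the 2x factor, and the minimum sample count are all assumptions.

```python
from statistics import median


def is_regression(history_ms: list[float], latest_ms: float,
                  factor: float = 2.0, min_samples: int = 5) -> bool:
    """Flag `latest_ms` if it exceeds `factor` times the median of history.

    Returns False when there are fewer than `min_samples` past
    measurements, since a baseline from so little data is unreliable.
    """
    if len(history_ms) < min_samples:
        return False
    return latest_ms > factor * median(history_ms)
```

The median is used rather than the mean so that a single past outlier does not shift the baseline.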
## No Third-Party Monitoring

PoM runs on our own servers, keeps its own database, and sends alerts through our own email infrastructure. No monitoring data leaves our systems.

## See Also

- [Infrastructure](./infrastructure.md): production stack and vendor choices
- [Security](./security.md): how we protect data