Servers & Networking
Server Fleet
Three Hetzner Cloud VPS instances, all connected via WireGuard VPN.
| Server | Location | Role | Spec | Internal IP |
|---|---|---|---|---|
| Hermes | hel1 (Helsinki) | Production + CI runner | 4 vCPU, 8GB RAM, 80GB NVMe | 10.1.0.1 |
| Atlas | nbg1 (Nuremberg) | Staging + CI runner | 2 vCPU, 4GB RAM, 40GB SSD | 10.1.0.2 |
| Iris | (observability) | Observability + Docs | 2 vCPU, 4GB RAM, 40GB SSD | 10.1.0.4 |
Hermes runs production workloads. The GitHub Actions self-hosted runner also runs on Hermes — a documented risk accepted at current scale. Post-launch, the runner should move to a dedicated instance.
Atlas's disk fills quickly as Docker images accumulate. A daily cron job at 00:01 Athens time runs docker system prune to reclaim space. See the Disk Management runbook.
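The daily cleanup could be wrapped in a small script that logs disk usage before and after the prune. This is a minimal sketch, not the script actually deployed: the script path, log file, and function names are hypothetical, and the crontab line assumes the host timezone is set to Europe/Athens.

```shell
#!/bin/sh
# Sketch only: script path and log location below are hypothetical.

# Print the used-space percentage of a mount point as a bare integer.
disk_used_pct() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Log disk usage, prune unused Docker data, then log usage again.
prune_docker() {
  mount_point="${1:-/}"
  echo "before: $(disk_used_pct "$mount_point")% used on $mount_point"
  docker system prune --force || return 1
  echo "after: $(disk_used_pct "$mount_point")% used on $mount_point"
}

# Only prune when explicitly asked, so sourcing the file has no side effects.
if [ "${1:-}" = "--run" ]; then
  prune_docker /
fi

# Example crontab entry (assumes the host clock is set to Europe/Athens):
# 1 0 * * * /usr/local/bin/docker-prune.sh --run >> /var/log/docker-prune.log 2>&1
```

Note that `docker system prune` without `-a` keeps images that are still tagged or referenced; whether that is aggressive enough for Atlas is a decision for the runbook.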
WireGuard VPN (Olympus Network)
All three servers are on a WireGuard mesh VPN: olympus, 10.1.0.0/24.
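One way to check that the mesh is healthy is to look at each peer's last handshake time. The sketch below assumes the WireGuard interface is also named `olympus` (the source only gives that as the network name) and uses a hypothetical helper, `stale_peers`, to parse the output of `wg show <interface> latest-handshakes`.

```shell
# Read `wg show <iface> latest-handshakes` output on stdin
# (public key, tab, epoch seconds) and print the keys whose last
# handshake is older than max_age seconds, or that never completed one.
stale_peers() {
  now="$1"
  max_age="$2"
  awk -v now="$now" -v max="$max_age" '$2 == 0 || now - $2 > max { print $1 }'
}

# Usage on any of the three servers (interface name is an assumption):
#   wg show olympus latest-handshakes | stale_peers "$(date +%s)" 180
```

An empty result means every peer has completed a handshake within the last three minutes.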
All inter-service traffic uses the WireGuard IPs — never the public internet. Examples:
- API → Loki: `http://10.1.0.4:3100`
- Backups: Hermes DB → `pg_dump` → copy to Iris over WireGuard
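The rule that inter-service traffic stays on WireGuard IPs can be enforced with a small guard in a deploy or startup script. This is a sketch under assumptions: the helper names are hypothetical, and it only handles plain IPv4 URLs like the Loki example above.

```shell
# Extract the host from a URL such as http://10.1.0.4:3100/path.
url_host() {
  rest="${1#*://}"             # drop the scheme
  rest="${rest%%/*}"           # drop any path
  printf '%s\n' "${rest%%:*}"  # drop the port
}

# Succeed only for IPv4 addresses inside 10.1.0.0/24.
in_olympus_subnet() {
  case "$1" in
    10.1.0.*) last="${1#10.1.0.}" ;;
    *) return 1 ;;
  esac
  case "$last" in
    ''|*[!0-9]*) return 1 ;;
  esac
  [ "$last" -le 255 ]
}

# Usage: refuse to start if a configured URL leaves the VPN.
#   in_olympus_subnet "$(url_host "$LOKI_URL")" || { echo "LOKI_URL is not a WireGuard IP" >&2; exit 1; }
```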
Domains & Routing
All public traffic: Cloudflare (CDN + TLS) → Hetzner firewall (Cloudflare IPs only on 80/443) → Traefik → Docker containers.
| Domain | Server | Notes |
|---|---|---|
| pcmr.gr | Hermes | Production web |
| api.pcmr.gr | Hermes | Production API (CORS-locked to pcmr.gr) |
| staging.pcmr.gr | Atlas | Staging web |
| api-staging.pcmr.gr | Atlas | Staging API |
| staff.pcmr.gr | Hermes | Staff portal (Cloudflare Access gated) |
| staff-staging.pcmr.gr | Atlas | Staging staff portal (Cloudflare Access gated) |
| docs.pcmr.gr | Iris | This documentation site (Cloudflare Access gated) |
| status.pcmr.gr | Iris | Uptime Kuma (public) |
| coolify.ctsolutions.gr | Hermes | Coolify UI (Cloudflare Access gated) |
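Because the Hetzner firewall only admits Cloudflare IPs on 80/443, its allow-list has to track Cloudflare's published ranges. The sketch below fetches those ranges and joins them into one comma-separated line. The function names are hypothetical, and how the list is applied to the firewall (console, API, or CLI) is deliberately left out.

```shell
# Join newline-separated CIDR ranges on stdin into one comma-separated
# line, skipping blank lines.
ranges_to_csv() {
  awk 'NF { printf "%s%s", sep, $1; sep = "," } END { print "" }'
}

# Fetch Cloudflare's published IPv4 and IPv6 ranges, one per line.
fetch_cloudflare_ranges() {
  curl -fsS https://www.cloudflare.com/ips-v4 || return 1
  echo
  curl -fsS https://www.cloudflare.com/ips-v6 || return 1
  echo
}

# Usage:
#   fetch_cloudflare_ranges | ranges_to_csv
```

Diffing this output against the current firewall rules on a schedule would surface any drift when Cloudflare adds or retires a range.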
Cloudflare Access (Zero Trust)
Cloudflare Access policies gate:
- `coolify.ctsolutions.gr`: email OTP auth
- `staff.pcmr.gr` + `staff-staging.pcmr.gr`: email OTP auth
- `docs.pcmr.gr`: email OTP auth
These are a first layer of defense. The application still enforces its own auth — Cloudflare Access is defense-in-depth, not the sole gate.
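A quick smoke test can confirm that a gated hostname really is behind Access. The sketch below assumes that an unauthenticated request is redirected to a login URL under `cloudflareaccess.com`. That matches Cloudflare's usual behaviour but should be verified against the actual policy setup. Both function names are hypothetical.

```shell
# Succeed if a redirect target points at a Cloudflare Access login host.
is_access_login() {
  case "$1" in
    https://*.cloudflareaccess.com/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the redirect target of an unauthenticated request (empty if none).
redirect_target() {
  curl -s -o /dev/null -w '%{redirect_url}' "https://$1/"
}

# Usage:
#   for host in coolify.ctsolutions.gr staff.pcmr.gr staff-staging.pcmr.gr docs.pcmr.gr; do
#     is_access_login "$(redirect_target "$host")" && echo "gated: $host" || echo "NOT gated: $host"
#   done
```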
Traefik
Traefik runs on each server as the reverse proxy. It handles SSL termination and routing to Docker containers.
Gotchas:
- `acme.json` permissions: Traefik will not start if `acme.json` has permissions other than `600`. Fix with `chmod 600 /path/to/acme.json`.
- Cloudflare Universal SSL: covers the apex domain and first-level subdomains such as `staff.pcmr.gr`, but not deeper names (for example, a hypothetical `api.staging.pcmr.gr`). Use Full (Strict) mode in Cloudflare SSL/TLS settings, which requires a valid origin certificate.
- `tls: {}` behind Cloudflare Access: when Traefik is behind Cloudflare Access, use `tls: {}` (no `certResolver`). Cloudflare handles TLS termination and presents its own cert to clients. A `certResolver` would try to issue Let's Encrypt certs, but the ACME challenges would fail because Cloudflare intercepts the traffic.
- Non-root Coolify user: Coolify runs as a non-root user. For manual Docker operations not managed by Coolify, run `docker compose up -d`. Do not use `sudo` for Coolify-managed containers, because it can cause ownership conflicts.
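The `acme.json` permission gotcha is easy to guard against before starting Traefik. This is a minimal sketch with hypothetical function names. It relies on GNU `stat -c`, so it assumes a Linux host.

```shell
# Succeed only if the file's permission bits are exactly 600.
acme_perms_ok() {
  [ "$(stat -c '%a' "$1")" = "600" ]
}

# Tighten permissions when they are wrong, and report what happened.
fix_acme_perms() {
  if acme_perms_ok "$1"; then
    echo "ok: $1"
  else
    chmod 600 "$1" && echo "fixed: $1"
  fi
}

# Usage (path elided as in the text above):
#   fix_acme_perms /path/to/acme.json
```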
Observability Stack (on Iris)
All services on Iris communicate internally — not exposed publicly (except Uptime Kuma):
| Service | Bind address | Purpose |
|---|---|---|
| Loki | 10.1.0.4:3100 | Log aggregation (receives logs from API) |
| Grafana | 127.0.0.1:3200 | Log dashboards (access via SSH tunnel) |
| GlitchTip | 127.0.0.1:8080 | Error tracking (Sentry-compatible) |
| Umami | 127.0.0.1:3000 | Privacy-friendly web analytics |
| Uptime Kuma | public :3001 | Uptime monitoring (public at status.pcmr.gr) |
| Homarr | 127.0.0.1:7575 | Internal dashboard |
Accessing Grafana requires an SSH tunnel:
ssh -L 3200:127.0.0.1:3200 iris -N
# Then open http://localhost:3200 in browser
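To confirm that Loki is reachable over WireGuard, a test log line can be pushed from Hermes or Atlas and then looked up in Grafana. The sketch below builds a payload for Loki's `/loki/api/v1/push` endpoint. The `job` label value and the helper names are assumptions, and the timestamp uses GNU `date +%s%N`.

```shell
# Build a Loki push payload with a single stream and a single log line.
# Backslashes and double quotes in the line are escaped for JSON.
loki_payload() {
  job="$1"
  line="$2"
  ts_ns="$3"
  escaped=$(printf '%s' "$line" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"streams":[{"stream":{"job":"%s"},"values":[["%s","%s"]]}]}' \
    "$job" "$ts_ns" "$escaped"
}

# Push one line to Loki on Iris over the WireGuard address.
loki_push() {
  curl -fsS -X POST -H 'Content-Type: application/json' \
    --data "$(loki_payload "$1" "$2" "$(date +%s%N)")" \
    http://10.1.0.4:3100/loki/api/v1/push
}

# Usage:
#   loki_push smoke-test "hello from $(hostname)"
```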
Backups (3-2-1 Strategy)
- Live DB on Hermes (PostgreSQL)
- Daily `pg_dump`: cron on Hermes, copies dump to Iris over WireGuard (30-day retention)
- Manual cold backup: hardware-encrypted Samsung T9 SSD via `scripts/backup-local.ps1`
Additionally: Hetzner takes daily VPS snapshots (7-day retention).
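The daily dump and its 30-day retention could look roughly like the sketch below. It is not the actual cron script: the database name, the `/srv/backups` directory on Iris, and all function names are hypothetical, and `find -delete` and `-maxdepth` assume GNU or BSD `find`.

```shell
# Build a dated dump file name, e.g. app-2024-01-31.dump.
dump_name() {
  printf '%s-%s.dump\n' "$1" "$2"
}

# Delete dump files in a directory that are older than the given number of days.
prune_old_dumps() {
  find "$1" -maxdepth 1 -type f -name '*.dump' -mtime +"$2" -delete
}

# On Hermes: dump the database, copy it to Iris over WireGuard,
# then remove the local copy.
run_backup() {
  db="$1"                                   # hypothetical database name
  file="/tmp/$(dump_name "$db" "$(date +%F)")"
  pg_dump --format=custom --file="$file" "$db" || return 1
  scp "$file" 10.1.0.4:/srv/backups/ || return 1   # hypothetical target directory
  rm -f "$file"
}

# On Iris, a separate daily job would enforce retention:
#   prune_old_dumps /srv/backups 30
```

Keeping retention on Iris rather than Hermes means a compromised or misbehaving production host cannot delete older dumps by itself.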
Hetzner Object Storage
S3-compatible object storage for file attachments and generated PDFs.
| Environment | Bucket | Region |
|---|---|---|
| Production | mneme | hel1 |
| Staging | mneme-staging | nbg1 |
Data stays in EU data centers (GDPR compliant). Implementation uses the standard AWS SDK with forcePathStyle: true — zero code changes needed if migrating to another S3-compatible provider.
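To illustrate what `forcePathStyle: true` changes, the sketch below builds the two URL shapes an S3 client can use for the same object. The endpoint `https://s3.example.com` is a placeholder, not the real Hetzner endpoint, and both function names are hypothetical.

```shell
# Path-style: the bucket is the first path segment.
path_style_url() {
  printf '%s/%s/%s\n' "$1" "$2" "$3"
}

# Virtual-hosted style: the bucket is a subdomain of the endpoint host.
virtual_hosted_url() {
  scheme="${1%%://*}"
  host="${1#*://}"
  printf '%s://%s.%s/%s\n' "$scheme" "$2" "$host" "$3"
}

# Usage (placeholder endpoint and object key):
#   path_style_url     https://s3.example.com mneme invoices/a.pdf
#   virtual_hosted_url https://s3.example.com mneme invoices/a.pdf
```

Path-style addressing needs no per-bucket DNS name, which tends to make it the safer default across S3-compatible providers.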