Deployment
Diminuendo is designed for minimal operational overhead. A single binary, a data directory, and a set of environment variables are all that is required. There is no external database to provision, no message broker to configure, and no migration tool to run.

Prerequisites
| Requirement | Details |
|---|---|
| Bun | Version 1.0 or later |
| Podium | A reachable Podium coordinator instance |
| Auth0 | An Auth0 application configured for JWT issuance and verification |
| Data directory | A writable directory for SQLite database files |
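As a sketch, a production environment might look like the following. All values are placeholders; the variable names are the ones this page's checklist refers to, but the example values are assumptions:

```sh
# Placeholder values only -- substitute your own.
NODE_ENV=production
DATA_DIR=/var/lib/diminuendo/data
AUTH_CLIENT_ID=<auth0-client-id>
AUTH_CLIENT_SECRET=<auth0-client-secret>
AUTH_URL=https://<your-tenant>.auth0.com
PODIUM_URL=https://podium.example.internal
PODIUM_API_KEY=<api-key>
ALLOWED_ORIGINS=https://app.example.com
```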
Building
Diminuendo compiles to a single bundled file, dist/main.js, using Bun's built-in bundler. The bundle includes all dependencies and can be run directly:
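A hedged example of the build and run commands; the entrypoint path src/main.ts is an assumption:

```sh
# Bundle the gateway and all dependencies into one file.
bun build src/main.ts --target=bun --outfile dist/main.js

# Run the bundle directly.
bun dist/main.js
```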
Docker
A minimal Dockerfile for production:

Docker Compose
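Sketches of a minimal Dockerfile and a Compose service follow. The entrypoint src/main.ts, port 3000, volume name, and image tags are assumptions; adjust to your build:

```dockerfile
# Build stage: install dependencies and bundle.
FROM oven/bun:1 AS build
WORKDIR /app
COPY . .
RUN bun install --frozen-lockfile \
 && bun build src/main.ts --target=bun --outfile dist/main.js

# Runtime stage: only the bundle and a data volume.
FROM oven/bun:1-slim
WORKDIR /app
COPY --from=build /app/dist/main.js .
ENV NODE_ENV=production DATA_DIR=/data
VOLUME /data
CMD ["bun", "main.js"]
```

```yaml
# Compose sketch; port and volume name are assumptions.
services:
  diminuendo:
    build: .
    ports:
      - "3000:3000"
    environment:
      NODE_ENV: production
      DATA_DIR: /data
    volumes:
      - diminuendo-data:/data
    # Must exceed the 10-second force exit timeout (see Graceful Shutdown).
    stop_grace_period: 15s
volumes:
  diminuendo-data:
```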
Data Persistence
The DATA_DIR directory must be persisted across container restarts and deployments. It contains all tenant and session SQLite databases.
Backup
SQLite databases can be backed up by copying files. For a consistent backup during operation:

- Per-session: Copy the session directory. SQLite WAL mode ensures the copy is consistent even if the writer is active (the WAL file is included automatically).
- Per-tenant: Copy the tenant directory. This captures the registry and all session databases.
- Full backup: Copy the entire DATA_DIR.
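A full-backup sketch; the data path is an assumption. Copying the directory tree captures each database together with its -wal file, which is what keeps the copy consistent under WAL mode:

```sh
# Archive the whole data directory, WAL files included.
tar -czf "diminuendo-backup-$(date +%F).tar.gz" \
    -C /var/lib/diminuendo data
```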
TLS Termination
Diminuendo serves plain HTTP and WebSocket. TLS should be terminated at the load balancer or reverse proxy. The gateway includes HSTS and security headers in all HTTP responses. Configure your load balancer to preserve these headers or add them at the edge.
Reverse Proxy Configuration (nginx)
proxy_read_timeout and proxy_send_timeout must be set high enough to accommodate long-lived WebSocket connections. The default nginx timeout of 60 seconds will cause premature disconnection.
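An illustrative nginx location block; the upstream address and the one-hour timeout values are assumptions, but the Upgrade/Connection headers and the raised timeouts are what WebSocket proxying requires:

```nginx
location / {
    proxy_pass http://127.0.0.1:3000;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    # The 60s defaults would drop long-lived WebSocket connections.
    proxy_read_timeout 3600s;
    proxy_send_timeout 3600s;
}
```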
Graceful Shutdown
When the gateway receives SIGTERM or SIGINT, it executes a graceful shutdown sequence:

1. Signal Received. The process signal handler unblocks the main Effect fiber.
2. Broadcast Shutdown. A server_shutdown event (with reason "deployment") is published to every active tenant and session topic. Connected clients receive notification that the server is going down.
3. Drain WebSocket Buffers. A 500ms delay allows shutdown events to drain through WebSocket send buffers.
4. Flush SQLite Writes. The WorkerManager's shutdown() method flushes all buffered writes in the writer worker and closes all database handles. This ensures no writes are lost.
5. Stop Server. server.stop(true) forcefully closes all remaining WebSocket connections and stops the HTTP server.
6. Force Exit Timeout. A 10-second timeout guards against shutdown hangs. If the graceful sequence does not complete within 10 seconds, process.exit(1) is called.

Configure your container orchestrator's stop grace period to exceed the 10-second force exit timeout. For Kubernetes, set terminationGracePeriodSeconds: 15. For Docker, set stop_grace_period: 15s.

Horizontal Scaling
Diminuendo supports horizontal scaling through tenant-affinity routing. Since there is no shared state between instances, each instance independently manages its own set of tenants.

Load Balancer Configuration
The load balancer must route all requests for a given tenant to the same gateway instance. Two approaches:

- Sticky Sessions (Simple): Use cookie-based or IP-based sticky sessions. The first WebSocket connection from a client is routed to any available instance; subsequent connections from the same client are routed to the same instance. This works for single-tenant deployments or when each client connects to only one tenant.
- Tenant-Affinity (Recommended)
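One possible tenant-affinity setup uses nginx's consistent-hash upstream. This sketch assumes the tenant ID is conveyed as a query parameter and uses hypothetical upstream hostnames; adapt it to however your clients identify the tenant:

```nginx
upstream diminuendo {
    # Hash on the tenant ID so a tenant always lands on the same instance.
    hash $arg_tenant_id consistent;
    server gateway-1:3000;
    server gateway-2:3000;
}
```

Consistent hashing keeps most tenant-to-instance assignments stable when instances are added or removed.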
Stale Session Recovery
When an instance restarts (whether due to deployment, scaling, or crash), it performs stale session recovery:

- Enumerates all known tenant IDs from the data/tenants/ directory
- For each tenant, queries the registry for sessions with non-inactive status
- Resets each stale session to inactive

Clients that reconnect after a restart receive the inactive state snapshot and can re-activate the session.
Recovery runs as a background fiber and does not block incoming connections. Up to 4 tenants are reconciled concurrently.
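The bounded-concurrency reconciliation described above can be sketched with a small worker-pool helper. The function name and shape are illustrative, not the gateway's actual API:

```typescript
// Run fn over every item, with at most `limit` calls in flight at once.
async function mapConcurrent<T>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<void>,
): Promise<void> {
  const queue = [...items];
  // Spawn `limit` workers; each pulls the next item until the queue drains.
  const workers = Array.from(
    { length: Math.min(limit, queue.length) },
    async () => {
      for (let item = queue.shift(); item !== undefined; item = queue.shift()) {
        await fn(item);
      }
    },
  );
  await Promise.all(workers);
}
```

Recovery would then be something like `mapConcurrent(tenantIds, 4, reconcileTenant)`, launched in a background fiber so it never blocks incoming connections.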
Scaling Considerations
| Concern | Current Model | Scaling Limitation |
|---|---|---|
| Session state | In-memory per instance | Session must reconnect to same instance |
| Event fan-out | Bun pub/sub (per-process) | Cross-instance events require Redis/NATS |
| Rate limiting | Per-instance counters | Distributed attackers bypass per-instance limits |
| Billing | Per-instance credit reservation | Shared tenant across instances needs shared ledger |
Environment Checklist
Before deploying to production, verify:

- NODE_ENV=production (or DEV_MODE is unset/false)
- AUTH_CLIENT_ID, AUTH_CLIENT_SECRET, and AUTH_URL are set
- PODIUM_URL and PODIUM_API_KEY are set
- ALLOWED_ORIGINS includes the frontend's origin
- DATA_DIR points to a persistent volume
- TLS is terminated at the load balancer
- Health check configured: GET /health expecting 200
- Container stop grace period exceeds 10 seconds
- Log aggregation configured for JSON log output