Architecture Overview

Diminuendo is a WebSocket gateway built on three foundational technology choices: Bun for the runtime, Effect TS for business logic, and SQLite for persistence. Each choice was made deliberately to minimize operational complexity while maximizing correctness and performance.

Runtime: Bun

Bun provides three capabilities that are essential to Diminuendo’s architecture:
  1. Native WebSocket server with built-in pub/sub: Bun.serve() supports topic-based publish/subscribe directly in the WebSocket handler, eliminating the need for an external message broker (Redis pub/sub, NATS, etc.) for single-instance deployments
  2. Native SQLite via bun:sqlite: synchronous, in-process SQLite with prepared statements, WAL mode, and zero serialization overhead
  3. Fast startup: the gateway starts in under 200 ms, enabling rapid restart cycles during development and minimal downtime during deployment
Bun’s WebSocket pub/sub is per-process. For multi-instance deployments, a broadcast layer (Redis pub/sub or similar) would be needed. The current architecture is designed for vertical scaling — a single instance handling thousands of concurrent sessions.

Business Logic: Effect TS

All business logic is written using Effect, a TypeScript library for building type-safe, composable programs. Effect provides:
  • Typed errors — every function declares its failure modes in the type signature; there are no thrown exceptions
  • Dependency injection — services are composed via Layer, not imported as singletons
  • Structured concurrency — background fibers are tracked and interrupted on shutdown
  • Resource management — database connections, WebSocket handles, and timers are acquired and released within scoped effects
For a detailed treatment of the Effect patterns used in Diminuendo, see Effect TS Patterns.

Storage: SQLite with WAL Mode

Diminuendo uses SQLite exclusively for persistence — no Postgres, no Redis, no external database. The storage architecture uses a hierarchy of databases:
data/
  tenants/
    {tenantId}/
      registry.db          # Session metadata for this tenant
  sessions/
    {sessionId}/
      session.db           # Conversation history, events, and turn usage
Each tenant has a registry database containing session metadata (id, name, status, timestamps). Each session has its own session database containing messages, events, and turn usage records. This isolation provides several benefits:

Per-Tenant Isolation

A query against one tenant’s data cannot accidentally touch another tenant’s rows. There is no WHERE tenant_id = ? clause to forget — the databases are physically separate files.

Zero-Contention Writes

Concurrent writes to different sessions hit different SQLite files. WAL mode allows concurrent readers and a single writer per database, which is sufficient for the gateway’s access pattern (one writer per session).

Trivial Deletion

Deleting a session means deleting a directory. No DELETE FROM across multiple tables, no orphaned rows, no vacuum needed.

Horizontal Scaling

Moving a tenant to a different gateway instance means moving files: the tenant’s registry directory and the directories of its sessions. No data migration, no schema changes, no downtime.

Migrations are applied automatically on first access. The registry database schema includes sessions and tenant members tables; the session database schema includes messages, events, and turn usage tables.
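
The directory layout above can be sketched as a handful of path helpers. This is a minimal illustration, not the gateway’s actual code: the function names, the ID character check, and the use of node:fs are assumptions. Only the data/tenants/{tenantId}/registry.db and data/sessions/{sessionId}/session.db layout comes from this page.

```typescript
import { existsSync, mkdirSync, rmSync } from "node:fs";
import { join } from "node:path";

// Hypothetical helpers; names and signatures are illustrative only.
// Assumed rule: IDs are limited to characters that cannot escape the data directory.
const SAFE_ID = /^[A-Za-z0-9_-]+$/;

function assertSafeId(id: string): string {
  if (!SAFE_ID.test(id)) throw new Error(`unsafe id: ${id}`);
  return id;
}

function registryDbPath(dataDir: string, tenantId: string): string {
  return join(dataDir, "tenants", assertSafeId(tenantId), "registry.db");
}

function sessionDir(dataDir: string, sessionId: string): string {
  return join(dataDir, "sessions", assertSafeId(sessionId));
}

function sessionDbPath(dataDir: string, sessionId: string): string {
  return join(sessionDir(dataDir, sessionId), "session.db");
}

// Creates the session directory if it is missing and returns its path.
function ensureSessionDir(dataDir: string, sessionId: string): string {
  const dir = sessionDir(dataDir, sessionId);
  mkdirSync(dir, { recursive: true });
  return dir;
}

function sessionExists(dataDir: string, sessionId: string): boolean {
  return existsSync(sessionDir(dataDir, sessionId));
}

// Deleting a session is a recursive directory removal; anything SQLite
// placed next to session.db disappears together with it.
function deleteSession(dataDir: string, sessionId: string): void {
  rmSync(sessionDir(dataDir, sessionId), { recursive: true, force: true });
}
```

Because a UUID consists only of hex digits and hyphens, it passes the assumed ID check unchanged.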

Transport Layer

The transport layer consists of two components: the HTTP/WebSocket server and the Broadcaster.

Server (src/transport/server.ts)

Bun.serve() handles both HTTP (health checks) and WebSocket (client connections) on a single port. The WebSocket lifecycle is:
Client connects → fetch() upgrades to WS → open() sends welcome + connected
  → In dev mode: auto-sends authenticated
  → In production: client must send authenticate with JWT
→ Authenticated: client may send any message
→ message() → Schema.decodeUnknownEither → MessageRouter.route()
→ close() → cleanup rate limiters, unsubscribe topics, remove from active sessions
Each connection carries typed per-connection state (WsData): client ID, authentication identity, topic subscriptions, and the last event sequence number.

Broadcaster (src/transport/Broadcaster.ts)

The Broadcaster abstracts Bun’s native pub/sub into an Effect service. It provides two publishing channels:
  • Session events — published to session:{sessionId}, received by all clients that have joined that session
  • Tenant events — published to tenant:{tenantId}:sessions, received by all authenticated clients in the tenant (used for session list updates)
The Broadcaster also tracks all known topics for graceful shutdown — when the gateway receives SIGINT/SIGTERM, it publishes a server_shutdown event to every active topic before closing connections.
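
The two channels and the shutdown fan-out could look roughly like the sketch below. It is written against a small Publisher interface so that it runs without Bun; the server object returned by Bun.serve() exposes a publish(topic, data) method of a compatible shape. The class name, method names, and event payloads are assumptions, and the real Broadcaster is an Effect service rather than a plain class. Tracking topics at publish time is also a simplification.

```typescript
// Minimal structural view of a pub/sub server (assumed for this sketch).
interface Publisher {
  publish(topic: string, data: string): number;
}

// Topic names as described on this page.
const sessionTopic = (sessionId: string): string => `session:${sessionId}`;
const tenantTopic = (tenantId: string): string => `tenant:${tenantId}:sessions`;

class BroadcasterSketch {
  private readonly publisher: Publisher;
  private readonly knownTopics = new Set<string>();

  constructor(publisher: Publisher) {
    this.publisher = publisher;
  }

  // Received by every client that has joined the session.
  sessionEvent(sessionId: string, event: object): void {
    this.publishTo(sessionTopic(sessionId), event);
  }

  // Received by every authenticated client of the tenant.
  tenantEvent(tenantId: string, event: object): void {
    this.publishTo(tenantTopic(tenantId), event);
  }

  // Notify every topic seen so far, then forget them.
  shutdown(): void {
    const notice = JSON.stringify({ type: "server_shutdown" });
    this.knownTopics.forEach((topic) => {
      this.publisher.publish(topic, notice);
    });
    this.knownTopics.clear();
  }

  private publishTo(topic: string, event: object): void {
    this.knownTopics.add(topic);
    this.publisher.publish(topic, JSON.stringify(event));
  }
}
```

In tests, the Publisher can be a recording fake; in a Bun process, the server instance itself would be passed in.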

Module Layout

The gateway source is organized into eleven modules, each with a single responsibility:
| Module | Directory | Responsibility |
|---|---|---|
| Auth | src/auth/ | JWT verification (Auth0), identity extraction, RBAC permission checks, tenant membership management |
| Transport | src/transport/ | WebSocket server, HTTP health endpoint, Broadcaster service |
| Protocol | src/protocol/ | Effect Schema definitions for all 21 client message types; runtime validation and parsing |
| Session | src/session/ | Session lifecycle management, MessageRouter (the central dispatch), ConnectionState, SessionState machine, event handlers |
| Automation | src/automation/ | Automation store, scheduler, run execution, heartbeat configuration, and inbox management |
| Domain | src/domain/ | Business rules: PodiumEventMapper (translates agent events to client events), BillingService (credit reservation and settlement) |
| Upstream | src/upstream/ | External service clients: PodiumClient (agent orchestration), EnsembleClient (LLM inference) |
| Security | src/security/ | CSRF protection, security headers, error message sanitization, auth rate limiting |
| Resilience | src/resilience/ | RetryPolicy (exponential backoff with jitter), CircuitBreaker (failure threshold with cooldown) |
| Observability | src/observability/ | Health endpoint (deep health checks against Podium and Ensemble), OpenTelemetry tracing |
| DB | src/db/ | Schema migrations, WorkerManager (batched async writes to SQLite) |
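
The two Resilience policies named in the table can be illustrated generically. The sketch below assumes the "full jitter" backoff variant and a simple consecutive-failure counter; this page does not say which variants, thresholds, or defaults RetryPolicy and CircuitBreaker actually use, so every name and number here is hypothetical.

```typescript
interface BackoffOptions {
  baseMs: number; // delay ceiling for attempt 0
  maxMs: number;  // upper bound on any delay
}

// Exponential backoff with full jitter: pick a random delay in
// [0, min(maxMs, baseMs * 2^attempt)).
function backoffDelayMs(
  attempt: number,
  opts: BackoffOptions,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(opts.maxMs, opts.baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Opens after `threshold` consecutive failures and allows a new attempt
// once `cooldownMs` has elapsed since it opened.
class CircuitBreakerSketch {
  private failures = 0;
  private openedAt: number | null = null;
  private readonly threshold: number;
  private readonly cooldownMs: number;

  constructor(threshold: number, cooldownMs: number) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
  }

  canAttempt(now: number): boolean {
    if (this.openedAt === null) return true;
    return now - this.openedAt >= this.cooldownMs;
  }

  recordSuccess(): void {
    this.failures = 0;
    this.openedAt = null;
  }

  recordFailure(now: number): void {
    this.failures += 1;
    if (this.failures >= this.threshold) this.openedAt = now;
  }
}
```

The random source and the clock are passed in as parameters so that both policies can be exercised deterministically.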

Layer Composition

Diminuendo uses Effect’s Layer system for dependency injection. The composition is defined in src/main.ts:
AppConfigLive (environment variables)
  |
  +---> AuthLayer          (JWT verification)
  +---> RegistryLayer      (session metadata in SQLite)
  +---> PodiumLayer        (Podium HTTP/WS client)
  +---> BroadcastLayer     (pub/sub abstraction)
  +---> WorkerLayer        (async SQLite writes)
  +---> BillingLayer       (credit management)  <-- depends on WorkerLayer
  +---> EnsembleLayer      (LLM inference client)
  +---> MembershipLayer    (tenant RBAC)
  |
  +---> RouterDeps = merge(Registry, Podium, AppConfig, Broadcast, Billing, Worker, Membership)
          |
          +---> RouterLayer (MessageRouterLive — the central dispatch)
  |
  +---> AppLayer = merge(all of the above)
          |
          +---> program (startServer + shutdown handler)
                  |
                  +---> Effect.provide(AppLayer)
                  +---> Effect.provide(LoggerLive)
Every service is a Context.Tag with a corresponding Live implementation. No service imports another service directly — dependencies flow through the Layer graph, making them explicit, testable, and replaceable.

Data Flow

The complete path of a client message through the gateway:
1. Wire

Client sends JSON over WebSocket. Bun.serve receives raw bytes and invokes the message() handler.
2. Validation

The raw string is parsed as JSON, then validated against ClientMessage using Schema.decodeUnknownEither. Invalid messages are rejected with an error event — they never reach business logic.
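
The parse-then-validate shape of this step is sketched below in dependency-free TypeScript. It is a stand-in, not the gateway’s code: the gateway uses Effect Schema (Schema.decodeUnknownEither), whereas this sketch uses hand-written checks and a plain result union. Only two of the message types are shown, and their fields (token, sessionId, text) are assumptions.

```typescript
// Assumed shapes for two client messages; field names are hypothetical.
type ClientMessage =
  | { type: "authenticate"; token: string }
  | { type: "run_turn"; sessionId: string; text: string };

type Decoded =
  | { ok: true; message: ClientMessage }
  | { ok: false; error: string };

function decodeClientMessage(raw: string): Decoded {
  // Step 1: parse JSON without letting a syntax error escape.
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return { ok: false, error: "invalid JSON" };
  }
  if (typeof parsed !== "object" || parsed === null) {
    return { ok: false, error: "expected a JSON object" };
  }

  // Step 2: validate the shape for the declared message type.
  const fields = parsed as Record<string, unknown>;
  switch (fields["type"]) {
    case "authenticate": {
      const token = fields["token"];
      if (typeof token !== "string") return { ok: false, error: "token must be a string" };
      return { ok: true, message: { type: "authenticate", token } };
    }
    case "run_turn": {
      const sessionId = fields["sessionId"];
      const text = fields["text"];
      if (typeof sessionId !== "string" || typeof text !== "string") {
        return { ok: false, error: "sessionId and text must be strings" };
      }
      return { ok: true, message: { type: "run_turn", sessionId, text } };
    }
    default:
      return { ok: false, error: "unknown message type" };
  }
}
```

Callers branch on ok, so an invalid payload becomes an error event and never reaches a handler.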
3. Authentication Check

The server verifies that the connection has completed authentication (the authenticated flag on WsData). Unauthenticated connections can only send authenticate messages.
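
A minimal sketch of the per-connection state and the gate described in this step follows. The field names and helper functions are assumptions; this page only states that WsData carries a client ID, an authentication identity, topic subscriptions, the last event sequence number, and an authenticated flag.

```typescript
// Hypothetical identity shape; the real one is derived from the JWT.
interface Identity {
  readonly userId: string;
  readonly tenantId: string;
}

// Assumed field names for the per-connection state.
interface WsData {
  clientId: string;
  authenticated: boolean;
  identity: Identity | null; // null until authentication completes
  topics: Set<string>;
  lastSeq: number;
}

function newConnection(clientId: string): WsData {
  return { clientId, authenticated: false, identity: null, topics: new Set(), lastSeq: 0 };
}

function markAuthenticated(data: WsData, identity: Identity): void {
  data.identity = identity;
  data.authenticated = true;
}

// Unauthenticated connections may only send "authenticate".
function mayProcess(data: WsData, messageType: string): boolean {
  return data.authenticated || messageType === "authenticate";
}
```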
4. Rate Limiting

A per-connection sliding window rate limiter checks whether the client has exceeded 60 messages per 10-second window. Rate-limited messages are rejected with RATE_LIMITED.
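
One way to implement a sliding window of this kind is to keep the timestamps of accepted messages and drop those that have aged out. The sketch below assumes that approach; the class name, the method name, and the injectable clock are illustrative, and the gateway’s own limiter may be implemented differently.

```typescript
class SlidingWindowLimiter {
  private readonly timestamps: number[] = [];
  private readonly limit: number;
  private readonly windowMs: number;

  // Defaults mirror the documented limit: 60 messages per 10 seconds.
  constructor(limit: number = 60, windowMs: number = 10_000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the message is accepted, false if it is rate limited.
  tryAcquire(now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Discard accepted messages that have left the window.
    while (this.timestamps.length > 0 && this.timestamps[0]! <= cutoff) {
      this.timestamps.shift();
    }
    if (this.timestamps.length >= this.limit) return false;
    this.timestamps.push(now);
    return true;
  }
}
```

A caller would create one limiter per connection in open() and discard it in close(), replying with RATE_LIMITED whenever tryAcquire() returns false.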
5. Routing

MessageRouter.route(identity, message) dispatches to the appropriate handler based on message.type. The router returns a RouteResult, which is one of respond (send to this client), broadcast (publish to session topic), or none (no response needed).
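
The three RouteResult variants map naturally onto a discriminated union. The sketch below is a simplified, synchronous illustration: the real router is an Effect service, and the field names, the handler table, the heartbeat message type, and the event names used here are assumptions.

```typescript
interface Identity { readonly userId: string; readonly tenantId: string }
interface ClientMessage { readonly type: string; readonly sessionId?: string }
interface ServerEvent { readonly type: string; readonly code?: string }

type RouteResult =
  | { kind: "respond"; event: ServerEvent }
  | { kind: "broadcast"; sessionId: string; event: ServerEvent }
  | { kind: "none" };

type Handler = (identity: Identity, message: ClientMessage) => RouteResult;

// Hypothetical handlers, for illustration only.
function handleRunTurn(_identity: Identity, message: ClientMessage): RouteResult {
  if (message.sessionId === undefined) {
    return { kind: "respond", event: { type: "error", code: "MISSING_SESSION" } };
  }
  return { kind: "broadcast", sessionId: message.sessionId, event: { type: "turn_started" } };
}

function handleHeartbeat(): RouteResult {
  return { kind: "none" };
}

const handlers = new Map<string, Handler>();
handlers.set("run_turn", handleRunTurn);
handlers.set("heartbeat", handleHeartbeat);

// Dispatch on message.type; unknown types get an error response.
function route(identity: Identity, message: ClientMessage): RouteResult {
  const handler = handlers.get(message.type);
  if (handler === undefined) {
    return { kind: "respond", event: { type: "error", code: "UNKNOWN_TYPE" } };
  }
  return handler(identity, message);
}

interface Sink {
  send(event: ServerEvent): void;
  publish(topic: string, event: ServerEvent): void;
}

// Turn a RouteResult into at most one delivery.
function deliver(result: RouteResult, sink: Sink): void {
  switch (result.kind) {
    case "respond":
      sink.send(result.event);
      return;
    case "broadcast":
      sink.publish(`session:${result.sessionId}`, result.event);
      return;
    case "none":
      return;
  }
}
```

Because the union is closed, the compiler can flag a deliver() that forgets a variant.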
6. Execution

The handler performs its work — querying the registry, creating a Podium connection, reserving billing credits, writing to SQLite, or forwarding to an upstream service.
7. Response

For respond results, the event is sent directly to the requesting client. For session-mutating operations (create, archive, delete, rename), the event is also broadcast to the tenant topic so other clients can update their session lists.
8. Streaming

For run_turn, the response is broadcast — all events from the agent are streamed to the session topic via the Broadcaster. The startEventStreamFiber consumes the Podium event stream, maps each event through PodiumEventMapper, publishes to the session topic, persists important events to SQLite, and dispatches to specialized event handlers for state management.

Per-Tenant Isolation

Diminuendo enforces tenant isolation at three levels:
  1. Authentication — every JWT contains a tenant_id claim. The gateway extracts this during authentication and attaches it to the connection’s identity. All subsequent operations are scoped to this tenant.
  2. Storage — each tenant has its own SQLite registry database at data/tenants/{tenantId}/registry.db. Session databases are stored at data/sessions/{sessionId}/session.db. A session can only be accessed by providing a sessionId that exists in the requesting tenant’s registry.
  3. Pub/Sub — topic subscriptions are namespaced by tenant. A client authenticated as tenant acme subscribes to tenant:acme:sessions; it will never receive events intended for tenant globex.
The sessionId is a UUID, so guessing a valid session ID across tenants is computationally infeasible. However, the gateway does not rely on this obscurity — it performs an explicit tenant ownership check on every session access.
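
The explicit ownership check can be sketched as a lookup in the requesting tenant’s own registry. In the sketch below an in-memory map stands in for the per-tenant registry.db files; the type names, the function name, and the SESSION_NOT_FOUND code are assumptions.

```typescript
interface Identity {
  readonly userId: string;
  readonly tenantId: string;
}

// tenantId -> session IDs; a stand-in for each tenant's registry database.
type Registries = Map<string, Set<string>>;

type AccessResult =
  | { ok: true; sessionId: string }
  | { ok: false; code: "SESSION_NOT_FOUND" };

// Only the caller's own registry is consulted, so a session that belongs
// to another tenant is indistinguishable from one that does not exist.
function checkSessionAccess(
  registries: Registries,
  identity: Identity,
  sessionId: string,
): AccessResult {
  const owned = registries.get(identity.tenantId);
  if (owned !== undefined && owned.has(sessionId)) {
    return { ok: true, sessionId };
  }
  return { ok: false, code: "SESSION_NOT_FOUND" };
}
```

Returning the same result for "missing" and "owned by another tenant" is a design choice of this sketch: the response gives no hint about which session IDs exist elsewhere.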
