# Performance Benchmark

Both Diminuendo and Crescendo connect to the same Podium (agent orchestrator) and Ensemble (LLM inference). Since agent processing time is constant across both gateways, the measured delta is the gateway overhead.
All benchmarks run locally on the same machine with shared backends. 10 warmup iterations are discarded before measurement begins.
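The measurement loop can be sketched roughly as follows (a hypothetical sketch, not the actual gateway-bench harness): run the warmup iterations and discard them, then time `n` samples and take nearest-rank percentiles over the sorted latencies.

```typescript
// Nearest-rank percentile on a pre-sorted sample (illustrative helper).
function percentile(sorted: number[], p: number): number {
  const idx = Math.min(sorted.length - 1, Math.ceil((p / 100) * sorted.length) - 1);
  return sorted[Math.max(0, idx)];
}

// Sketch of the benchmark loop: `warmup` discarded iterations, then `n`
// timed samples, reported as p50/p95/p99, mean, and stddev.
async function bench(fn: () => Promise<void>, n = 100, warmup = 10) {
  for (let i = 0; i < warmup; i++) await fn(); // discarded
  const samples: number[] = [];
  for (let i = 0; i < n; i++) {
    const t0 = performance.now();
    await fn();
    samples.push(performance.now() - t0);
  }
  samples.sort((a, b) => a - b);
  const mean = samples.reduce((s, x) => s + x, 0) / n;
  const stddev = Math.sqrt(samples.reduce((s, x) => s + (x - mean) ** 2, 0) / n);
  return {
    p50: percentile(samples, 50),
    p95: percentile(samples, 95),
    p99: percentile(samples, 99),
    mean,
    stddev,
  };
}
```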

## Test Environment

| Service | Port | Notes |
|---|---|---|
| Podium Gateway | 5083 | Shared — both gateways route here |
| Podium Coordinator | 5082 | Shared |
| Ensemble | 5180 | Shared |
| Crescendo | 8002 | Next.js on Bun (dev/turbo) |
| Diminuendo | 8080 | Bun + Effect TS |

## Results

### Health Endpoint

*100 iterations, 10 warmup*

| Metric | Diminuendo | Crescendo | Speedup |
|---|---|---|---|
| p50 | 0.6ms | 5.0ms | 8.4x faster |
| p95 | 1.1ms | 7.5ms | 6.8x faster |
| p99 | 1.4ms | 10.3ms | 7.3x faster |
| mean | 0.7ms | 5.6ms | 8.0x faster |
| stddev | 0.3ms | 1.6ms | 5.3x tighter |
| RPS | 10,390 | 291 | 35.7x throughput |
Crescendo checks 4 dependencies (PostgreSQL, Redis, Ensemble, Podium). Diminuendo checks 2 (Ensemble, Podium). Even accounting for 2 fewer sub-millisecond probes, the dominant cost is Next.js per-request middleware and routing overhead.
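A health handler of this shape typically fans its dependency probes out concurrently, so total latency tracks the slowest probe rather than the sum. The sketch below is hypothetical (the probe names and signatures are stand-ins, not either gateway's actual code):

```typescript
// Hypothetical health-check fan-out: probe every dependency concurrently
// and aggregate. Real probes would be HTTP requests to Ensemble and Podium;
// here they are arbitrary async functions returning true/false.
type Probe = () => Promise<boolean>;

async function healthCheck(
  probes: Record<string, Probe>,
): Promise<{ ok: boolean; deps: Record<string, boolean> }> {
  const names = Object.keys(probes);
  // allSettled so one slow or failing dependency can't reject the whole check.
  const results = await Promise.allSettled(names.map((n) => probes[n]()));
  const deps: Record<string, boolean> = {};
  results.forEach((r, i) => {
    deps[names[i]] = r.status === "fulfilled" && r.value === true;
  });
  return { ok: Object.values(deps).every(Boolean), deps };
}
```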

### Connection & Auth

*20 iterations*

| Metric | Diminuendo | Crescendo | Speedup |
|---|---|---|---|
| p50 | 0.4ms | 5.5ms | 15.7x faster |
| p95 | 0.5ms | 8.5ms | 17.0x faster |
Diminuendo establishes a WebSocket and auto-authenticates in dev mode with zero I/O. Crescendo sends `POST /api/e2e/seed`, which requires a PostgreSQL upsert round-trip.

### Session / Thread Creation

*50 iterations, 10 warmup*

| Metric | Diminuendo | Crescendo | Speedup |
|---|---|---|---|
| p50 | 0.6ms | 17.7ms | 27.6x faster |
| p95 | 0.9ms | 24.8ms | 27.6x faster |
| p99 | 0.9ms | 51.9ms | 57.7x faster |
| mean | 0.7ms | 19.1ms | 27.3x faster |
| stddev | 0.1ms | 8.9ms | 89x less variance |
| min | 0.5ms | 10.9ms | |
| max | 0.9ms | 75.9ms | |
Diminuendo’s sub-millisecond consistency (stddev 0.1ms) comes from in-process SQLite writes. Crescendo’s variance (stddev 8.9ms, max 75.9ms) reflects PostgreSQL network round-trips and Redis publish fan-out.

## Summary

| Metric | Diminuendo | Crescendo | Diminuendo advantage |
|---|---|---|---|
| Health p50 | 0.6ms | 5.0ms | 8.4x faster |
| Health RPS | 10,390 | 291 | 35.7x throughput |
| Auth/connect p50 | 0.4ms | 5.5ms | 15.7x faster |
| Session create p50 | 0.6ms | 17.7ms | 27.6x faster |
| Session create p95 | 0.9ms | 24.8ms | 27.6x faster |
| Session create jitter | 0.1ms stddev | 8.9ms stddev | 89x less variance |

## Why Diminuendo Is Faster

### Bun-native runtime

Bun’s native HTTP server with Effect TS avoids the Next.js middleware and routing stack entirely, saving ~4ms per request.

### WebSocket transport

Persistent connection eliminates per-request TCP handshake and cookie parsing. Auth is amortized to zero after the initial connect.
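The amortization claim can be made concrete with a toy model (illustrative arithmetic only, not part of the benchmark harness):

```typescript
// Toy model: a one-time WebSocket connect/auth cost spread across a
// session's requests. With the 0.4ms p50 connect figure above, a session
// of 100 requests pays 0.004ms of auth per request -- effectively zero --
// whereas per-request HTTP auth is paid on every single call.
function amortizedPerRequestMs(connectMs: number, requests: number): number {
  return connectMs / requests;
}
```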

### In-process SQLite

Zero-network writes save 10–15ms per database operation compared to PostgreSQL over TCP.

### In-process pub/sub

Bun’s built-in publish/subscribe avoids the Redis network hop, saving 1–2ms per event.
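The mechanism is easy to see in miniature: in-process fan-out is a map lookup plus direct callback invocations, with no serialization or network hop. The sketch below is an illustration of the idea, not Diminuendo's actual implementation (which uses Bun's built-in WebSocket publish/subscribe):

```typescript
// Minimal in-process pub/sub: topic -> set of handlers. Publishing is a
// synchronous fan-out, so there is no Redis round-trip to pay per event.
type Handler = (msg: unknown) => void;

class InProcessBus {
  private topics = new Map<string, Set<Handler>>();

  subscribe(topic: string, h: Handler): () => void {
    if (!this.topics.has(topic)) this.topics.set(topic, new Set());
    this.topics.get(topic)!.add(h);
    return () => this.topics.get(topic)?.delete(h); // unsubscribe function
  }

  publish(topic: string, msg: unknown): number {
    const subs = this.topics.get(topic);
    if (!subs) return 0;
    for (const h of subs) h(msg);
    return subs.size; // number of subscribers reached
  }
}
```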

## Raw Backend Baselines

Direct health-check latency to the shared backends (50 iterations), for reference:
| Backend | p50 | p95 |
|---|---|---|
| Podium | 0.37ms | 0.76ms |
| Ensemble | 0.24ms | 0.39ms |
These are the floor — all gateway overhead is additive on top.
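Subtracting that floor from a gateway's measured health p50 gives a rough estimate of the gateway's own per-request cost (an estimate only, treating the slower backend's p50 as the floor):

```typescript
// Crude decomposition: measured gateway p50 minus the shared-backend floor
// approximates gateway-side work. Figures are the p50 numbers above.
function gatewayOverheadMs(gatewayP50: number, backendFloorP50: number): number {
  return Math.max(0, gatewayP50 - backendFloorP50);
}

// Diminuendo health: 0.6ms - 0.37ms (Podium floor) => ~0.2ms of gateway work
// Crescendo health:  5.0ms - 0.37ms               => ~4.6ms of gateway work
```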

## Reproducing

```bash
# Prerequisites: Podium on :5083, Ensemble on :5180, both gateways running
cd ~/Projects/gateway-bench
bun install
bun run bench                            # all scenarios
bun run bench -- --scenarios health      # just health
bun run bench -- --scenarios session-create
```
The benchmark script auto-detects whether services are running and starts them if needed.
Benchmark run: 2026-03-03 — Bun 1.3.10, macOS arm64