Error Handling
Diminuendo employs a layered error handling strategy. Transport-level errors (malformed JSON, oversized messages, rate limiting) are caught at the WebSocket handler before any business logic executes. Domain-level errors (authentication failures, session not found, insufficient credits) are raised within the Effect TS runtime and mapped to structured error events. All errors are sanitized before reaching clients: stack traces are stripped, API keys are redacted, and messages are truncated to a safe length.
Error Event Format
All errors are delivered as a server event with type: "error". Each event carries three pieces of information:
- type: Always "error".
- Error code: A machine-readable error code. Use this for programmatic error handling (switch statements, retry logic, UI rendering).
- Message: A human-readable description. Safe for display to end users; guaranteed to contain no stack traces, no API keys, and no internal implementation details.
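As a rough illustration of how a client might consume this event, the sketch below narrows a parsed frame to an error event and branches on its code. The field names `code` and `message`, and the helper names `isGatewayErrorEvent` and `summarizeFrame`, are assumptions made for the example rather than confirmed wire names.

```typescript
// Assumed wire shape: `code` and `message` are illustrative field names.
interface GatewayErrorEvent {
  type: "error";
  code: string;
  message: string;
}

// Narrow an unknown parsed frame to a GatewayErrorEvent.
function isGatewayErrorEvent(value: unknown): value is GatewayErrorEvent {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return v.type === "error" && typeof v.code === "string" && typeof v.message === "string";
}

// Branch on the machine-readable code; show the human-readable message as-is.
function summarizeFrame(raw: string): string {
  const parsed: unknown = JSON.parse(raw);
  if (!isGatewayErrorEvent(parsed)) return "not an error event";
  switch (parsed.code) {
    case "RATE_LIMITED":
      return "slow down: " + parsed.message;
    case "AUTH_FAILED":
      return "re-authenticate: " + parsed.message;
    default:
      return "error: " + parsed.message;
  }
}
```

Switching on the code rather than matching on the message text keeps client logic stable if the wording of a message changes.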
Error events are not session-scoped: they do not carry sessionId, seq, or ts fields. They are sent directly to the connection that triggered the error, never broadcast to session subscribers.
Error Codes
Transport-Level Errors
These errors are raised at the WebSocket handler layer before the message reaches the business logic:
| Code | Condition | Description |
|---|---|---|
| INVALID_JSON | Message is not valid JSON | The raw WebSocket frame could not be parsed as JSON. Typically caused by incomplete messages, binary data sent to a text frame, or encoding errors. |
| INVALID_MESSAGE | JSON does not match any schema | The JSON was parsed successfully but does not match any of the 21 client message schemas. Common causes: unknown type field, missing required fields, wrong field types. |
| MESSAGE_TOO_LARGE | Raw message exceeds 1 MB | The message was rejected before parsing. The gateway enforces a 1 MB size limit on inbound WebSocket frames. |
| RATE_LIMITED | Per-connection rate limit exceeded | More than 60 messages in a 10-second sliding window. The client must slow down. |
| AUTH_RATE_LIMITED | Authentication attempts exceeded | Too many authenticate messages from the same IP address in a short period. Includes a retryAfterMs hint in the message text. |
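A client can avoid RATE_LIMITED by tracking its own send times. The following is a minimal sketch of a sliding-window limiter using the 60-messages-per-10-seconds figures from the table; the class name `SlidingWindowLimiter` and the exact boundary handling are assumptions, since the gateway's own accounting is not described here.

```typescript
// Client-side sliding-window limiter (a sketch; not the gateway's implementation).
class SlidingWindowLimiter {
  private readonly sentAt: number[] = [];
  private readonly maxMessages: number;
  private readonly windowMs: number;

  constructor(maxMessages: number = 60, windowMs: number = 10000) {
    this.maxMessages = maxMessages;
    this.windowMs = windowMs;
  }

  // Drop timestamps that have left the window (timestamps are kept in send order).
  private prune(nowMs: number): void {
    let expired = 0;
    for (const t of this.sentAt) {
      if (nowMs - t >= this.windowMs) expired++;
      else break;
    }
    this.sentAt.splice(0, expired);
  }

  // Returns true and records the send if the message fits in the current window.
  tryAcquire(nowMs: number = Date.now()): boolean {
    this.prune(nowMs);
    if (this.sentAt.length >= this.maxMessages) return false;
    this.sentAt.push(nowMs);
    return true;
  }

  // Milliseconds until the next send would be allowed (0 if allowed now).
  retryInMs(nowMs: number = Date.now()): number {
    this.prune(nowMs);
    if (this.sentAt.length < this.maxMessages) return 0;
    const oldest = this.sentAt[0];
    return oldest === undefined ? 0 : oldest + this.windowMs - nowMs;
  }
}
```

Calling `tryAcquire()` before each send and queueing the message for `retryInMs()` milliseconds when it returns false keeps the client under the limit without waiting for the server to reject anything.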
Authentication & Authorization Errors
| Code | Condition | Description |
|---|---|---|
| NOT_AUTHENTICATED | Message sent before authentication | The client attempted to send a message (other than authenticate) before completing authentication. |
| AUTH_FAILED | Token validation failed | The JWT or API key was invalid, expired, or could not be verified against the Auth0 JWKS endpoint. |
| Unauthenticated | Authentication required | A gateway-internal error indicating the operation requires authentication. |
| Unauthorized | Insufficient permissions | The authenticated user does not have the required RBAC permission for this operation (e.g., a member attempting manage_members with set_role action). |
Domain Errors
| Code | Condition | Description |
|---|---|---|
| SessionNotFound | Session does not exist | The referenced sessionId does not exist in the tenant’s registry, or the session has been deleted. |
| SessionAlreadyExists | Duplicate session creation | Attempted to create a session with an ID that already exists (rare, since UUIDs are generated server-side). |
| INSUFFICIENT_CREDITS | Billing check failed | The tenant does not have enough credits to start a turn. No agent interaction occurs. |
| PodiumConnectionError | Agent connection failed | Failed to establish or maintain a WebSocket connection to the Podium agent orchestrator. |
| PodiumTimeout | Agent operation timed out | A Podium operation (instance creation, message send) exceeded its timeout threshold. |
| EnsembleError | LLM inference failure | An error from the Ensemble inference service (model unavailable, provider error, token limit exceeded). |
| SandboxNotConfigured | No sandbox available | The operation requires a sandbox environment, but none is configured for this session. |
| DbError | Database operation failed | A SQLite operation failed (disk full, corruption, concurrent write conflict). |
| ProtocolVersionMismatch | Version mismatch | The client and gateway disagree on the protocol version. |
| INTERNAL_ERROR | Unclassified error | A catch-all for unexpected failures. The message is sanitized before delivery. |
Turn-Specific Errors
Turn errors are delivered via the turn_error event type (not the generic error event) because they are session-scoped and carry seq/ts fields:
| Code | Condition | Description |
|---|---|---|
| AGENT_ERROR | Agent reported an error | The Podium agent encountered an error during execution (e.g., LLM API failure, tool crash). |
| AGENT_DISCONNECTED | WebSocket dropped mid-turn | The gateway lost its WebSocket connection to the Podium agent while a turn was in progress. The session transitions to error state. |
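Because the two event types differ in scope, a client usually wants to route them to different handlers. The sketch below models both as a discriminated union on type; the `code` and `message` field names, the presence of sessionId on turn_error, and the numeric types of seq and ts are assumptions for illustration.

```typescript
// Assumed shapes: only the `type` values and the seq/ts distinction come from the text above.
interface ConnectionError {
  type: "error";
  code: string;
  message: string;
}

interface TurnError {
  type: "turn_error";
  sessionId: string; // assumed: session-scoped events identify their session
  seq: number;
  ts: number;
  code: string;
  message: string;
}

type ServerErrorEvent = ConnectionError | TurnError;

// Route by scope: turn errors belong to a session, generic errors to the connection.
function routeError(event: ServerErrorEvent): string {
  if (event.type === "turn_error") {
    return "session " + event.sessionId + " (seq " + event.seq + "): " + event.code;
  }
  return "connection: " + event.code;
}
```

Checking `event.type` first lets the compiler enforce that sessionId and seq are only read on turn errors.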
Member Management Errors
| Code | Condition | Description |
|---|---|---|
| INVALID_MEMBER_UPDATE | Missing or invalid fields | The set_role action was missing userId or role, or the role is not a valid value. |
| LAST_OWNER_PROTECTED | Cannot demote last owner | The operation would leave the tenant with no owners. Promote another member to owner first. |
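The two checks in this table can be pictured as a small validation step. This is only a sketch under stated assumptions: the role set ("owner" and "member"), the request shape, and the function name `validateSetRole` are hypothetical, and the gateway may define additional roles or checks.

```typescript
// Assumed role set; the real gateway may define more roles.
type Role = "owner" | "member";

interface Member {
  userId: string;
  role: Role;
}

// Hypothetical request shape for a set_role action.
interface SetRoleRequest {
  userId?: string;
  role?: string;
}

// Returns the error code that would apply, or null if the update looks acceptable.
// An unknown userId is not handled here; that case is outside this sketch.
function validateSetRole(members: Member[], req: SetRoleRequest): string | null {
  const roleIsValid = req.role === "owner" || req.role === "member";
  if (!req.userId || !roleIsValid) return "INVALID_MEMBER_UPDATE";

  const target = members.find((m) => m.userId === req.userId);
  const ownerCount = members.filter((m) => m.role === "owner").length;
  const demotesLastOwner =
    target !== undefined && target.role === "owner" && req.role !== "owner" && ownerCount === 1;
  return demotesLastOwner ? "LAST_OWNER_PROTECTED" : null;
}
```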
Error Sanitization
All error messages pass through a sanitization pipeline before reaching clients. This is a critical security measure: internal errors may contain stack traces, file paths, API keys, or other sensitive information that must never be exposed to end users. The sanitization process applies three transformations in order:
1. Strip Stack Traces
Any line matching the pattern of a stack trace (at Function.name (/path/to/file.ts:42:10)) is removed. This prevents leaking internal file paths and call stacks.
2. Redact Secrets
The following patterns are replaced with [REDACTED]:
- Anthropic API keys: sk-ant-*
- Generic secret keys: sk-*
- GitHub personal access tokens: ghp_*
- Bearer tokens: Bearer *
- URL token parameters: token=*
3. Truncate
Messages exceeding 500 characters are truncated with an ellipsis (...). This prevents pathologically long error messages (e.g., from serialized stack traces or large JSON payloads) from bloating WebSocket frames.
Gateway Typed Errors
Internally, the gateway uses Effect’s TaggedError pattern to define typed, structured error classes. These provide type-safe error handling within the Effect runtime and are mapped to error codes when sent to clients.
Full Error Type Catalog
| Error Class | Tag | Fields | Wire Code |
|---|---|---|---|
| Unauthenticated | "Unauthenticated" | reason: string | Unauthenticated |
| Unauthorized | "Unauthorized" | tenantId: string, resource: string | Unauthorized |
| SessionNotFound | "SessionNotFound" | sessionId: string | SessionNotFound |
| SessionAlreadyExists | "SessionAlreadyExists" | sessionId: string | SessionAlreadyExists |
| InsufficientCredits | "InsufficientCredits" | tenantId: string, required: number, available: number | INSUFFICIENT_CREDITS |
| PaymentFailed | "PaymentFailed" | reason: string, stripeError?: string | PaymentFailed |
| PodiumConnectionError | "PodiumConnectionError" | message: string, cause?: unknown | PodiumConnectionError |
| PodiumTimeout | "PodiumTimeout" | operation: string, timeoutMs: number | PodiumTimeout |
| EnsembleError | "EnsembleError" | message: string, statusCode?: number | EnsembleError |
| SandboxNotConfigured | "SandboxNotConfigured" | message: string | SandboxNotConfigured |
| DbError | "DbError" | message: string, cause?: unknown | DbError |
| InvalidMessage | "InvalidMessage" | reason: string, raw?: string | InvalidMessage |
| ProtocolVersionMismatch | "ProtocolVersionMismatch" | expected: number, received: number | ProtocolVersionMismatch |
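To make the Tag and Wire Code columns concrete, here is a minimal sketch of a tag-to-wire mapping combined with the three sanitization steps from the Error Sanitization section. It uses plain objects with a `_tag` field as stand-ins for Effect's TaggedError classes so that it runs without dependencies. The regular expressions, the predefined message strings, the `code` and `message` field names, and the `toWireError` helper are all assumptions; whether truncation counts the ellipsis toward the 500-character limit is also assumed.

```typescript
const MAX_MESSAGE_LENGTH = 500;
// Matches lines such as "at Function.name (/path/to/file.ts:42:10)".
const STACK_LINE = /^\s*at\s.*:\d+:\d+\)?\s*$/;
// Illustrative patterns for the secret formats listed in the sanitization section.
const SECRET_PATTERNS: RegExp[] = [
  /sk-ant-[A-Za-z0-9_-]+/g, // Anthropic API keys
  /sk-[A-Za-z0-9_-]+/g, // generic secret keys
  /ghp_[A-Za-z0-9]+/g, // GitHub personal access tokens
  /Bearer\s+[^\s"']+/g, // bearer tokens
  /token=[^\s&"']+/g, // URL token parameters
];

function sanitizeErrorMessage(raw: string): string {
  // 1. Strip stack-trace lines.
  let text = raw
    .split("\n")
    .filter((line) => !STACK_LINE.test(line))
    .join("\n")
    .trim();
  // 2. Redact secrets.
  for (const pattern of SECRET_PATTERNS) {
    text = text.replace(pattern, "[REDACTED]");
  }
  // 3. Truncate to 500 characters and append an ellipsis.
  return text.length > MAX_MESSAGE_LENGTH ? text.slice(0, MAX_MESSAGE_LENGTH) + "..." : text;
}

// Plain-object stand-ins for three catalog entries.
type GatewayError =
  | { _tag: "SessionNotFound"; sessionId: string }
  | { _tag: "InsufficientCredits"; tenantId: string; required: number; available: number }
  | { _tag: "PodiumConnectionError"; message: string; cause?: unknown };

interface WireError {
  type: "error";
  code: string;
  message: string;
}

function toWireError(err: GatewayError): WireError {
  let code: string;
  let message: string;
  switch (err._tag) {
    case "SessionNotFound":
      code = "SessionNotFound";
      message = "Session not found"; // hypothetical predefined message
      break;
    case "InsufficientCredits":
      code = "INSUFFICIENT_CREDITS"; // the one tag whose wire code differs from its name
      message = "Insufficient credits"; // hypothetical predefined message
      break;
    case "PodiumConnectionError":
      code = "PodiumConnectionError";
      message = err.message; // free-form text, so it relies on sanitization below
      break;
  }
  return { type: "error", code, message: sanitizeErrorMessage(message) };
}
```

Sanitizing every outgoing message, including the predefined ones, is a defensive choice in this sketch: it means a new error type added without a predefined message cannot bypass the pipeline.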
The gateway maps the _tag field of each TaggedError to the wire code and provides safe, predefined messages for known error types. Error messages are passed through sanitizeErrorMessage before being sent to the client.
Recovery Strategies
Different error codes call for different recovery approaches. The table below provides guidance for client implementations:
| Error Code | Strategy | Details |
|---|---|---|
| AUTH_FAILED | Re-authenticate | The token may be expired. Obtain a fresh JWT from your identity provider and send a new authenticate message. |
| AUTH_RATE_LIMITED | Exponential backoff | Parse the retry delay from the error message. Wait at least that long before attempting authentication again. |
| NOT_AUTHENTICATED | Re-authenticate | The connection may have been reset. Send an authenticate message before retrying the original operation. |
| Unauthorized | Surface to user | The user lacks the required permission. Display an appropriate message and do not retry. |
| SessionNotFound | Refresh session list | The session may have been deleted by another client or a concurrent operation. Call list_sessions to refresh the UI. |
| INSUFFICIENT_CREDITS | Surface to user | The tenant has exhausted its credits. Direct the user to the billing interface to add credits before retrying. |
| RATE_LIMITED | Backoff and retry | Reduce message frequency. Implement a client-side rate limiter to stay within 60 messages per 10 seconds. |
| PodiumConnectionError | Retry with backoff | The agent backend may be temporarily unavailable. Wait 2-5 seconds and retry the operation. If persistent, the agent infrastructure may be down. |
| AGENT_DISCONNECTED | Offer retry | The agent’s WebSocket connection dropped mid-turn. The session is in error state. The user can retry the turn with run_turn. |
| AGENT_ERROR | Offer retry | The agent encountered an internal error. The user can retry the turn. If the error persists, it may indicate a problem with the agent’s configuration or the underlying LLM. |
| INTERNAL_ERROR | Retry once, then surface | An unexpected error. Retry the operation once. If it fails again, surface the error to the user and consider reconnecting. |
| MESSAGE_TOO_LARGE | Reduce payload | The message exceeds 1 MB. Reduce the size of the content (e.g., truncate very long prompts). |
| INVALID_JSON | Fix client bug | The client is sending malformed JSON. This is always a client-side bug. |
| INVALID_MESSAGE | Fix client bug | The message structure is wrong. Check field names, types, and required fields against this documentation. |
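One way to apply the table in client code is a lookup from error code to a coarse recovery action, plus a capped exponential backoff for the retrying strategies. This is a sketch only: the action names, the fallback for unlisted codes, and the backoff base and cap values are assumptions, not part of the protocol.

```typescript
type Recovery =
  | "reauthenticate"
  | "backoff"
  | "surface"
  | "refresh_sessions"
  | "offer_retry"
  | "retry_once"
  | "reduce_payload"
  | "fix_client";

// Mirrors the Strategy column of the table above.
const RECOVERY = new Map<string, Recovery>([
  ["AUTH_FAILED", "reauthenticate"],
  ["NOT_AUTHENTICATED", "reauthenticate"],
  ["AUTH_RATE_LIMITED", "backoff"],
  ["RATE_LIMITED", "backoff"],
  ["PodiumConnectionError", "backoff"],
  ["Unauthorized", "surface"],
  ["INSUFFICIENT_CREDITS", "surface"],
  ["SessionNotFound", "refresh_sessions"],
  ["AGENT_DISCONNECTED", "offer_retry"],
  ["AGENT_ERROR", "offer_retry"],
  ["INTERNAL_ERROR", "retry_once"],
  ["MESSAGE_TOO_LARGE", "reduce_payload"],
  ["INVALID_JSON", "fix_client"],
  ["INVALID_MESSAGE", "fix_client"],
]);

// Codes not listed in the table fall back to surfacing the error (an assumption).
function recoveryFor(code: string): Recovery {
  return RECOVERY.get(code) ?? "surface";
}

// Capped exponential backoff; attempt 0 is the first retry. Defaults are illustrative.
function backoffDelayMs(attempt: number, baseMs: number = 1000, maxMs: number = 30000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}
```

For AUTH_RATE_LIMITED, the delay parsed from the error message should take precedence over the computed backoff whenever it is larger.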
The SDKs handle several of these recovery strategies automatically. The TypeScript SDK’s autoReconnect option re-establishes the connection and re-authenticates on disconnect. The Python SDK’s auto_reconnect does the same. For session-level recovery (re-joining with afterSeq), the client application must implement its own logic: the SDKs provide the primitives but do not make assumptions about which sessions to rejoin.
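Session-level recovery needs the client to remember how far it has read in each session. The sketch below tracks the highest seq seen per session and produces the sessionId and afterSeq pairs to request after a reconnect. The class name `SessionCursor` and its methods are hypothetical, and the sketch deliberately does not show the rejoin message itself, whose shape is not covered in this section.

```typescript
// Tracks the last seq observed per session so the application can rejoin after a reconnect.
class SessionCursor {
  private readonly lastSeq = new Map<string, number>();

  // Record a session-scoped event; out-of-order or duplicate seq values are ignored.
  observe(sessionId: string, seq: number): void {
    const prev = this.lastSeq.get(sessionId);
    if (prev === undefined || seq > prev) this.lastSeq.set(sessionId, seq);
  }

  // Stop tracking a session the user has left or that no longer exists.
  forget(sessionId: string): void {
    this.lastSeq.delete(sessionId);
  }

  // Sessions to rejoin, each with the afterSeq value to request.
  rejoinPlan(): Array<{ sessionId: string; afterSeq: number }> {
    const plan: Array<{ sessionId: string; afterSeq: number }> = [];
    this.lastSeq.forEach((afterSeq, sessionId) => {
      plan.push({ sessionId, afterSeq });
    });
    return plan;
  }
}
```

The application decides which sessions to keep in the cursor, for example only those with an open view, which matches the point that the SDKs leave this choice to the caller.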