Session State Machine

Every agent session in Diminuendo follows a finite-state lifecycle. Plain status strings such as "idle" and "running" cannot enforce valid transitions: a session left in "running" after a crash has no recovery path, and clients keep displaying stale state. The gateway therefore needs a formal model: a set of states, a set of legal transitions, and pure functions that compute the next state from the current state and an incoming agent signal. Diminuendo’s session state machine defines seven states, an explicit transition guard map, and a deterministic mapping from agent-reported status values to state transitions. The model was ported from the Crescendo desktop client’s connection-state.ts and adapted for server-side enforcement.

The Seven States

          +--------------------------------------------------+
          |                                                  |
          v                                                  |
    +-----------+         +------------+         +---------+ |
    | inactive  | ------> | activating | ------> |  ready  | |
    +-----------+         +------------+         +---------+ |
          ^                  |    |                |   |   |  |
          |                  |    |                |   |   |  |
          |                  v    |                v   |   |  |
          |              +-------+|          +---------+  |  |
          |              | error  |          | running |  |  |
          |              +-------+           +---------+  |  |
          |                  |                 |   |       |  |
          |                  |                 |   |       |  |
          |                  v                 v   |       |  |
          |              +-----------+    +---------+     |  |
          +--------------| deactiv.  |    | waiting |     |  |
                         +-----------+    +---------+     |  |
                              ^               |           |  |
                              +---------------+-----------+--+

1. inactive

No Podium connection exists. The session is metadata only — a row in the tenant’s SQLite registry. This is the resting state for sessions that have been created but not yet activated, or that have been explicitly torn down.

2. activating

The gateway is creating a Podium agent instance and establishing a WebSocket connection to it. This is a transient state that resolves to either ready (on success), error (on failure), or inactive (if the activation was cancelled before completion).

3. ready

The Podium connection is established and the agent is idle, awaiting user input. From here, the session can begin processing a turn (running), be torn down (deactivating), be returned to inactive, or encounter a failure (error).

4. running

The agent is actively processing a turn — streaming text, invoking tools, or performing multi-step reasoning. This state persists until the turn completes (ready), the agent requests user interaction (waiting), a failure occurs (error), or a tear-down is initiated (deactivating).

5. waiting

The agent is blocked on user interaction. This occurs when the agent issues a question_requested or permission_requested event. The session remains in waiting until the user responds (transitioning back to running), a failure occurs (error), or the session is torn down (deactivating).

6. deactivating

Tear-down is in progress. The Podium instance is being stopped and the WebSocket connection is being closed. This resolves to inactive on success or error if the tear-down itself fails.

7. error

A failure has occurred that the session cannot recover from in place. From error, the session can only transition to inactive (resetting to a clean slate) or activating (attempting to reconnect). There is no direct path from error back to ready or running; recovery must pass through one of these two states.
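
The SessionState type is referenced throughout this page but never shown. A minimal sketch of how the seven states could be modeled follows; the names SESSION_STATES and isSessionState are hypothetical and are not taken from the Diminuendo source.

```typescript
// Hypothetical sketch: the seven states as a literal tuple, so the list
// exists at runtime as well as at the type level.
const SESSION_STATES = [
  "inactive",
  "activating",
  "ready",
  "running",
  "waiting",
  "deactivating",
  "error",
] as const

// Union of the seven string literals above.
type SessionState = (typeof SESSION_STATES)[number]

// Runtime guard for values read from storage or received over the wire.
function isSessionState(value: unknown): value is SessionState {
  return (
    typeof value === "string" &&
    (SESSION_STATES as readonly string[]).indexOf(value) !== -1
  )
}
```

Deriving the type from the array keeps the runtime list and the compile-time union from drifting apart, which matters when status values are read back from a database.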

Transition Guard Map

The VALID_TRANSITIONS constant defines the complete set of legal state transitions as a Record<SessionState, ReadonlySet<SessionState>>. Any transition not present in this map is rejected:
export const VALID_TRANSITIONS: Record<SessionState, ReadonlySet<SessionState>> = {
  inactive:     new Set(["activating"]),
  activating:   new Set(["ready", "error", "inactive"]),
  ready:        new Set(["running", "deactivating", "inactive", "error"]),
  running:      new Set(["ready", "waiting", "error", "deactivating"]),
  waiting:      new Set(["running", "error", "deactivating"]),
  deactivating: new Set(["inactive", "error"]),
  error:        new Set(["inactive", "activating"]),
}
Expressed as a table:
| From | Allowed Targets |
| --- | --- |
| inactive | activating |
| activating | ready, error, inactive |
| ready | running, deactivating, inactive, error |
| running | ready, waiting, error, deactivating |
| waiting | running, error, deactivating |
| deactivating | inactive, error |
| error | inactive, activating |
There is no transition from error to ready or running. Recovery from an error state always requires passing through inactive or activating first. This prevents the gateway from silently resuming a session whose underlying Podium connection may be in an unknown state.
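
To illustrate how the guard map can be queried, here is a small sketch. The map is repeated so the example is self-contained; the helpers canTransition and reachableFrom are hypothetical and not part of the code shown on this page.

```typescript
type SessionState =
  | "inactive" | "activating" | "ready" | "running"
  | "waiting" | "deactivating" | "error"

// Same guard map as above.
const VALID_TRANSITIONS: Record<SessionState, ReadonlySet<SessionState>> = {
  inactive:     new Set<SessionState>(["activating"]),
  activating:   new Set<SessionState>(["ready", "error", "inactive"]),
  ready:        new Set<SessionState>(["running", "deactivating", "inactive", "error"]),
  running:      new Set<SessionState>(["ready", "waiting", "error", "deactivating"]),
  waiting:      new Set<SessionState>(["running", "error", "deactivating"]),
  deactivating: new Set<SessionState>(["inactive", "error"]),
  error:        new Set<SessionState>(["inactive", "activating"]),
}

// Hypothetical helper: is a single step from `from` to `to` legal?
function canTransition(from: SessionState, to: SessionState): boolean {
  return VALID_TRANSITIONS[from].has(to)
}

// Hypothetical helper: every state reachable from `start` in zero or more
// legal steps, found by breadth-first search over the guard map.
function reachableFrom(start: SessionState): Set<SessionState> {
  const seen = new Set<SessionState>([start])
  const queue: SessionState[] = [start]
  while (queue.length > 0) {
    const state = queue.shift() as SessionState
    VALID_TRANSITIONS[state].forEach((next) => {
      if (!seen.has(next)) {
        seen.add(next)
        queue.push(next)
      }
    })
  }
  return seen
}

// canTransition("error", "ready") is false, but "ready" is still in
// reachableFrom("error") because error -> activating -> ready is legal.
```

A reachability check like this is useful in a unit test: it can confirm that every state has some legal path back to inactive, so no session can be trapped by the map itself.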

Agent Status Mapping

The applySessionTransition function is a pure function that computes the next session state from the current state and an agent-reported status. It returns null when the status maps to no state or when the resulting transition is not in the guard map, so the caller can log and skip the transition rather than silently applying it.
export function applySessionTransition(
  current: SessionState,
  agentStatus: AgentStatus,
): SessionState | null {
  const next = agentStatusToState(current, agentStatus)
  if (next === null) return null
  if (!VALID_TRANSITIONS[current].has(next)) return null
  return next
}
The ten recognized agent status values map to session states as follows:
| Agent Status | Target State | Notes |
| --- | --- | --- |
| created | activating | Podium instance created, connecting |
| connected | ready | WebSocket handshake complete |
| turn_started | running | Agent began processing |
| turn_complete | ready | Turn finished successfully |
| turn_error | ready or error | ready if currently running or waiting; error otherwise |
| question_requested | waiting | Agent needs user input |
| approval_resolved | running | User responded to question/permission |
| terminating | deactivating | Graceful shutdown initiated |
| terminated | inactive | Shutdown complete |
| error | error | Unrecoverable failure |
The turn_error status is context-dependent: if the session is currently running or waiting, a turn error is treated as recoverable and maps to ready. In any other state it maps to error. Note that the guard map above lists no waiting -> ready transition, so with that map applySessionTransition returns null for a turn_error received while waiting.
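
The agentStatusToState helper called by applySessionTransition is not shown on this page. The following is a sketch of what it might look like, reconstructed only from the mapping table; the AgentStatus union and the body of agentStatusToState are assumptions, not the actual implementation.

```typescript
type SessionState =
  | "inactive" | "activating" | "ready" | "running"
  | "waiting" | "deactivating" | "error"

// Assumed union of the ten status values listed in the table.
type AgentStatus =
  | "created" | "connected" | "turn_started" | "turn_complete" | "turn_error"
  | "question_requested" | "approval_resolved" | "terminating" | "terminated" | "error"

const VALID_TRANSITIONS: Record<SessionState, ReadonlySet<SessionState>> = {
  inactive:     new Set<SessionState>(["activating"]),
  activating:   new Set<SessionState>(["ready", "error", "inactive"]),
  ready:        new Set<SessionState>(["running", "deactivating", "inactive", "error"]),
  running:      new Set<SessionState>(["ready", "waiting", "error", "deactivating"]),
  waiting:      new Set<SessionState>(["running", "error", "deactivating"]),
  deactivating: new Set<SessionState>(["inactive", "error"]),
  error:        new Set<SessionState>(["inactive", "activating"]),
}

// Hypothetical reconstruction of the table: status -> proposed target state.
function agentStatusToState(current: SessionState, status: AgentStatus): SessionState | null {
  switch (status) {
    case "created":            return "activating"
    case "connected":          return "ready"
    case "turn_started":       return "running"
    case "turn_complete":      return "ready"
    case "turn_error":
      // Recoverable mid-turn; treated as a hard failure anywhere else.
      return current === "running" || current === "waiting" ? "ready" : "error"
    case "question_requested": return "waiting"
    case "approval_resolved":  return "running"
    case "terminating":        return "deactivating"
    case "terminated":         return "inactive"
    case "error":              return "error"
    default:                   return null // unrecognized status (assumption)
  }
}

// Same shape as the function shown above: map first, then check the guard.
function applySessionTransition(current: SessionState, agentStatus: AgentStatus): SessionState | null {
  const next = agentStatusToState(current, agentStatus)
  if (next === null) return null
  if (!VALID_TRANSITIONS[current].has(next)) return null
  return next
}

// applySessionTransition("running", "turn_error") yields "ready";
// applySessionTransition("inactive", "turn_started") yields null because
// inactive -> running is not in the guard map.
```

Keeping the mapping and the guard check as two separate steps means the table can stay simple while the guard map remains the single authority on legality.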

Enforcement: transitionSessionState

The transitionSessionState helper in MessageRouterLive.ts is the single point through which all state transitions flow. It validates the proposed transition against the guard map, updates the session’s ConnectionState ref, persists the new status to the registry database, and broadcasts the change to all subscribers:
const transitionSessionState = (
  tenantId: string,
  sessionId: string,
  cs: ConnectionState,
  newState: SessionState,
) =>
  Effect.gen(function* () {
    const current = yield* Ref.get(cs.sessionState)
    if (current !== newState && !VALID_TRANSITIONS[current].has(newState)) {
      yield* Effect.logWarning(
        `Invalid state transition for session ${sessionId}: ${current} -> ${newState} (rejected)`
      )
      return
    }
    yield* Ref.set(cs.sessionState, newState)
    yield* registry.updateStatus(tenantId, sessionId, newState).pipe(Effect.ignore)
    yield* broadcaster.tenantEvent(tenantId, {
      type: "session_updated",
      session: { id: sessionId, status: newState },
    })
  })
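Invalid transitions are logged and skipped: they are never silently applied, and they never throw. A transition to the state the session is already in bypasses the guard check, so it is persisted and broadcast again. A failed registry write is discarded by Effect.ignore, so the in-memory update and the broadcast proceed regardless. This defensive posture ensures that a misbehaving upstream agent cannot drive the gateway’s state machine into an illegal configuration.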
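
The same validate, set, persist, broadcast sequence can be sketched without Effect, which makes the "rejected transitions have no side effects" property easy to unit test. Everything here (TransitionDeps, the cell object, the transition function) is a hypothetical stand-in, not the gateway's code.

```typescript
type SessionState =
  | "inactive" | "activating" | "ready" | "running"
  | "waiting" | "deactivating" | "error"

const VALID_TRANSITIONS: Record<SessionState, ReadonlySet<SessionState>> = {
  inactive:     new Set<SessionState>(["activating"]),
  activating:   new Set<SessionState>(["ready", "error", "inactive"]),
  ready:        new Set<SessionState>(["running", "deactivating", "inactive", "error"]),
  running:      new Set<SessionState>(["ready", "waiting", "error", "deactivating"]),
  waiting:      new Set<SessionState>(["running", "error", "deactivating"]),
  deactivating: new Set<SessionState>(["inactive", "error"]),
  error:        new Set<SessionState>(["inactive", "activating"]),
}

// Hypothetical stand-ins for the registry, broadcaster, and logger.
interface TransitionDeps {
  persist: (sessionId: string, state: SessionState) => void
  broadcast: (sessionId: string, state: SessionState) => void
  warn: (message: string) => void
}

// Synchronous sketch: returns true if the transition was applied.
function transition(
  cell: { state: SessionState },
  sessionId: string,
  next: SessionState,
  deps: TransitionDeps,
): boolean {
  const current = cell.state
  // Same guard as above: a self-transition passes, anything else must be in the map.
  if (current !== next && !VALID_TRANSITIONS[current].has(next)) {
    deps.warn(`Invalid state transition for session ${sessionId}: ${current} -> ${next} (rejected)`)
    return false // nothing persisted, nothing broadcast
  }
  cell.state = next
  deps.persist(sessionId, next)
  deps.broadcast(sessionId, next)
  return true
}
```

Passing the side effects in as plain functions lets a test record them in arrays and assert that a rejected transition produced a warning and nothing else.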

ConnectionState: Per-Connection Typed Refs

Each active session gets its own ConnectionState — a struct of Effect Ref values that track the full in-flight state of the session. This replaces the scattered, untyped state that accumulates in less structured architectures.
export interface ConnectionState {
  // Turn tracking
  readonly turnId: Ref.Ref<string | null>
  readonly fullContent: Ref.Ref<string>
  readonly stopRequested: Ref.Ref<boolean>
  readonly turnStopped: Ref.Ref<boolean>

  // Tool call tracking
  readonly pendingToolCalls: Ref.Ref<Map<string, { toolName: string; startedAt: number }>>
  readonly completedToolIds: Ref.Ref<Set<string>>
  readonly persistedToolCallIds: Ref.Ref<Set<string>>

  // Thinking
  readonly isThinking: Ref.Ref<boolean>
  readonly thinkingContent: Ref.Ref<string>

  // Interactive
  readonly deferredInteractiveMessage: Ref.Ref<DeferredInteractiveMessage | null>
  readonly pendingApproval: Ref.Ref<boolean>

  // Billing
  readonly currentReservation: Ref.Ref<CreditReservation | null>
  readonly lastContextUsage: Ref.Ref<ContextUsage | null>

  // Sequencing
  readonly messageIdStack: Ref.Ref<string[]>
  readonly acked: Ref.Ref<boolean>

  // Session state
  readonly sessionState: Ref.Ref<SessionState>
}

resetTurnState

At the start of each new turn, resetTurnState sets the turn-specific refs back to their initial values so that no state leaks between turns. As written, it leaves messageIdStack and sessionState untouched:
export const resetTurnState = (cs: ConnectionState): Effect.Effect<void> =>
  Effect.gen(function* () {
    yield* Ref.set(cs.turnId, null)
    yield* Ref.set(cs.fullContent, "")
    yield* Ref.set(cs.stopRequested, false)
    yield* Ref.set(cs.turnStopped, false)
    yield* Ref.set(cs.pendingToolCalls, new Map())
    yield* Ref.set(cs.completedToolIds, new Set())
    yield* Ref.set(cs.persistedToolCallIds, new Set())
    yield* Ref.set(cs.isThinking, false)
    yield* Ref.set(cs.thinkingContent, "")
    yield* Ref.set(cs.deferredInteractiveMessage, null)
    yield* Ref.set(cs.pendingApproval, false)
    yield* Ref.set(cs.currentReservation, null)
    yield* Ref.set(cs.lastContextUsage, null)
    yield* Ref.set(cs.acked, false)
  })

Stale Session Recovery

After a gateway restart, no Podium connections survive. Any session that was not in the inactive state at shutdown is stale by definition: its ConnectionState refs no longer exist, and its Podium WebSocket is gone. The reconcileStaleSessions function runs on startup for each tenant. It queries all sessions whose status is not "inactive" (via registry.listNonIdle) and resets them to "inactive":
export const reconcileStaleSessions = (tenantId: string) =>
  Effect.gen(function* () {
    const registry = yield* SessionRegistryService
    const staleSessions = yield* registry.listNonIdle(tenantId)

    if (staleSessions.length === 0) {
      yield* Effect.logDebug(`StaleRecovery: no stale sessions for tenant ${tenantId}`)
      return { recovered: 0, total: 0 }
    }

    yield* Effect.forEach(
      staleSessions,
      (session) =>
        Effect.gen(function* () {
          yield* registry.updateStatus(tenantId, session.id, "inactive").pipe(Effect.ignore)
          yield* Effect.logInfo(
            `StaleRecovery: reset session ${session.id} from "${session.status}" to "inactive"`
          )
        }),
      { concurrency: 5 },
    )
    // ...
  })
Stale session recovery runs with a concurrency of 5 to avoid overwhelming the SQLite writer on startup when many sessions need to be reset. Each reset is idempotent — running it twice has no additional effect.
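
The idempotence claim can be illustrated on plain in-memory rows. This sketch leaves out SQLite, Effect, and the concurrency limit entirely; SessionRow and resetStaleRows are hypothetical names used only for the illustration.

```typescript
type SessionState =
  | "inactive" | "activating" | "ready" | "running"
  | "waiting" | "deactivating" | "error"

// Hypothetical in-memory stand-in for a registry row.
interface SessionRow {
  id: string
  status: SessionState
}

// Resets every row whose status is not "inactive" and returns how many
// rows were changed. Rows that are already inactive are left alone.
function resetStaleRows(rows: SessionRow[]): number {
  let reset = 0
  for (const row of rows) {
    if (row.status !== "inactive") {
      row.status = "inactive"
      reset += 1
    }
  }
  return reset
}

const rows: SessionRow[] = [
  { id: "a", status: "running" },
  { id: "b", status: "inactive" },
  { id: "c", status: "waiting" },
]
const firstPass = resetStaleRows(rows)  // changes rows "a" and "c"
const secondPass = resetStaleRows(rows) // finds nothing left to change
```

Because the second pass finds no non-inactive rows, re-running recovery after a partial failure or a second restart is safe.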

Legacy State Migration

The state machine also provides a migration path from the earlier 4-state model (idle, running, awaiting_question, error) to the current 7-state model:
export function migrateLegacyStatus(legacy: string): SessionState {
  switch (legacy) {
    case "idle":               return "inactive"
    case "running":            return "running"
    case "awaiting_question":  return "waiting"
    case "error":              return "error"
    default:                   return "inactive"
  }
}
The protocol schema also accepts legacy status values on read ("idle" and "awaiting_question") for backwards compatibility, though new code never emits them.
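
One consequence of the default branch is that migrateLegacyStatus maps any unrecognized string to "inactive", including current-model values such as "ready". A reader that may see either model could guard against this with a wrapper like the one below; normalizeStatus and SESSION_STATES are hypothetical names, not part of the code shown on this page.

```typescript
type SessionState =
  | "inactive" | "activating" | "ready" | "running"
  | "waiting" | "deactivating" | "error"

const SESSION_STATES: readonly SessionState[] = [
  "inactive", "activating", "ready", "running", "waiting", "deactivating", "error",
]

// Same function as above, repeated so the sketch is self-contained.
function migrateLegacyStatus(legacy: string): SessionState {
  switch (legacy) {
    case "idle":               return "inactive"
    case "running":            return "running"
    case "awaiting_question":  return "waiting"
    case "error":              return "error"
    default:                   return "inactive"
  }
}

// Hypothetical wrapper: pass current-model values through unchanged and
// only fall back to the legacy migration for everything else.
function normalizeStatus(raw: string): SessionState {
  const isCurrent = SESSION_STATES.indexOf(raw as SessionState) !== -1
  return isCurrent ? (raw as SessionState) : migrateLegacyStatus(raw)
}

// migrateLegacyStatus("ready") returns "inactive",
// while normalizeStatus("ready") returns "ready".
```

The values "running" and "error" exist in both models with the same meaning, so they come out unchanged whichever branch handles them.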

Comparison with Crescendo

This state machine was ported from the Crescendo desktop client’s connection management layer. The key differences in the Diminuendo server-side implementation:

Server-Side Enforcement

In Crescendo, the state machine runs client-side in a Tauri process. Invalid transitions are visible only in local logs. In Diminuendo, the state machine is enforced server-side — all clients see the same authoritative state, and invalid transitions are rejected before they propagate.

Persistent State

Crescendo holds state in memory only. Diminuendo persists the current session state to SQLite on every transition, enabling stale session recovery after restarts and consistent state across reconnections.

Multi-Client Broadcast

Crescendo manages a single user’s view. Diminuendo broadcasts state transitions to all subscribers of a session (via Bun pub/sub), ensuring that dashboards, CLIs, and web clients all see transitions in real time.

Billing Integration

Diminuendo’s state transitions are tightly coupled with the billing system. A credit reservation is created when entering running and settled when transitioning to ready (on success) or error (on failure). Crescendo has no billing integration.