Websocket API
Fastpaca exposes a backend-only websocket for watching context updates in near real-time. Use it to trigger compaction workers, fan out updates to other services, or maintain UI state via your own gateway.
Endpoint:
ws://HOST/v1/contexts/:id/stream
Connection parameters
| Query parameter | Description |
|---|---|
| cursor (optional) | The last version you processed. Pass 0 to receive everything from the beginning. |
| include_messages (default true) | Set to false if you only care about context/compaction updates. |
Example:
ws://localhost:4000/v1/contexts/support-123/stream?cursor=120
If authentication is enabled, include the API key via the Authorization header (Bearer …).
Disable message notifications with ?include_messages=false if you only need compaction/context signals.
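As a minimal sketch of the parameters above, the following Python helper assembles the stream URL and the optional Authorization header. The names build_stream_url and auth_headers are hypothetical and not part of Fastpaca. The sketch assumes the ws:// scheme shown in the example and leaves the actual connection to whichever websocket client you use.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def build_stream_url(host: str, context_id: str,
                     cursor: Optional[int] = None,
                     include_messages: bool = True) -> str:
    """Build the stream URL for one context (hypothetical helper)."""
    params = {}
    if cursor is not None:
        params["cursor"] = str(cursor)
    if not include_messages:
        # The documented default is true, so only send the flag when disabling.
        params["include_messages"] = "false"
    url = f"ws://{host}/v1/contexts/{quote(context_id, safe='')}/stream"
    return f"{url}?{urlencode(params)}" if params else url


def auth_headers(api_key: Optional[str]) -> dict:
    """Return the Authorization header only when an API key is configured."""
    return {"Authorization": f"Bearer {api_key}"} if api_key else {}
```

Calling build_stream_url("localhost:4000", "support-123", cursor=120) reproduces the example URL shown above.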
Message format
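All event messages are JSON objects with a type and a version field. Version numbers are strictly increasing. The ping, pong, and gap frames shown later on this page carry a type but no version.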
Message notifications
{
  "type": "message",
  "version": 121,
  "seq": 121,
  "message": {
    "role": "user",
    "parts": [{ "type": "text", "text": "Any updates?" }],
    "inserted_at": "2025-01-24T12:00:00Z"
  }
}
Sent whenever a new message is appended to the context.
Context updates
{
  "type": "context",
  "version": 121,
  "needs_compaction": false,
  "used_tokens": 512340
}
Indicates that the cached LLM context should be refreshed via GET /v1/contexts/:id/context.
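One way to act on a context update is to mark a local cache as stale and refetch lazily. The sketch below is illustrative only: ContextCache and its methods are hypothetical names, and comparing the update's version against the last fetched version is an assumption made here, not documented server behavior.

```python
from dataclasses import dataclass


@dataclass
class ContextCache:
    """Tracks whether a locally cached LLM context needs a refetch."""
    version: int = -1
    stale: bool = True
    needs_compaction: bool = False

    def apply_update(self, msg: dict) -> bool:
        """Return True if this update means the cache must be refetched."""
        if msg.get("type") != "context" or msg["version"] <= self.version:
            return False  # not a context update, or nothing newer than the cache
        self.stale = True
        self.needs_compaction = bool(msg.get("needs_compaction", False))
        return True

    def store(self, version: int) -> None:
        """Record that the context was refetched at the given version."""
        self.version = version
        self.stale = False
```

When apply_update returns True, the caller would fetch GET /v1/contexts/:id/context and then call store with the version it received. A worker could also read needs_compaction to decide whether to schedule compaction.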
Compaction acknowledgements
{
  "type": "compaction",
  "version": 122,
  "range": { "from_seq": 1, "to_seq": 80 }
}
Emitted after a successful /compact call.
Tombstone notice
{ "type": "tombstoned", "version": 0 }
The context has been deleted. The server closes the connection after sending this message.
Snapshot reset
{ "type": "reset", "version": 200 }
The snapshot was rebuilt (e.g., after a manual repair). Clients should discard cached state and fetch a fresh LLM context.
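The message types above lend themselves to a small dispatcher keyed on the type field. The following Python sketch shows one possible shape. StreamDispatcher is a hypothetical helper, and silently ignoring unknown types is a design choice of this sketch rather than documented behavior.

```python
import json
from typing import Callable, Dict


class StreamDispatcher:
    """Routes decoded stream messages to per-type handlers."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[dict], None]] = {}
        self.closed = False

    def on(self, msg_type: str, handler: Callable[[dict], None]) -> None:
        """Register the handler for one message type."""
        self._handlers[msg_type] = handler

    def dispatch(self, raw: str) -> bool:
        """Decode one frame; return True if a registered handler ran."""
        msg = json.loads(raw)
        msg_type = msg.get("type")
        if msg_type == "tombstoned":
            # The server closes the connection after sending this message.
            self.closed = True
        handler = self._handlers.get(msg_type)
        if handler is None:
            return False  # unregistered types are ignored in this sketch
        handler(msg)
        return True
```

A consumer would register handlers for message, context, compaction, and reset, then stop reading once closed becomes true.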
Heartbeats & timeouts
- The server sends a {"type":"ping"} heartbeat every 30 seconds.
- Clients should respond with {"type":"pong"} to keep the connection alive.
- Idle connections without heartbeats for 90 seconds are closed.
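The heartbeat rules above can be handled with a small amount of client-side state. This is a sketch under assumptions: HeartbeatMonitor is a hypothetical name, and mirroring the 90-second limit on the client (to detect a dead connection early) is an addition of this example, since the text only states that the server closes idle connections.

```python
import json
import time
from typing import Optional

HEARTBEAT_TIMEOUT_S = 90.0  # idle limit stated above


class HeartbeatMonitor:
    """Answers pings and tracks how long ago the last one arrived."""

    def __init__(self, now: Optional[float] = None) -> None:
        self.last_ping = time.monotonic() if now is None else now

    def reply_for(self, raw: str, now: Optional[float] = None) -> Optional[str]:
        """Return the pong frame to send if raw is a ping, otherwise None."""
        if json.loads(raw).get("type") != "ping":
            return None
        self.last_ping = time.monotonic() if now is None else now
        return json.dumps({"type": "pong"}, separators=(",", ":"))

    def is_stale(self, now: Optional[float] = None) -> bool:
        """True when no ping has been seen within the timeout window."""
        current = time.monotonic() if now is None else now
        return current - self.last_ping > HEARTBEAT_TIMEOUT_S
```

The now argument exists only to make the class easy to test. In a real client you would omit it and send whatever reply_for returns back over the socket.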
Reconnect logic
- Keep track of the highest version you've processed.
- On reconnect, pass that value as cursor.
- If the server responds with {"type":"gap","expected":...,"actual":...}, immediately fetch the missing messages via GET /v1/contexts/:id/messages and resume with the returned version.
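The reconnect steps above amount to tracking one integer and reacting to gap frames. The sketch below illustrates that bookkeeping in Python. CursorTracker is a hypothetical name, the HTTP backfill itself is left to the caller, and the sketch does not interpret the expected and actual fields because their exact meaning is not specified here.

```python
import json
from typing import Optional


class CursorTracker:
    """Remembers the highest processed version for use as the cursor."""

    def __init__(self, cursor: int = 0) -> None:
        self.cursor = cursor

    def observe(self, raw: str) -> Optional[dict]:
        """Advance the cursor; return the gap message if a backfill is needed."""
        msg = json.loads(raw)
        if msg.get("type") == "gap":
            return msg  # caller should fetch GET /v1/contexts/:id/messages
        version = msg.get("version")
        if isinstance(version, int) and version > self.cursor:
            self.cursor = version
        return None

    def resume_from(self, returned_version: int) -> None:
        """Call after the backfill with the version that request returned."""
        self.cursor = returned_version
```

On reconnect, pass tracker.cursor as the cursor query parameter. Frames without a version, such as heartbeats, leave the cursor unchanged.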
Limits
- The websocket is intended for backend-to-backend use. Do not expose it directly to browsers.
- To mirror updates to clients, fan out through your own gateway (e.g., WebSocket, SSE, or Pub/Sub).
- The maximum number of concurrent connections per node is configurable and defaults to 512.