Send audio as LINEAR16 (16 kHz sampling rate) on a single mono channel. Chunk the audio into 100 ms packets: Suki supports only 100 ms chunks, which balance quality, latency, and efficiency.
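As a rough illustration of the chunk size this implies, the sketch below computes the byte length of a 100 ms LINEAR16 chunk and slices a PCM buffer accordingly. The names `CHUNK_BYTES` and `chunk_pcm` are hypothetical, and passing a shorter final chunk through unchanged is an assumption, since the text above does not say how a trailing partial chunk should be handled.

```python
SAMPLE_RATE_HZ = 16_000   # LINEAR16 at 16 kHz
BYTES_PER_SAMPLE = 2      # 16-bit signed samples
CHANNELS = 1              # mono
CHUNK_MS = 100            # chunk duration stated in the text

# 16,000 samples/s * 2 bytes * 1 channel * 0.1 s = 3,200 bytes per chunk
CHUNK_BYTES = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * CHUNK_MS // 1000


def chunk_pcm(pcm: bytes, chunk_bytes: int = CHUNK_BYTES):
    """Yield consecutive slices of raw PCM, each at most chunk_bytes long.

    Assumption: a trailing partial chunk is yielded as-is rather than padded.
    """
    for offset in range(0, len(pcm), chunk_bytes):
        yield pcm[offset:offset + chunk_bytes]
```

For example, 250 ms of audio (8,000 bytes) would yield two full chunks and one half-length chunk.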
Network connection speed and consistency are important for Suki to perform well. Suki requires:
  • Upload speed: 1 Mbps
  • Bitrate: 768 kbps
  • Ping time: 150 ms
  • Unloaded latency: <50 ms
  • Loaded latency: <150 ms
The client should open a WebSocket Secure (wss://) connection to the Suki endpoint. Refer to the Audio stream API for implementation details.
On the Ambient WebSocket endpoint /ws/stream, send LINEAR16, 16 kHz, mono audio in chunks of about 100 ms (see the Audio streaming reference and Ambient audio streaming guide).

How messages are sent

Each outbound message from the client must be:
  • One WebSocket text frame (UTF-8) with one JSON object inside
  • One logical send per frame (do not pack multiple JSON objects in one frame)
On /ws/stream, do not send PCM as binary WebSocket frames. Do not stream raw audio over HTTP with Content-Type: application/json.
Field names (proto-style JSON)
  • START_TIME and AUDIO: use type and data
  • data: standard Base64 (RFC 4648) of the raw bytes you mean to send (same idea as Go encoding/json for []byte). Do not use hex, URL-safe Base64, or raw binary inside the JSON string.
  • EVENT: use type: "EVENT" and the event field. Do not put the action name in data.
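The field rules above can be sketched as three small builders. This is a minimal illustration, not Suki's client code: the function names are hypothetical, and each function returns the JSON text that would be sent as a single WebSocket text frame.

```python
import base64
import json


def start_time_message(timestamp: str) -> str:
    """Build a START_TIME message; data is Base64 of the UTF-8 timestamp."""
    data = base64.b64encode(timestamp.encode("utf-8")).decode("ascii")
    return json.dumps({"type": "START_TIME", "data": data})


def audio_message(pcm_chunk: bytes) -> str:
    """Build an AUDIO message; data is standard (RFC 4648) Base64 of raw bytes."""
    data = base64.b64encode(pcm_chunk).decode("ascii")
    return json.dumps({"type": "AUDIO", "data": data})


def event_message(event: str) -> str:
    """Build an EVENT message; the action goes in `event`, never in `data`."""
    return json.dumps({"type": "EVENT", "event": event})
```

`base64.b64encode` uses the standard alphabet with padding, which matches the requirement to avoid URL-safe Base64 and hex.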

Message order (each stream segment)

  1. Send START_TIME once: data is Base64 of a UTF-8 RFC 3339 timestamp (for example 2026-04-25T12:34:56Z).
  2. Send one or more AUDIO messages: data is Base64 of each raw PCM chunk.
  3. Send a final AUDIO to end audio: data is RU9G (Base64 of ASCII EOF, bytes 0x45, 0x4F, 0x46). Do not use a separate end_of_stream type unless your integration team tells you otherwise.
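The three steps above can be sketched as a generator that yields the JSON text for each frame in order. The name `segment_messages` is hypothetical, and formatting the timestamp in UTC with a trailing `Z` is an assumption modeled on the example timestamp; the generator only builds messages and leaves sending to the WebSocket client of your choice.

```python
import base64
import json
from datetime import datetime, timezone
from typing import Iterable, Iterator


def _b64(raw: bytes) -> str:
    """Standard Base64 of raw bytes, returned as an ASCII string."""
    return base64.b64encode(raw).decode("ascii")


def segment_messages(chunks: Iterable[bytes], started_at: datetime) -> Iterator[str]:
    """Yield one JSON text per frame: START_TIME, AUDIO..., then the EOF AUDIO."""
    # Assumption: an RFC 3339 timestamp in UTC, e.g. 2026-04-25T12:34:56Z.
    # started_at should be timezone-aware; naive values are treated as local time.
    stamp = started_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    yield json.dumps({"type": "START_TIME", "data": _b64(stamp.encode("utf-8"))})

    for chunk in chunks:
        yield json.dumps({"type": "AUDIO", "data": _b64(chunk)})

    # End marker: an AUDIO message whose data is Base64 of the bytes b"EOF".
    yield json.dumps({"type": "AUDIO", "data": _b64(b"EOF")})
```

Each yielded string would then be sent as its own text frame, one JSON object per frame.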
EVENT messages can go anywhere in the stream when you need control (pause, resume, keep-alive, cancel, abort).

EVENT values you can send

Use {"type":"EVENT","event":"<VALUE>"} with one of:
| Value | What it does |
| --- | --- |
| PAUSE | Pause the stream |
| RESUME | Resume the stream |
| CANCEL | User cancels the stream |
| ABORT | Stream is aborted (interruption) |
| KEEP_ALIVE | Keep the connection alive during inactivity. While paused, send at least once every five seconds so the server does not close the connection. |
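One way to honor the five-second rule while paused is to track when the last KEEP_ALIVE was sent and send another slightly before the limit. The sketch below is illustrative only: `KeepAliveTimer` and `event_message` are hypothetical names, and the 4-second interval is an assumed safety margin, not a documented value.

```python
import json
import time

ALLOWED_EVENTS = {"PAUSE", "RESUME", "CANCEL", "ABORT", "KEEP_ALIVE"}
KEEP_ALIVE_INTERVAL_S = 4.0  # assumption: margin under the 5-second limit


def event_message(event: str) -> str:
    """Build an EVENT message, rejecting values outside the documented set."""
    if event not in ALLOWED_EVENTS:
        raise ValueError(f"unsupported event: {event!r}")
    return json.dumps({"type": "EVENT", "event": event})


class KeepAliveTimer:
    """Tracks whether a KEEP_ALIVE is due; the clock is injectable for tests."""

    def __init__(self, interval_s: float = KEEP_ALIVE_INTERVAL_S, clock=time.monotonic):
        self._interval_s = interval_s
        self._clock = clock
        self._last_sent = clock()

    def due(self) -> bool:
        """Return True once interval_s has elapsed since the last send."""
        return self._clock() - self._last_sent >= self._interval_s

    def mark_sent(self) -> None:
        """Record that a KEEP_ALIVE (or other message) was just sent."""
        self._last_sent = self._clock()
```

A paused client could poll `due()` in its send loop, send `event_message("KEEP_ALIVE")` when it returns true, and then call `mark_sent()`.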

Message examples

{
  "type": "START_TIME",
  "data": "<base64(UTF-8 RFC 3339 timestamp, e.g. 2026-04-25T12:34:56Z)>"
}
The stream end marker is an AUDIO message whose data is Base64 of the raw bytes EOF, not a bare JSON string "EOF" and not an EVENT named EOF, unless your stack documentation says otherwise.
Last modified on May 22, 2026