GET /ws/transcribe
Streams audio to the transcription service via WebSocket.
curl --request GET \
  --url https://sdp.suki.ai/ws/transcribe \
  --header 'sdp_suki_token: <sdp_suki_token>' \
  --header 'transcription_session_id: <transcription_session_id>'
"<string>"

Use this WebSocket endpoint to stream audio to an active session for real-time transcription. For the complete workflow, usage guidelines, wire format, message order, and error handling, refer to the Audio dictation and Dictation streaming guides.
  • For a single guide that compares /ws/stream and /ws/transcribe (handshake, proxy behavior, and message shapes), refer to the Audio streaming guide.
  • Stream audio in chunks for the best latency and throughput.
  • For partial and final inbound transcript frames, EOF, and session state rules, refer to Dictation streaming.
If the dictation session is RUNNING, COMPLETED, or in any other state that does not accept new speech sessions, the WebSocket handshake fails with FailedPrecondition (for example, "transcript session is not accepting new speech sessions").

Inbound transcript messages

The server sends JSON text frames that include transcript, is_final, and transcript_id. Use is_final to identify whether the transcript is a partial result or a final result. After the audio stream ends, the server sends { "transcript": { "transcript": "EOF" } } and then closes the WebSocket connection. Refer to Dictation streaming for frame examples, words and speaker IDs on finals, and client-side filtering rules.
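
The frame handling described above can be sketched as a small classifier. This is a minimal sketch, not the service's reference client: it assumes that transcript, is_final, and transcript_id are nested under a top-level "transcript" object, mirroring the documented EOF frame, and the function name classify_frame is hypothetical. Check the Dictation streaming guide for the exact frame layout before relying on it.

```python
import json
from typing import Optional, Tuple


def classify_frame(raw: str) -> Tuple[str, Optional[str], Optional[str]]:
    """Return (kind, text, transcript_id); kind is 'eof', 'final', or 'partial'."""
    payload = json.loads(raw)
    # Assumption: fields live under a top-level "transcript" object,
    # matching the documented EOF frame {"transcript": {"transcript": "EOF"}}.
    body = payload.get("transcript") or {}
    text = body.get("transcript")
    if text == "EOF":
        # End-of-stream marker; the server closes the connection after this.
        return ("eof", None, None)
    kind = "final" if body.get("is_final") else "partial"
    return (kind, text, body.get("transcript_id"))
```

In a receive loop, a client could pass each text frame to this function, update its display on "partial", commit the text on "final", and stop reading on "eof".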

Authentication

Authentication is applied during the WebSocket handshake. The method depends on your client type. Use the Sec-WebSocket-Protocol header for browser clients, and the sdp_suki_token and transcription_session_id headers for non-browser clients.

Browser clients

If you are connecting from a browser, you must use the Sec-WebSocket-Protocol header during the WebSocket handshake. The header must specify the SukiAmbientAuth protocol, followed by the token and the transcription session ID in the following format.
Sec-WebSocket-Protocol: SukiAmbientAuth,<sdp_suki_token>,<transcription_session_id>

Non-browser clients

If you are connecting from a non-browser client, such as a mobile or server-side application, you must provide the token and session ID as separate HTTP headers in the initial WebSocket upgrade request.
  • sdp_suki_token: Your Suki token.
  • transcription_session_id: The ID for the current session.
Important:
  • All messages must be sent as JSON text frames over the WebSocket connection.
  • Do not send raw binary data or use HTTP endpoints for streaming audio.
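
The two handshake styles above can be summarized in one helper. This is a sketch under stated assumptions: the function name handshake_headers and its browser_style flag are hypothetical, and only the header names and the SukiAmbientAuth format come from this page. The browser-style branch is mainly useful for checking the string a browser client would send, since real browsers set this header through the WebSocket constructor's protocols argument.

```python
from typing import List


def handshake_headers(
    sdp_suki_token: str,
    transcription_session_id: str,
    browser_style: bool = False,
) -> List[str]:
    """Build 'Name: value' header lines for the WebSocket upgrade request."""
    if browser_style:
        # Comma-separated: protocol name first, token second, session ID third.
        value = ",".join(["SukiAmbientAuth", sdp_suki_token, transcription_session_id])
        return [f"Sec-WebSocket-Protocol: {value}"]
    # Non-browser clients send the token and session ID as separate headers.
    return [
        f"sdp_suki_token: {sdp_suki_token}",
        f"transcription_session_id: {transcription_session_id}",
    ]
```

The non-browser result has the same shape as the header list passed to websocket.create_connection in the code example on this page.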

Code examples

# pip install websocket-client
import base64
import json
import websocket

# Replace with values from Create dictation session and your authentication flow.
transcription_session_id = "<transcription_session_id>"
sdp_suki_token = "<sdp_suki_token>"

# Staging WebSocket URL
ws_url = "wss://sdp.suki-stage.com/ws/transcribe"

ws = websocket.create_connection(
    ws_url,
    header=[
        f"sdp_suki_token: {sdp_suki_token}",
        f"transcription_session_id: {transcription_session_id}",
    ],
)

# If the file is WAV, skip the 44-byte header so payloads are PCM_S16LE only.
WAV_HEADER_BYTES = 44
CHUNK_BYTES = 3200  # Example chunk size; size your chunks to your capture pipeline.

try:
    with open("audio.wav", "rb") as audio_file:
        if WAV_HEADER_BYTES:
            audio_file.read(WAV_HEADER_BYTES)
        chunk = audio_file.read(CHUNK_BYTES)
        while chunk:
            msg = {
                "type": "AUDIO",
                "audioData": base64.b64encode(chunk).decode("ascii"),
            }
            ws.send(json.dumps(msg))
            chunk = audio_file.read(CHUNK_BYTES)
    ws.send(json.dumps({"type": "EVENT", "event": "AUDIO_END"}))
finally:
    ws.close()
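
The example above skips a fixed 44-byte header, which holds for the simplest WAV files, but some files carry extra chunks before the audio data. As an alternative sketch, the standard-library wave module can locate the audio frames for you. The helper name iter_pcm_chunks is hypothetical, and the 16-bit check reflects the PCM_S16LE requirement mentioned in the example's comment.

```python
import wave
from typing import BinaryIO, Iterator, Union


def iter_pcm_chunks(source: Union[str, BinaryIO], chunk_bytes: int = 3200) -> Iterator[bytes]:
    """Yield raw PCM payloads from a WAV file without assuming the header size."""
    with wave.open(source, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples (PCM_S16LE)")
        # One frame holds one sample per channel.
        frame_bytes = wav.getsampwidth() * wav.getnchannels()
        frames_per_chunk = max(1, chunk_bytes // frame_bytes)
        while True:
            chunk = wav.readframes(frames_per_chunk)
            if not chunk:
                break
            yield chunk
```

Each yielded chunk can be base64-encoded and sent as the audioData of an AUDIO message, exactly as in the loop above.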

Headers

Sec-WebSocket-Protocol
string

Required FOR BROWSER CLIENTS ONLY. Sent during the WebSocket handshake. Browsers must use the subprotocol that the grpc-wsproxy maps to Authorization: 'SukiAmbientAuth,<sdp_suki_token>,<transcription_session_id>' (comma-separated, with the token second and the transcription session ID third). Other names (for example, SukiTranscriptionAuth) are not mapped and typically yield 401.

sdp_suki_token
string
required

Required FOR NON-BROWSER CLIENTS ONLY: The Suki access token. Sent as a standard header with the initial upgrade request.

transcription_session_id
string
required

Required FOR NON-BROWSER CLIENTS ONLY: The transcription session ID. Sent as a standard header with the initial upgrade request.

Response

Switching Protocols - Indicates successful WebSocket handshake.

The response is of type string.

Last modified on May 22, 2026