Overview
The Suki Dictation SDK is a JavaScript and React library for use in the browser. Use the Dictation SDK to add speech-to-text capabilities to your web application.
If you use our Headed Web SDK v3.X.X+, you can use audio dictation capabilities out of the box. Refer to the Web SDK for audio dictation guide for more details.
In-field Mode
Use this mode for dictation over a target field. Commonly used for notes and forms.
Scratchpad Mode
Use this mode for a floating dictation panel, not tied to one input. Commonly used for dictation in a separate area of the page.
- Recording begins automatically
- Transcription appears inside the text field and the scratchpad UI
- When the user finishes dictation and confirms, the SDK calls the handler functions you registered and passes the final transcript into your app so you can update fields, save data, or run your own logic
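The confirm-then-callback behavior described above can be sketched as follows. This is a minimal, self-contained illustration and not the SDK's API: `DictationMode`, `FakeDictationSession`, `onSubmit`, `speak`, and `confirm` are hypothetical names, and the stub keeps everything in memory instead of using a microphone or an iframe. It assumes that in-field mode carries a target field, that scratchpad mode does not, and that a registered handler receives text only after the user confirms.

```typescript
// Hypothetical shapes for illustration only. These are NOT the real
// @suki-sdk/dictation types; check the SDK reference for the actual API.
type DictationMode =
  | { kind: "in-field"; fieldId: string } // overlay tied to one target field
  | { kind: "scratchpad" }; // floating panel, not tied to one input

type SubmitHandler = (finalTranscript: string) => void;

// In-memory stand-in for a dictation session (no audio, no iframe).
class FakeDictationSession {
  readonly mode: DictationMode;
  private interim = "";
  private handlers: SubmitHandler[] = [];

  constructor(mode: DictationMode) {
    this.mode = mode;
  }

  // Register a handler that runs only when the user confirms.
  onSubmit(handler: SubmitHandler): void {
    this.handlers.push(handler);
  }

  // Simulates live transcription. Handlers are not called here.
  speak(chunk: string): void {
    this.interim = (this.interim + " " + chunk).trim();
  }

  // Simulates the user confirming: handlers receive the final transcript.
  confirm(): void {
    const finalTranscript = this.interim;
    this.interim = "";
    for (const handler of this.handlers) {
      handler(finalTranscript);
    }
  }
}

// Usage: dictate into a hypothetical "hpi" note field.
const noteFields: Record<string, string> = { hpi: "" };
const session = new FakeDictationSession({ kind: "in-field", fieldId: "hpi" });
session.onSubmit((text) => {
  const mode = session.mode;
  if (mode.kind === "in-field") {
    noteFields[mode.fieldId] = text; // the app decides where the text goes
  }
});
session.speak("Patient reports");
session.speak("mild headache");
session.confirm();
```

Modeling the mode as a discriminated union means a target field can only be read after checking that the mode is in-field, which mirrors the distinction between the two modes.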
The Dictation SDK is not the same as calling the Dictation APIs directly (REST and WebSocket from your code). The SDK manages SDP authentication (SukiAuthManager), iframe lifecycle, and callbacks. For custom or server-side integrations without the iframe, use the APIs described in Audio dictation.
Supported packages
The Dictation SDK is available in the following packages:
React
Use @suki-sdk/dictation-react for React web applications. Install this package to add the Dictation SDK to a React project.
Vanilla JavaScript
Use @suki-sdk/dictation for vanilla JavaScript applications. Install this package to add the Dictation SDK to a JavaScript project.
Both packages require shared authentication through @suki-sdk/core. Refer to the Dictation SDK Quickstart for more information.
When to use Dictation SDK
Choose the Dictation SDK when:
- You want real-time speech-to-text in the browser with Suki’s hosted iframe experience
- You need in-field overlays on inputs or a scratchpad workflow for dictation mechanics
- You prefer callback-driven commit of the transcription result over owning REST and WebSocket dictation calls yourself
Recommended integration pattern
Follow the pattern below to integrate the Dictation SDK into your application:
How it works
[Diagram: Dictation SDK architecture and workflow]
Architecture workflow
Your web application
- You choose when dictation should open and where it should appear on the page (for example over one input, or as a separate floating area).
- Your code starts dictation and listens for results from the SDK. When the user finishes, you receive the final text.
- Your UI decides what to do with that text (fill a field, save a draft, discard, and so on).
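The last point, where your UI decides what to do with the final text, can be sketched as a small pure function. Everything here is app-side and hypothetical: `TranscriptAction`, `NoteState`, and `applyTranscript` are illustrative names, not part of the SDK. The sketch assumes the final transcript arrives as a plain string.

```typescript
// Hypothetical app-side types; none of these names come from the SDK.
type TranscriptAction = "fill-field" | "save-draft" | "discard";

interface NoteState {
  fieldValue: string;
  drafts: string[];
}

// Returns the next state for a given action. The input state is never mutated.
function applyTranscript(
  state: NoteState,
  action: TranscriptAction,
  transcript: string
): NoteState {
  const text = transcript.trim();
  if (text === "") {
    return state; // nothing usable was dictated
  }
  switch (action) {
    case "fill-field":
      return { ...state, fieldValue: text };
    case "save-draft":
      return { ...state, drafts: [...state.drafts, text] };
    case "discard":
      return state;
  }
}
```

Keeping this decision in one function outside the dictation callback makes it easy to test without a microphone or a running session.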
Suki services
- Sign-in: Suki checks your configuration, signs in with the partner account details you provide, keeps access fresh, and hands the hosted dictation experience what it needs to run securely.
- Hosted dictation: The microphone experience and transcription run in an embedded view that Suki hosts. That way the dictation screen looks and behaves the same in every app, and it does not inherit your site’s fonts, colors, or layout.
Typical session
- Set up sign-in once using the partner details Suki gives you.
- Add the dictation SDK to your app and connect it to that sign-in.
- When the user asks to dictate, open dictation in the place you picked (over a field or as a scratchpad-style area).
- The user speaks and confirms. Your app receives the text so you can put it where it belongs in your workflow.
- When dictation should stop, close it from your app so the session ends.
Refer to the Architecture guide for more details.
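The ordering of the typical session above can be sketched as a small state model. This is an assumption-laden illustration, not the SDK's implementation: `SessionLifecycle`, its state names, and its methods are hypothetical, and the `partnerId` string stands in for whatever partner details Suki provides. It shows that sign-in happens once, that dictation cannot open before sign-in, and that closing ends the session.

```typescript
// Hypothetical lifecycle model; names and states are illustrative only
// and do not reflect the real SDK API.
type SessionState = "signed-out" | "ready" | "open" | "closed";

class SessionLifecycle {
  private state: SessionState = "signed-out";
  private onText: ((text: string) => void) | null = null;

  get current(): SessionState {
    return this.state;
  }

  // Set up sign-in once with the partner details.
  signIn(partnerId: string): void {
    if (partnerId === "") {
      throw new Error("partner details are required");
    }
    if (this.state === "signed-out") {
      this.state = "ready";
    }
  }

  // Open dictation when the user asks to dictate.
  open(onText: (text: string) => void): void {
    if (this.state === "signed-out") {
      throw new Error("sign in before opening dictation");
    }
    this.onText = onText;
    this.state = "open";
  }

  // The user speaks and confirms; the app receives the text.
  confirm(text: string): void {
    if (this.state !== "open" || this.onText === null) {
      throw new Error("dictation is not open");
    }
    this.onText(text);
  }

  // Close dictation from the app so the session ends.
  close(): void {
    if (this.state === "open") {
      this.state = "closed";
      this.onText = null;
    }
  }
}
```

In this model a closed session can be opened again without signing in a second time, which is one reading of "set up sign-in once"; the real SDK may behave differently.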
Next steps
Get started by following these steps:
Install the Dictation SDK
Follow the Installation guide to add the packages to your project
Review prerequisites
Confirm browser, CSP, and layout requirements in the Prerequisites guide
Run the quickstart
Complete the Quickstart for JavaScript and React examples through your first show() call