Cartesia (Beta)
Cartesia provides ultra-fast, low-latency text-to-speech with real-time streaming capabilities, optimized for interactive conversational AI applications.
Sample configuration
The following example shows a starting tts parameter configuration you can use when you start a conversational AI agent.
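A minimal sketch of such a configuration, built as a Python dictionary and serialized to JSON. All values are placeholders. The container name "raw" and the language code "en" are assumptions for illustration, and only the params object described on this page is shown; any other fields the tts object may need are omitted.

```python
import json

# Placeholder values: replace with your own key, model, and voice identifiers.
# "raw" and "en" are illustrative assumptions, not values taken from this page.
tts_params = {
    "api_key": "<your_cartesia_api_key>",   # required
    "model_id": "<model_id>",               # required
    "voice": {                              # required
        "mode": "id",                       # select the voice by identifier
        "id": "<voice_id>",
    },
    "output_format": {                      # optional
        "container": "raw",                 # assumed container name
        "sample_rate": 16000,               # documented default, in Hz
    },
    "language": "en",                       # optional, assumed language code
}

# Only the params object is described on this page; other tts fields are omitted.
tts = {"params": tts_params}

print(json.dumps(tts, indent=2))
```

The printed JSON can be pasted into the tts section of the request body and adjusted from there.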
Key parameters
params (required)
- api_key (string, required)
  The API key used for authentication. Get your API key from the Cartesia Console.
- model_id (string, required)
  Identifier of the model to be used.
- voice (object, required)
  Voice configuration object. Properties:
  - mode (string, required)
    Voice selection mode. Use id to select by voice identifier.
  - id (string, required)
    The identifier of the selected voice for speech synthesis.
- output_format (object, nullable)
  Audio output format configuration. Properties:
  - container (string, nullable)
    Audio container format for the output stream.
  - sample_rate (number, nullable)
    Audio sampling rate in Hz. Default: 16000. Possible values: 8000, 16000, 22050, 24000, 44100, 48000.
- language (string, nullable)
  Target language for speech synthesis.
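The constraints listed above can be checked on the client before a request is sent. The sketch below is a hypothetical helper, not part of any SDK: validate_tts_params is a name introduced here for illustration, and it checks only the required fields and the documented sample rates.

```python
# Sample rates listed as possible values for output_format.sample_rate.
ALLOWED_SAMPLE_RATES = {8000, 16000, 22050, 24000, 44100, 48000}


def validate_tts_params(params):
    """Return a list of problems found in a tts params dict (hypothetical helper)."""
    problems = []

    # api_key and model_id are required strings.
    for key in ("api_key", "model_id"):
        if not isinstance(params.get(key), str) or not params[key]:
            problems.append(f"{key} must be a non-empty string")

    # voice is a required object with required mode and id strings.
    voice = params.get("voice")
    if not isinstance(voice, dict):
        problems.append("voice must be an object")
    else:
        for key in ("mode", "id"):
            if not isinstance(voice.get(key), str) or not voice[key]:
                problems.append(f"voice.{key} must be a non-empty string")

    # output_format is optional; when sample_rate is set it must be a listed value.
    output_format = params.get("output_format")
    if isinstance(output_format, dict):
        rate = output_format.get("sample_rate")
        if rate is not None and rate not in ALLOWED_SAMPLE_RATES:
            problems.append(f"output_format.sample_rate {rate} is not a supported value")

    return problems


example = {
    "api_key": "key",
    "model_id": "model",
    "voice": {"mode": "id", "id": "voice"},
    "output_format": {"sample_rate": 32000},  # not in the documented list
}
print(validate_tts_params(example))
```

An empty list means the dictionary passed these basic checks. The service may still reject it for reasons not covered here, such as an unknown model or voice identifier.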
For advanced configuration options, voice customization, and detailed parameter descriptions, see the Cartesia TTS documentation.