Cartesia (Beta)
Cartesia provides ultra-fast, low-latency text-to-speech with real-time streaming capabilities, optimized for interactive conversational AI applications.
Sample configuration
The following example shows a starting tts parameter configuration you can use when you Start a conversational AI agent.
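As a minimal sketch, the tts block can be assembled in Python and serialized to JSON. Only the fields under params come from the Key parameters list on this page. The helper name build_cartesia_tts, the "vendor" key, the "en" language code, and the angle-bracket placeholder values are assumptions for illustration, so check them against the Start a conversational AI agent reference before use.

```python
import json


def build_cartesia_tts(api_key, model_id, voice_id, sample_rate=16000, language=None):
    """Assemble a tts block from the parameters listed under Key parameters.

    Hypothetical helper for illustration; not part of any SDK.
    """
    params = {
        "api_key": api_key,
        "model_id": model_id,
        # mode "id" selects the voice by its identifier.
        "voice": {"mode": "id", "id": voice_id},
        "output_format": {"sample_rate": sample_rate},
    }
    if language is not None:
        params["language"] = language
    # "vendor" is an assumed key name, not confirmed by this page.
    return {"vendor": "cartesia", "params": params}


tts = build_cartesia_tts(
    "<your_cartesia_api_key>",  # placeholder, replace with a real key
    "<model_id>",               # placeholder, see the Cartesia TTS documentation
    "<voice_id>",               # placeholder
    language="en",              # illustrative value
)
print(json.dumps({"tts": tts}, indent=2))
```

The printed JSON object can then be placed in the request body used to start the agent.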
The parameters listed on this page are validated for use with Conversational AI Engine. Required parameters must be provided as documented. Any additional parameters are passed through directly to the underlying vendor without validation. For a full list of supported options, refer to the Cartesia TTS documentation.
Key parameters
params (required)
- api_key (string, required)
The API key used for authentication. Get your API key from the Cartesia Console.
- model_id (string, required)
Identifier of the model to be used.
- voice (object, required)
Voice configuration object.
  - mode (string, required)
  Voice selection mode. Use id to select by voice identifier.
  - id (string, required)
  The identifier of the selected voice for speech synthesis.
- output_format (object, nullable)
Audio output format configuration.
  - container (string, nullable)
  Audio container format for the output stream.
  - sample_rate (number, nullable)
  Audio sampling rate in Hz. Default: 16000. Possible values: 8000, 16000, 22050, 24000, 44100, 48000.
- language (string, nullable)
Target language for speech synthesis.
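Because required parameters must be supplied as documented, it can help to check a params object locally before sending it. The following is a sketch only: the function name validate_cartesia_params and the error wording are hypothetical, and the checks cover just the rules stated on this page (required fields and the listed sample rates), not the vendor's full validation.

```python
# Sample rates listed on this page for output_format.sample_rate.
ALLOWED_SAMPLE_RATES = {8000, 16000, 22050, 24000, 44100, 48000}


def validate_cartesia_params(params):
    """Return a list of problems found in a Cartesia params dict (empty if none)."""
    problems = []

    # Top-level required parameters.
    for key in ("api_key", "model_id", "voice"):
        if not params.get(key):
            problems.append(f"missing required parameter: {key}")

    # Required properties of the voice object.
    voice = params.get("voice")
    if isinstance(voice, dict):
        for key in ("mode", "id"):
            if not voice.get(key):
                problems.append(f"missing required parameter: voice.{key}")

    # output_format is nullable; only check sample_rate when it is set.
    output_format = params.get("output_format") or {}
    rate = output_format.get("sample_rate")
    if rate is not None and rate not in ALLOWED_SAMPLE_RATES:
        problems.append(f"unsupported sample_rate: {rate}")

    return problems


example = {
    "api_key": "<your_cartesia_api_key>",
    "voice": {"mode": "id"},
    "output_format": {"sample_rate": 12345},
}
for problem in validate_cartesia_params(example):
    print(problem)
```

Any additional parameters are left untouched by this check, since they are passed through to the vendor without validation.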