Cartesia (Beta)
Cartesia provides low-latency text-to-speech with real-time streaming, optimized for interactive conversational AI applications.
Sample configuration
The following example shows a starting tts parameter configuration you can use when you Start a conversational AI agent.
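As a minimal sketch of such a configuration, the Python snippet below builds the tts block as a dictionary and prints it as JSON. The field names under params come from the parameter list on this page. The surrounding vendor key, the placeholder API key, the raw container value, and the en language code are assumptions for illustration, so check them against the Conversational AI Engine and Cartesia documentation before use.

```python
import json

# Hypothetical placeholder; replace with a key from the Cartesia Console.
CARTESIA_API_KEY = "<your_cartesia_api_key>"

tts_config = {
    "vendor": "cartesia",  # assumed vendor identifier, not stated on this page
    "params": {
        "api_key": CARTESIA_API_KEY,           # required
        "model_id": "sonic-2",                 # required; example value from this page
        "base_url": "wss://api.cartesia.ai",   # optional; example value from this page
        "output_format": {
            "container": "raw",                # assumed container value
            "sample_rate": 16000,              # documented default
        },
        "language": "en",                      # assumed language code
    },
}

# Print the block as it might appear under a "tts" key in the request body.
print(json.dumps({"tts": tts_config}, indent=2))
```

Only api_key and model_id are marked as required. The remaining fields are nullable and can be omitted.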
The parameters listed on this page are validated for use with Conversational AI Engine. Required parameters must be provided as documented. Any additional parameters are passed through directly to the underlying vendor without validation. For a full list of supported options, refer to the Cartesia TTS documentation.
Key parameters
params (required)
- api_key (string, required)
  The Cartesia API key used to authenticate requests. Get your API key from the Cartesia Console.
- model_id (string, required)
  The identifier of the TTS model to use. For example, sonic-2.
- base_url (string, nullable)
  The WebSocket URL for the Cartesia streaming API. For example, wss://api.cartesia.ai.
- output_format (object, nullable)
  Audio output format configuration.
  - container (string, nullable)
    Audio container format for the output stream.
  - sample_rate (number, nullable)
    Default: 16000
    Possible values: 8000, 16000, 22050, 24000, 44100, 48000
    Audio sampling rate in Hz.
- language (string, nullable)
  Target language for speech synthesis.
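
The constraints listed above can be checked before a request is sent. The following is a minimal sketch, not part of any SDK: validate_cartesia_params is a hypothetical helper that checks the two required fields and the documented sample_rate values. It deliberately ignores all other keys, since undocumented parameters are passed through to the vendor without validation.

```python
from typing import Any

REQUIRED_KEYS = ("api_key", "model_id")
ALLOWED_SAMPLE_RATES = {8000, 16000, 22050, 24000, 44100, 48000}
DEFAULT_SAMPLE_RATE = 16000


def validate_cartesia_params(params: dict[str, Any]) -> list[str]:
    """Return a list of problems found in a Cartesia params dict (hypothetical helper)."""
    problems = []

    # Both required fields must be present as non-empty strings.
    for key in REQUIRED_KEYS:
        value = params.get(key)
        if not isinstance(value, str) or not value:
            problems.append(f"{key} is required and must be a non-empty string")

    # output_format and sample_rate are nullable; fall back to the documented default.
    output_format = params.get("output_format") or {}
    sample_rate = output_format.get("sample_rate")
    if sample_rate is None:
        sample_rate = DEFAULT_SAMPLE_RATE
    if sample_rate not in ALLOWED_SAMPLE_RATES:
        problems.append(
            f"sample_rate {sample_rate} is not one of {sorted(ALLOWED_SAMPLE_RATES)}"
        )
    return problems


# A valid minimal configuration, then one with a missing key and a bad sample rate.
print(validate_cartesia_params({"api_key": "<key>", "model_id": "sonic-2"}))
print(validate_cartesia_params({"model_id": "sonic-2", "output_format": {"sample_rate": 32000}}))
```

Returning a list of problems instead of raising on the first one lets a caller report every issue in a single pass.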