ElevenLabs

ElevenLabs provides highly realistic AI voices with advanced prosody and natural speech patterns, delivering lifelike audio synthesis with emotional nuance and conversational flow.

Paid plan required

You need a paid ElevenLabs plan for reliable TTS integration.

ElevenLabs may restrict or disable free-tier accounts due to abuse-detection mechanisms, even if free credits are available. To avoid missing audio responses during testing and production, ensure you use a paid plan.

Sample configuration

The following example shows a starting tts parameter configuration you can use when you Start a conversational AI agent.


    "tts": {
      "vendor": "elevenlabs",
      "params": {
        "base_url": "wss://api.elevenlabs.io/v1",
        "key": "<your_elevenlabs_key>",
        "model_id": "eleven_flash_v2_5",
        "voice_id": "pNInz6obpgDQGcFmaJgB",
        "sample_rate": 24000
      }
    }
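As a rough illustration, the sample block above could be assembled in code rather than hand-written, so the API key never has to be pasted into a file. This is a minimal sketch, not part of the official integration: the helper name `build_tts_config` and the environment variable `ELEVENLABS_API_KEY` are assumptions made for this example, and the default values simply mirror the sample configuration.

```python
import json
import os


def build_tts_config(api_key: str,
                     voice_id: str = "pNInz6obpgDQGcFmaJgB",
                     model_id: str = "eleven_flash_v2_5",
                     sample_rate: int = 24000) -> dict:
    """Return a dict mirroring the sample tts block (hypothetical helper)."""
    return {
        "vendor": "elevenlabs",
        "params": {
            "base_url": "wss://api.elevenlabs.io/v1",
            "key": api_key,
            "model_id": model_id,
            "voice_id": voice_id,
            "sample_rate": sample_rate,
        },
    }


if __name__ == "__main__":
    # ELEVENLABS_API_KEY is an assumed variable name; the placeholder is
    # used only so the script still prints something when it is unset.
    key = os.environ.get("ELEVENLABS_API_KEY", "<your_elevenlabs_key>")
    print(json.dumps({"tts": build_tts_config(key)}, indent=2))
```

The printed JSON has the same shape as the sample above and can be merged into the request body you send when starting an agent.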

Key parameters

params required
  • base_url string required

    The endpoint URL for the ElevenLabs TTS service. See Data residency.

  • key string required

    The API key used for authentication. Get your API key from the ElevenLabs Console.

  • model_id string required

    Identifier of the model to be used. Popular options include eleven_flash_v2_5 for speed or eleven_multilingual_v2 for quality.

  • voice_id string required

    The identifier for the selected voice for speech synthesis. Browse available voices in the Voice Library.

  • sample_rate number nullable

    Default: 24000

    Audio sampling rate in Hz. Common values: 16000, 22050, 24000, 44100.

  • speed number nullable

    Default: 1.0

    Adjusts the speed of the generated speech. Range: 0.7 to 1.2, inclusive.

  • stability number nullable

    Controls voice stability. Higher values (0.8-1.0) produce more consistent speech; lower values (0.0-0.5) add more variation.

  • similarity_boost number nullable

    Enhances similarity to the original voice. Range: 0.0-1.0. Higher values stick closer to the training voice.

  • style number nullable

    Controls speaking style and expressiveness. Higher values increase emotional range and variation.

  • use_speaker_boost boolean nullable

    Improves voice quality and similarity when enabled. Recommended for most use cases.
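The constraints listed above can be checked before a request is sent. The following is a minimal sketch under stated assumptions: `validate_tts_params` is a hypothetical helper, not part of any SDK, and it enforces only what this page states explicitly (the four required string fields, the 0.7 to 1.2 range for speed, and the 0.0 to 1.0 range for similarity_boost). Parameters whose limits are not spelled out here, such as style, are left unchecked.

```python
from typing import List

# Fields marked "required" in the parameter list above.
REQUIRED_KEYS = ("base_url", "key", "model_id", "voice_id")

# Inclusive numeric ranges stated explicitly on this page.
DOCUMENTED_RANGES = {
    "speed": (0.7, 1.2),
    "similarity_boost": (0.0, 1.0),
}


def validate_tts_params(params: dict) -> List[str]:
    """Return a list of problems found; an empty list means no issues."""
    errors = []
    for name in REQUIRED_KEYS:
        value = params.get(name)
        if not isinstance(value, str) or not value:
            errors.append(f"{name} is required and must be a non-empty string")
    for name, (low, high) in DOCUMENTED_RANGES.items():
        value = params.get(name)
        if value is None:
            continue  # nullable parameters may be omitted or set to null
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or not (low <= value <= high):
            errors.append(f"{name} must be a number between {low} and {high}")
    return errors


if __name__ == "__main__":
    candidate = {
        "base_url": "wss://api.elevenlabs.io/v1",
        "key": "<your_elevenlabs_key>",
        "model_id": "eleven_flash_v2_5",
        "voice_id": "pNInz6obpgDQGcFmaJgB",
        "speed": 1.5,  # deliberately out of range
    }
    for problem in validate_tts_params(candidate):
        print(problem)
```

Returning a list of messages rather than raising on the first failure lets a caller report every problem in one pass.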

For advanced configuration options, voice cloning, and detailed parameter descriptions, see the ElevenLabs TTS documentation.