Overview

Text-to-Speech (TTS) engines convert your AI agent's text responses into natural-sounding speech. They enable real-time voice interactions by synthesizing audio from the language model's output. Agora supports multiple TTS providers, allowing you to choose the best voice quality and characteristics for your specific requirements.

Integration steps

To integrate the TTS provider of your choice, follow these steps:

  1. Choose your TTS provider from the Supported TTS providers table
  2. Obtain an API key from the provider's console
  3. Copy the sample configuration for your chosen provider
  4. Replace the API key placeholder with your actual API key
  5. Select an appropriate voice and configure audio parameters
  6. Specify the configuration in the request body under properties > tts when starting a conversational AI agent
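The steps above can be sketched roughly as follows. This is a minimal illustration, not the official sample configuration: only the properties > tts nesting comes from this page. The field names `vendor`, `params`, `key`, and `voice_name`, the helper functions, and the example voice are assumptions made for illustration, so copy the exact fields from the sample configuration of your chosen provider.

```python
import json


def build_tts_config(vendor: str, api_key: str, voice: str) -> dict:
    """Build a TTS configuration block.

    The field names used here ("vendor", "params", "key", "voice_name")
    are assumptions for illustration; check your provider's sample
    configuration for the real ones.
    """
    # Step 4: refuse to continue if the placeholder was never replaced.
    if not api_key or api_key.startswith("<"):
        raise ValueError("replace the API key placeholder with your actual API key")
    return {
        "vendor": vendor,
        "params": {
            "key": api_key,
            "voice_name": voice,  # Step 5: the selected voice
        },
    }


def build_request_body(tts_config: dict) -> dict:
    """Step 6: place the TTS configuration under properties > tts."""
    return {"properties": {"tts": tts_config}}


# Hypothetical values; the key and voice are placeholders for illustration.
body = build_request_body(
    build_tts_config("microsoft", "my-secret-key", "en-US-JennyNeural")
)
print(json.dumps(body, indent=2))
```

A real request body for starting an agent would carry other properties alongside tts; they are omitted here because this section does not describe them.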

Supported TTS providers

Conversational AI Engine currently supports the following TTS providers:

Provider      Provider's documentation
Microsoft     Microsoft Azure TTS
ElevenLabs    ElevenLabs TTS
Cartesia      Cartesia TTS
OpenAI        OpenAI TTS