Microsoft Azure

Microsoft Azure offers neural voices in multiple languages with options for different speaking styles and emotions, providing enterprise-grade text-to-speech capabilities with high-quality audio output.

Sample configuration

The following example shows a starting configuration for the tts parameter that you can use when you start a conversational AI agent.

"tts": {
  "vendor": "microsoft",
  "params": {
    "key": "<your_microsoft_key>",
    "region": "eastus",
    "voice_name": "en-US-AndrewMultilingualNeural",
    "speed": 1.0,
    "volume": 70,
    "sample_rate": 24000
  }
}
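
As a minimal sketch, the same fragment can be assembled in Python so that the key is read from the environment instead of being hard-coded. The function name build_tts_config and the environment variable AZURE_SPEECH_KEY are hypothetical names chosen for illustration. How the fragment is embedded in the request that starts the agent is not shown here.

```python
import json
import os


def build_tts_config(key, region="eastus",
                     voice_name="en-US-AndrewMultilingualNeural"):
    """Return the tts fragment from the sample above as a Python dict.

    build_tts_config is a hypothetical helper, not part of any SDK.
    """
    return {
        "vendor": "microsoft",
        "params": {
            "key": key,
            "region": region,
            "voice_name": voice_name,
            "speed": 1.0,
            "volume": 70,
            "sample_rate": 24000,
        },
    }


if __name__ == "__main__":
    # AZURE_SPEECH_KEY is an assumed variable name, not one defined by the service.
    key = os.environ.get("AZURE_SPEECH_KEY", "<your_microsoft_key>")
    print(json.dumps({"tts": build_tts_config(key)}, indent=2))
```

Keeping the key out of source files is a design choice of this sketch, not a requirement stated in this page.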

Key parameters

params required

key string required
The API key used for authentication. Get your API key from the Azure Portal.
region string required
The Azure region where the speech service is hosted (for example, eastus or westus2).
voice_name string required
The identifier for the selected voice for speech synthesis. See available voices for options.
speed number nullable
Default: 1.0
Speaking rate, expressed as a multiple of the original audio speed. Valid values range from 0.5 to 2.0.
volume number nullable
Default: 100
Audio volume as a number between 0.0 and 100.0, where 0.0 is quietest and 100.0 is loudest.
sample_rate integer nullable
Default: 24000
Audio sampling rate in Hz. Common values: 16000, 24000, 48000.
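
The constraints listed above can be checked before a request is sent. The following is a sketch only: validate_tts_params is a hypothetical helper, and it assumes that a missing or null optional value means the default applies. It does not restrict sample_rate to the common values, because the list above is not stated to be exhaustive.

```python
def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_tts_params(params):
    """Return a list of problems found in a tts "params" dict (empty if none).

    Hypothetical helper based only on the ranges documented above.
    """
    problems = []

    # key, region, and voice_name are marked required.
    for name in ("key", "region", "voice_name"):
        value = params.get(name)
        if not isinstance(value, str) or not value:
            problems.append(f"{name} is required and must be a non-empty string")

    # speed, volume, and sample_rate are nullable; None is treated as "use the default".
    speed = params.get("speed")
    if speed is not None and not (_is_number(speed) and 0.5 <= speed <= 2.0):
        problems.append("speed must be a number between 0.5 and 2.0")

    volume = params.get("volume")
    if volume is not None and not (_is_number(volume) and 0.0 <= volume <= 100.0):
        problems.append("volume must be a number between 0.0 and 100.0")

    sample_rate = params.get("sample_rate")
    if sample_rate is not None:
        is_int = isinstance(sample_rate, int) and not isinstance(sample_rate, bool)
        if not is_int or sample_rate <= 0:
            problems.append("sample_rate must be a positive integer")

    return problems
```

For example, calling validate_tts_params with the params object from the sample configuration returns an empty list, while a speed of 3.0 produces a single message about the speed range.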

For advanced configuration options, voice galleries, and detailed parameter descriptions, see the Microsoft Azure TTS documentation.
