Microsoft Azure

Microsoft Azure offers neural voices in multiple languages with options for different speaking styles and emotions, providing enterprise-grade text-to-speech capabilities with high-quality audio output.

Sample configuration

The following example shows a starting tts parameter configuration you can use when you Start a conversational AI agent.


"tts": {
  "vendor": "microsoft",
  "params": {
    "key": "<your_microsoft_key>",
    "region": "eastus",
    "voice_name": "en-US-AndrewMultilingualNeural",
    "speed": 1.0,
    "volume": 70,
    "sample_rate": 24000
  }
}
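
As a minimal sketch, the same object can be assembled in code and serialized to JSON before it is placed in the request that starts the agent. The function name `build_tts_config` and the environment variable `AZURE_SPEECH_KEY` are hypothetical names chosen for illustration; only the field names and values come from the sample above.

```python
import json
import os


def build_tts_config(key: str,
                     region: str = "eastus",
                     voice_name: str = "en-US-AndrewMultilingualNeural") -> dict:
    """Return a dict mirroring the sample tts object (hypothetical helper)."""
    return {
        "vendor": "microsoft",
        "params": {
            "key": key,
            "region": region,
            "voice_name": voice_name,
            "speed": 1.0,
            "volume": 70,
            "sample_rate": 24000,
        },
    }


if __name__ == "__main__":
    # AZURE_SPEECH_KEY is an assumed variable name, not one defined by this page.
    key = os.environ.get("AZURE_SPEECH_KEY", "<your_microsoft_key>")
    print(json.dumps({"tts": build_tts_config(key)}, indent=2))
```

Reading the key from the environment keeps it out of source control; any other secret store would work equally well.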

Key parameters

params (required)
  • key (string, required)

    The API key used for authentication. Get your API key from the Azure Portal.

  • region (string, required)

    The Azure region where the speech service is hosted (for example, eastus or westus2).

  • voice_name (string, required)

    The identifier of the voice to use for speech synthesis. See available voices for options.

  • speed (number, nullable)

    Default: 1.0

    Speaking rate, expressed as a multiplier of the normal speed. Accepts values from 0.5 to 2.0.

  • volume (number, nullable)

    Default: 100

    Audio volume as a number between 0.0 and 100.0, where 0.0 is quietest and 100.0 is loudest.

  • sample_rate (integer, nullable)

    Default: 24000

    Audio sampling rate in Hz. Common values: 16000, 24000, 48000.
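
The constraints listed above can be checked on the client before a request is sent. The following is a rough sketch only: `validate_tts_params` is a hypothetical helper, and it assumes that a null or omitted optional value falls back to the documented default. It does not reflect how the service itself validates input.

```python
from numbers import Real


def _is_number(value) -> bool:
    # bool is a subclass of int, so exclude it explicitly.
    return isinstance(value, Real) and not isinstance(value, bool)


def validate_tts_params(params: dict) -> list[str]:
    """Return a list of problems; an empty list means no issues were found."""
    errors = []

    # Required string fields.
    for name in ("key", "region", "voice_name"):
        value = params.get(name)
        if not isinstance(value, str) or not value:
            errors.append(f"{name}: required non-empty string")

    # Optional numeric fields with the documented ranges.
    for name, low, high in (("speed", 0.5, 2.0), ("volume", 0.0, 100.0)):
        value = params.get(name)
        if value is None:
            continue  # assumed to fall back to the documented default
        if not _is_number(value) or not low <= value <= high:
            errors.append(f"{name}: expected a number from {low} to {high}")

    # Optional sampling rate in Hz.
    sample_rate = params.get("sample_rate")
    if sample_rate is not None:
        is_int = isinstance(sample_rate, int) and not isinstance(sample_rate, bool)
        if not is_int or sample_rate <= 0:
            errors.append("sample_rate: expected a positive integer (Hz)")

    return errors
```

Returning every problem at once, rather than raising on the first, makes it easier to surface all configuration mistakes in a single pass.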

For advanced configuration options, voice galleries, and detailed parameter descriptions, see the Microsoft Azure TTS documentation.