Amazon Transcribe (Beta)

Amazon Transcribe provides automatic speech recognition (ASR) with high accuracy and support for multiple languages, designed for real-time conversational AI applications.

Sample configuration

The following example shows a starting asr configuration that you can use when you start a conversational AI agent.


"asr": {
  "vendor": "amazon",
  "params": {
    "region": "<AWS_ASR_REGION>",
    "access_key_id": "<AWS_ASR_ACCESS_KEY_ID>",
    "secret_access_key": "<AWS_ASR_SECRET_ACCESS_KEY>",
    "language_code": "en-US",
    "media_sample_rate_hz": 16000,
    "media_encoding": "pcm"
  }
}
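
As a minimal sketch of how the sample configuration might be assembled in practice, the following Python snippet builds the asr block from environment variables so that AWS credentials stay out of source code. The function name build_asr_config and the environment variable names are assumptions chosen for illustration; they are not part of the Agora or AWS APIs.

```python
import os


def build_asr_config(env=os.environ):
    """Assemble the asr block from a mapping of environment variables.

    The variable names below are hypothetical; use whatever names your
    deployment defines. A missing required variable raises KeyError.
    """
    return {
        "vendor": "amazon",
        "params": {
            "region": env["AWS_ASR_REGION"],
            "access_key_id": env["AWS_ASR_ACCESS_KEY_ID"],
            "secret_access_key": env["AWS_ASR_SECRET_ACCESS_KEY"],
            # Optional override; falls back to the value used in the sample.
            "language_code": env.get("AWS_ASR_LANGUAGE_CODE", "en-US"),
            "media_sample_rate_hz": 16000,
            "media_encoding": "pcm",
        },
    }
```

Calling build_asr_config() with no arguments reads from the process environment. Passing a plain dictionary instead makes the function easy to exercise in tests without touching real credentials.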

Caution

The parameters listed on this page are validated for use with Conversational AI Engine. To avoid unpredictable behavior, Agora strongly recommends using only the supported parameters. For a complete reference, consult the official Amazon Transcribe documentation.

Key parameters

params (required)
  • region (string, required)

    The AWS region where the Transcribe service is hosted, for example, us-east-1, us-west-2, or eu-west-1. See AWS regions for available regions.

  • access_key_id (string, required)

    The AWS access key ID used for authentication. Get your access key from the AWS IAM Console.

  • secret_access_key (string, required)

    The AWS secret access key used for authentication. Get your secret key from the AWS IAM Console.

  • language_code (string, required)

    The language code for speech recognition, for example, en-US, es-US, or fr-FR. See supported languages for available language codes.

  • media_sample_rate_hz (integer, nullable)

    The sample rate in Hertz for the audio input, for example, 16000 or 8000.

  • media_encoding (string, nullable)

    The encoding format of the audio input, for example, pcm, opus, or flac.
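
The parameter list above can be turned into a simple pre-flight check. The following Python sketch reports missing required parameters, wrong types, and keys that are not listed on this page. The function name validate_asr_params and the error message wording are hypothetical. The check covers only names and types as documented here; it does not verify that a region, language code, or encoding value is actually accepted by Amazon Transcribe.

```python
# Names and types taken from the parameter list on this page.
REQUIRED = {
    "region": str,
    "access_key_id": str,
    "secret_access_key": str,
    "language_code": str,
}
OPTIONAL = {  # nullable: may be omitted or set to None
    "media_sample_rate_hz": int,
    "media_encoding": str,
}


def _has_type(value, expected):
    # bool is a subclass of int in Python, so reject it explicitly.
    return isinstance(value, expected) and not isinstance(value, bool)


def validate_asr_params(params):
    """Return a list of problems found in params (empty if none)."""
    problems = []
    for name, expected in REQUIRED.items():
        if params.get(name) is None:
            problems.append(f"missing required parameter: {name}")
        elif not _has_type(params[name], expected):
            problems.append(f"{name} must be of type {expected.__name__}")
    for name, expected in OPTIONAL.items():
        value = params.get(name)
        if value is not None and not _has_type(value, expected):
            problems.append(f"{name} must be of type {expected.__name__}")
    for name in params:
        if name not in REQUIRED and name not in OPTIONAL:
            problems.append(f"unsupported parameter: {name}")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller log every issue in a configuration at once before attempting to start an agent.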