
OpenAI (Beta)

OpenAI provides real-time speech-to-text with low latency and reliable performance, making it ideal for conversational AI applications.

Sample configuration

The following example shows a starting configuration for the asr parameter, which you pass in the request to start a conversational AI agent.


"asr": {
  "vendor": "openai",
  "params": {
    "api_key": "<openai_key>",
    "input_audio_transcription": {
      "model": "gpt-4o-mini-transcribe",
      "prompt": "Please transcribe the following audio into text. Output in English.",
      "language": "en"
    }
  }
}
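As a rough illustration, the sample configuration can also be assembled in code before it is placed in the request body. The sketch below is not part of the Agora or OpenAI SDKs: the helper name build_openai_asr and the OPENAI_API_KEY environment variable are assumptions made for this example, and the default values are copied from the sample above.

```python
import json
import os


def build_openai_asr(
    api_key,
    model="gpt-4o-mini-transcribe",
    prompt="Please transcribe the following audio into text. Output in English.",
    language="en",
):
    """Return the sample "asr" block as a plain dict (hypothetical helper)."""
    return {
        "asr": {
            "vendor": "openai",
            "params": {
                "api_key": api_key,
                "input_audio_transcription": {
                    "model": model,
                    "prompt": prompt,
                    "language": language,
                },
            },
        }
    }


if __name__ == "__main__":
    # OPENAI_API_KEY is an assumed variable name; fall back to the placeholder.
    key = os.environ.get("OPENAI_API_KEY", "<openai_key>")
    print(json.dumps(build_openai_asr(key), indent=2))
```

Reading the key from the environment keeps it out of source control; how the resulting block is merged into the full request is left to the caller.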

Caution

The parameters listed on this page are validated for use with Conversational AI Engine. To avoid unpredictable behavior, Agora strongly recommends using only the supported parameters. For a complete reference, consult the OpenAI documentation.

Key parameters

asr (required)
  • api_key (string, required)

    The OpenAI API key used to authenticate requests. You must provide a valid key for the service to function.

  • input_audio_transcription (object, required)

    The configuration object for audio transcription. Use this object to specify the model, prompt, and language for the transcription task.

    • model (string, required)

      The OpenAI ASR model to use for transcription. For example, gpt-4o-mini-transcribe.

    • prompt (string, required)

      A prompt that guides the transcription process. Use this parameter to provide context or instructions for how the audio should be transcribed.

    • language (string, required)

      The language code to use for transcription. For example, use en for English.
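Because every field listed above is marked required, it can help to check a configuration locally before sending it. The following is a minimal sketch under stated assumptions: missing_asr_fields is a hypothetical helper, not an Agora or OpenAI API, and treating empty strings as missing is a choice made for this example rather than documented behavior.

```python
def missing_asr_fields(asr):
    """Return dotted paths of required fields that are absent or empty.

    `asr` is the object stored under the "asr" key of the configuration.
    """
    missing = []
    params = asr.get("params") or {}

    if not params.get("api_key"):
        missing.append("params.api_key")

    transcription = params.get("input_audio_transcription")
    if not isinstance(transcription, dict) or not transcription:
        # Without the object itself, its nested fields cannot be checked.
        missing.append("params.input_audio_transcription")
        return missing

    for name in ("model", "prompt", "language"):
        if not transcription.get(name):
            missing.append(f"params.input_audio_transcription.{name}")
    return missing


if __name__ == "__main__":
    incomplete = {
        "vendor": "openai",
        "params": {"input_audio_transcription": {"model": "gpt-4o-mini-transcribe"}},
    }
    for path in missing_asr_fields(incomplete):
        print("missing:", path)
```

An empty list means all required fields are present; the check does not verify that the key, model name, or language code is actually valid.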