Optimize audio
In real-time audio interactions, the rhythm, continuity, and intonation of conversations between humans and AI often differ from those between humans. To improve the AI–human conversation experience, it's important to optimize audio settings.
When using the Android or iOS Video/Voice SDK with the Conversational AI Engine, follow the best practices in this guide to improve conversation fluency and reliability, especially in complex network environments.
Server configuration
When calling the server API to create a conversational agent, use the default values for audio-related parameters to ensure the best audio experience.
Client configuration
To configure the client app, implement the following:
Integrate the required dynamic libraries
For the best Conversational AI Engine audio experience, integrate and load the following dynamic libraries in your project:
- Android
- iOS
- AI Denoising Plugin:
libagora_ai_noise_suppression_extension.so
- AI echo cancellation plug-in:
libagora_ai_echo_cancellation_extension.so
For integration details, refer to App size optimization.
- AI Denoising Plugin:
AgoraAiNoiseSuppressionExtension.xcframework
- AI echo cancellation plug-in:
AgoraAiEchoCancellationExtension.xcframework
For integration details, refer to App size optimization.
Set audio parameters
The settings in this section apply to Video/Voice SDK versions 4.3.1 and above. If you are using an earlier version, upgrade to version 4.5.1 or above or Contact Technical Support.
- Android
- iOS
For the best conversational AI audio experience, apply the following settings:
-
Set the audio scenario: When initializing the engine, set the audio scenario to the AI client scenario. You can also set the scenario before joining a channel by calling the
setAudioScenario
method. -
Configure audio parameters: Call
setParameters
before joining a channel and whenever theonAudioRouteChanged
callback is triggered. This configuration sets audio 3A plug-ins (acoustic echo cancellation, noise suppression, and automatic gain control), the audio sampling rate, the audio processing mode, and other settings. For recommended parameter values, refer to the sample code.
Since Video/Voice SDK versions 4.3.1 to 4.5.0 do not support the AI client audio scenario, set the scenario to AUDIO_SCENARIO_CHORUS
to improve the audio experience. However, the audio experience cannot be aligned with versions 4.5.1 and above. To get the best audio experience, upgrade the SDK to version 4.5.1 or higher.
The following sample code defines a setAudioConfigParameters
function to configure audio parameters. Call this function before joining a channel and whenever the audio route changes.
private var rtcEngine: RtcEngineEx? = null
private var mAudioRouting = Constants.AUDIO_ROUTE_DEFAULT
// Set audio configuration parameters
private fun setAudioConfigParameters(routing: Int) {
mAudioRouting = routing
rtcEngine?.apply {
setParameters("{"che.audio.aec.split_srate_for_48k":16000}")
setParameters("{"che.audio.sf.enabled":true}")
setParameters("{"che.audio.sf.stftType":6}")
setParameters("{"che.audio.sf.ainlpLowLatencyFlag":1}")
setParameters("{"che.audio.sf.ainsLowLatencyFlag":1}")
setParameters("{"che.audio.sf.procChainMode":1}")
setParameters("{"che.audio.sf.nlpDynamicMode":1}")
if (routing == Constants.AUDIO_ROUTE_HEADSET // 0
|| routing == Constants.AUDIO_ROUTE_EARPIECE // 1
|| routing == Constants.AUDIO_ROUTE_HEADSETNOMIC // 2
|| routing == Constants.AUDIO_ROUTE_BLUETOOTH_DEVICE_HFP // 5
|| routing == Constants.AUDIO_ROUTE_BLUETOOTH_DEVICE_A2DP) { // 10
setParameters("{"che.audio.sf.nlpAlgRoute":0}")
} else {
setParameters("{"che.audio.sf.nlpAlgRoute":1}")
}
setParameters("{"che.audio.sf.ainlpModelPref":10}")
setParameters("{"che.audio.sf.nsngAlgRoute":12}")
setParameters("{"che.audio.sf.ainsModelPref":10}")
setParameters("{"che.audio.sf.nsngPredefAgg":11}")
setParameters("{"che.audio.agc.enable":false}")
}
}
// Create and initialize the RTC engine
fun createRtcEngine(rtcCallback: IRtcEngineEventHandler): RtcEngineEx {
val config = RtcEngineConfig()
config.mContext = AgentApp.instance()
config.mAppId = ServerConfig.rtcAppId
config.mChannelProfile = Constants.CHANNEL_PROFILE_LIVE_BROADCASTING
// Set the audio scene to AI dialogue scene (supported by 4.5.1 and above)
// Version 4.3.1 ~ 4.5.0 is set to chorus scene AUDIO_SCENARIO_CHORUS
config.mAudioScenario = Constants.AUDIO_SCENARIO_AI_CLIENT
// Register audio route change callback
config.mEventHandler = object : IRtcEngineEventHandler() {
override fun onAudioRouteChanged(routing: Int) {
super.onAudioRouteChanged(routing)
// Set audio related parameters
setAudioConfigParameters(routing)
}
}
try {
rtcEngine = (RtcEngine.create(config) as RtcEngineEx).apply {
// Load the audio plugin
loadExtensionProvider("ai_echo_cancellation_extension")
loadExtensionProvider("ai_noise_suppression_extension")
}
} catch (e: Exception) {
Log.e("CovAgoraManager", "createRtcEngine error: $e")
}
return rtcEngine!!
}
// Join the channel
fun joinChannel(rtcToken: String, channelName: String, uid: Int, isIndependent: Boolean = false) {
// Initialize audio configuration parameters
setAudioConfigParameters(mAudioRouting)
// Configure channel options and join the channel
val options = ChannelMediaOptions()
options.clientRoleType = CLIENT_ROLE_BROADCASTER
options.publishMicrophoneTrack = true
options.publishCameraTrack = false
options.autoSubscribeAudio = true
options.autoSubscribeVideo = false
val ret = rtcEngine?.joinChannel(rtcToken, channelName, uid, options)
}
For the best conversational AI audio experience, apply the following settings:
-
Set the audio scenario: When initializing the engine, set the audio scenario to the AI client scenario. You can also set the scenario before joining a channel by calling the
setAudioScenario
method. -
Configure audio parameters: Call
setParameters
before joining a channel and whenever thertcEngine:didAudioRouteChanged:
callback is triggered. This configuration sets audio 3A plug-ins (acoustic echo cancellation, noise suppression, and automatic gain control), the audio sampling rate, the audio processing mode, and other settings. For recommended parameter values, refer to the sample code.
Since Video/Voice SDK versions 4.3.1 to 4.5.0 do not support the AI client audio scenario, set the scenario to AgoraAudioScenarioChorus
to improve the audio experience. However, the audio experience cannot be aligned with versions 4.5.1 and above. To get the best audio experience, upgrade the SDK to version 4.5.1 or higher.
The following sample code defines a setAudioConfigParameters
function to configure audio parameters. Call this function before joining a channel and whenever the audio route changes.
class RTCManager: NSObject {
private var rtcEngine: AgoraRtcEngineKit!
private var audioDumpEnabled: Bool = false
private var audioRouting = AgoraAudioOutputRouting.default
// Set audio related parameters
private func setAudioConfigParameters(routing: AgoraAudioOutputRouting) {
audioRouting = routing
rtcEngine.setParameters("{"che.audio.aec.split_srate_for_48k":16000}")
rtcEngine.setParameters("{"che.audio.sf.enabled":true}")
rtcEngine.setParameters("{"che.audio.sf.stftType":6}")
rtcEngine.setParameters("{"che.audio.sf.ainlpLowLatencyFlag":1}")
rtcEngine.setParameters("{"che.audio.sf.ainsLowLatencyFlag":1}")
rtcEngine.setParameters("{"che.audio.sf.procChainMode":1}")
rtcEngine.setParameters("{"che.audio.sf.nlpDynamicMode":1}")
if routing == .headset ||
routing == .earpiece ||
routing == .headsetNoMic ||
routing == .bluetoothDeviceHfp ||
routing == .bluetoothDeviceA2dp {
rtcEngine.setParameters("{"che.audio.sf.nlpAlgRoute":0}")
} else {
rtcEngine.setParameters("{"che.audio.sf.nlpAlgRoute":1}")
}
rtcEngine.setParameters("{"che.audio.sf.ainlpModelPref":10}")
rtcEngine.setParameters("{"che.audio.sf.nsngAlgRoute":12}")
rtcEngine.setParameters("{"che.audio.sf.ainsModelPref":10}")
rtcEngine.setParameters("{"che.audio.sf.nsngPredefAgg":11}")
rtcEngine.setParameters("{"che.audio.agc.enable":false}")
}
}
extension RTCManager: RTCManagerProtocol {
func createRtcEngine(delegate: AgoraRtcEngineDelegate) -> AgoraRtcEngineKit {
let config = AgoraRtcEngineConfig()
config.appId = AppContext.shared.appId
config.channelProfile = .liveBroadcasting
// Set the audio scene to AI dialogue scene (supported by 4.5.1 and above)
// Versions 4.3.1 ~ 4.5.0 support chorus scenes .chorus
config.audioScenario = .aiClient
rtcEngine = AgoraRtcEngineKit.sharedEngine(with: config, delegate: delegate)
// Register audio route change callback
rtcEngine.addDelegate(self)
return rtcEngine
}
func joinChannel(rtcToken: String, channelName: String, uid: String) {
// Initialize audio configuration parameters
setAudioConfigParameters(routing: audioRouting)
// Configure channel options and join the channel
let options = AgoraRtcChannelMediaOptions()
options.clientRoleType = .broadcaster
options.publishMicrophoneTrack = true
options.publishCameraTrack = false
options.autoSubscribeAudio = true
options.autoSubscribeVideo = false
let ret = rtcEngine.joinChannel(byToken: rtcToken, channelId: channelName, uid: UInt(uid) ?? 0, mediaOptions: options)
}
}
// Implement the AgoraRtcEngineDelegate interface to handle audio route change callbacks
extension RTCManager: AgoraRtcEngineDelegate {
public func rtcEngine(_ engine: AgoraRtcEngineKit, didAudioRouteChanged routing: AgoraAudioOutputRouting) {
setAudioConfigParameters(routing: routing)
}
}
Reference
This section contains content that completes the information on this page, or points you to documentation that explains other aspects to this product.
Sample project
Refer to the following open-source sample code to set audio-related parameters.
- Android
- iOS