Interrupt agent
When interacting with an agent, you may need to interrupt the agent to begin a new round of conversation. The Agora Conversational AI Engine supports agent interruption in the following ways:
- Voice interruption: The engine detects user voice input and automatically stops the agent’s response.
- Manual interruption: Your app can explicitly stop the agent by calling a REST API or client SDK method—typically triggered by a button tap or custom command.
This page describes how to implement agent interruption in your app.
Voice interruption
Conversational AI Engine supports a graceful interruption feature that allows a user’s voice input to automatically interrupt the speaking agent. This enables quicker response times and more natural, fluid interactions.
Graceful interruption is a value-added service. A separate fee is charged once the service is enabled. For details, see Pricing.
By default, this feature is disabled. To enable it, set advanced_features.enable_aivad to true when calling Start a conversational AI agent. The following example shows how to do this:
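As a rough illustration, the request body could be assembled as in the TypeScript sketch below. Only advanced_features.enable_aivad comes from this page. The helper name, the name and channel fields, and the nesting under properties are assumptions made for the sketch, so check them against the Start a conversational AI agent reference before use.

```typescript
// Sketch only: everything except advanced_features.enable_aivad is an assumption.
interface StartAgentBody {
  name: string;
  properties: {
    channel: string;
    advanced_features: { enable_aivad: boolean };
    // Other settings (LLM, TTS, and so on) are omitted from this sketch.
    [key: string]: unknown;
  };
}

// Hypothetical helper that returns a body with voice interruption enabled.
function buildStartBodyWithVoiceInterruption(
  name: string,
  channel: string
): StartAgentBody {
  return {
    name,
    properties: {
      channel,
      advanced_features: { enable_aivad: true },
    },
  };
}

const startBody = buildStartBodyWithVoiceInterruption("demo-agent", "demo-channel");
console.log(JSON.stringify(startBody, null, 2));
```

The resulting object would be serialized as the JSON body of the start request, together with whatever other fields your agent configuration requires.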
If the request succeeds, the API returns a 200 status code and a response body like the following:
Manual interruption
Conversational AI Engine supports actively triggering an interruption by calling RESTful APIs or client component APIs. This allows users to interrupt the agent through a button click or a specific command.
Call the RESTful API
Use the Interrupt agent API to manually initiate an interruption request.
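The sketch below shows one way an app might prepare such a request in TypeScript. The path layout (/agents/{agentId}/interrupt under a project base URL), the helper name, and the header set are assumptions for illustration. Take the actual endpoint and authentication scheme from the Interrupt agent API reference.

```typescript
// Sketch only: URL layout and headers are assumptions, not the documented endpoint.
interface InterruptRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
}

// Hypothetical helper that prepares an interruption request for a given agent.
function buildInterruptRequest(
  projectBaseUrl: string, // placeholder for the project-scoped REST base URL
  agentId: string,        // ID of the agent to interrupt
  authorization: string   // value of the Authorization header
): InterruptRequest {
  return {
    url: `${projectBaseUrl}/agents/${encodeURIComponent(agentId)}/interrupt`,
    method: "POST",
    headers: {
      Authorization: authorization,
      "Content-Type": "application/json",
    },
  };
}

const interruptRequest = buildInterruptRequest(
  "https://example.invalid/projects/APP_ID",
  "agent-123",
  "Basic PLACEHOLDER"
);
console.log(interruptRequest.url);
```

The returned object can be handed to any HTTP client, for example from a button's click handler.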
If the request is successful, the API returns a 200 status code and the following response body:
Call the client toolkit API
Agora provides a set of flexible, scalable, and standardized client components for Conversational AI Engine. These components support iOS, Android, and Web platforms and encapsulate scenario-based APIs. You can use them to integrate Agora Real-Time Communication (RTC) and Signaling (RTM) capabilities, enabling the following features:
- Interrupt the agent
- Display real-time subtitles
- Receive event notifications
- Optimize audio (Android and iOS only)
Before you begin, make sure you:
- Integrate Video SDK v4.5.1 or later and follow the Quickstart guide to implement basic real-time audio and video features.
- Enable the Signaling service for your project in the Agora Console and follow the Signaling Quickstart to implement real-time messaging.
- Implement the basic logic to communicate with a Conversational AI agent.
- Ensure that the RTC engine instance is initialized and Signaling is logged in. The toolkit does not handle initialization, lifecycle management, authentication, or login for Video SDK or Signaling.
Integrate the toolkit
- Android: Copy the convoaiApi folder to your project and import it before calling API methods.
- iOS: Copy the ConversationalAIAPI folder to your project and import it before calling API methods.
- Web: Copy the conversational-ai-api file to your project and import it before calling API methods.
Refer to the component structure to understand the role of each file.
Initialize the component
Create a configuration object for the RTC engine and Signaling client instances, then use it to initialize the component instance.
- Android:

  // Create a configuration object for the RTC and RTM instances
  val config = ConversationalAIAPIConfig(
      rtcEngine = rtcEngineInstance,
      rtmClient = rtmClientInstance,
      enableLog = true
  )
  // Create the component instance
  val api = ConversationalAIAPIImpl(config)

- iOS:

  // Create a configuration object for the RTC and RTM instances
  let config = ConversationalAIAPIConfig(
      rtcEngine: rtcEngine,
      rtmEngine: rtmEngine,
      enableLog: true
  )
  // Create the component instance
  convoAIAPI = ConversationalAIAPIImpl(config: config)

- Web:

  // Initialize the component with the RTC and RTM instances
  ConversationalAIAPI.init({
      rtcEngine,
      rtmEngine,
  })
  // Get the API instance (singleton)
  const conversationalAIAPI = ConversationalAIAPI.getInstance()
Configure the conversational AI agent
Call Start a conversational AI agent using the following parameter settings:
- advanced_features.enable_rtm: true (required): Start the Signaling service.
- parameters.data_channel: "rtm" (required): Enable the RTM data transmission channel.
- parameters.enable_metrics: true (enable on demand): Receive agent performance data.
- parameters.enable_error_message: true (enable on demand): Receive agent error events.
After the call is successful, the agent joins the specified RTC channel and the user can start interacting with the agent.
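The parameter settings listed for the start call can be collected in one place, as in the following TypeScript sketch. The four keys and their values are the ones named in this section. The helper name, the option names, and the exact position of these objects inside the full request body are assumptions.

```typescript
// Settings the client toolkit relies on, as named in this section.
interface ToolkitAgentSettings {
  advanced_features: { enable_rtm: boolean };
  parameters: {
    data_channel: "rtm";
    enable_metrics?: boolean;
    enable_error_message?: boolean;
  };
}

// Hypothetical helper: required settings are always set, optional ones on demand.
function buildToolkitSettings(
  opts: { metrics?: boolean; errorMessages?: boolean } = {}
): ToolkitAgentSettings {
  const parameters: ToolkitAgentSettings["parameters"] = { data_channel: "rtm" };
  if (opts.metrics) parameters.enable_metrics = true;
  if (opts.errorMessages) parameters.enable_error_message = true;
  return { advanced_features: { enable_rtm: true }, parameters };
}

const toolkitSettings = buildToolkitSettings({ metrics: true });
console.log(JSON.stringify(toolkitSettings));
```

Keeping the two required settings unconditional makes it harder to start an agent that the toolkit cannot reach over Signaling.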
Interrupt the agent
Call the interrupt method to interrupt the agent.
- Android:

  api.interrupt("agentId") { error -> /* ... */ }

- iOS:

  convoAIAPI.interrupt(agentUserId: "\(agentUid)") { error in
      if let error = error {
          print("Interruption failed: \(error.message)")
      } else {
          print("Interruption succeeded")
      }
  }

- Web:

  await conversationalAIAPI.interrupt(`${agent_rtc_uid}`)
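On the Web, an app will usually want to trigger the call from a UI event and handle a failed interruption. The TypeScript sketch below wraps the call for that purpose. The Interruptible interface is a hypothetical minimal shape with only the interrupt method shown on this page, and it assumes that a failed interruption rejects the returned promise, which this page does not confirm.

```typescript
// Hypothetical minimal shape of the toolkit instance; only interrupt() is from this page.
interface Interruptible {
  interrupt(agentUserId: string): Promise<unknown>;
}

// Returns true if the interruption call resolved, false if it rejected.
// Assumption: the toolkit signals failure by rejecting the promise.
async function tryInterrupt(
  api: Interruptible,
  agentRtcUid: string | number
): Promise<boolean> {
  try {
    // The toolkit call on this page passes the agent's RTC user ID as a string.
    await api.interrupt(`${agentRtcUid}`);
    return true;
  } catch (err) {
    console.error("Interruption failed:", err);
    return false;
  }
}
```

A button handler could then call tryInterrupt(conversationalAIAPI, agent_rtc_uid) and update the UI based on the returned boolean.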
Destroy the component
When the agent interaction ends, destroy the component instance to release all resources.
- Android:

  api.destroy()

- iOS:

  convoAIAPI.destroy()

- Web:

  conversationalAIAPI.destroy()
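To make sure the instance is destroyed even when the interaction ends with an error, the call can be placed in a finally block. The following TypeScript sketch shows the pattern. The Destroyable interface and the withComponent helper are hypothetical names introduced for this example and are not part of the toolkit.

```typescript
// Hypothetical minimal shape; only destroy() is taken from this page.
interface Destroyable {
  destroy(): void;
}

// Runs a session and always destroys the component afterwards,
// whether the session resolves or throws.
async function withComponent<T extends Destroyable, R>(
  api: T,
  session: (api: T) => Promise<R>
): Promise<R> {
  try {
    return await session(api);
  } finally {
    api.destroy();
  }
}
```

This keeps resource cleanup in one place instead of repeating destroy() on every exit path.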
Reference
Sample project
Agora provides a sample project for your reference. Download or view the source code for a complete example.
Component structure
The structure of the client component folder and the functions of each file are as follows:
Copy only the following files and folders to integrate the client component. You do not need to copy other files.
- Android:
  - IConversationalAIAPI.kt: API interface and related data structures and enumerations
  - ConversationalAIAPIImpl.kt: ConversationalAI API main implementation logic
  - ConversationalAIUtils.kt: Tool functions and event callback management
  - subRender/v3/: Subtitle module
    - TranscriptionController.kt: Subtitle controller
    - MessageParser.kt: Message parser
- iOS:
  - ConversationalAIAPI.swift: API interface and related data structures and enumerations
  - ConversationalAIAPIImpl.swift: ConversationalAI API main implementation logic
  - Transcription/
    - TranscriptionController.swift: Subtitle controller
- Web:
  - index.ts: API class
  - type.ts: API interface and related data structures and enumerations
  - utils/index.ts: API utility functions
  - utils/events.ts: Event management class, which can be extended to easily implement event monitoring and broadcasting
  - utils/sub-render.ts: Subtitle module