iOS toolkit API
The iOS toolkit API provides the following classes and methods.
ConversationalAIAPI class
| API | Description |
|---|---|
chat | Send a message to the agent. |
addHandler | Adds an event handler to receive callbacks. |
removeHandler | Removes an event handler. |
subscribeMessage | Subscribe to channel messages. |
unsubscribeMessage | Unsubscribe from channel messages. |
interrupt | Interrupt the agent's current speech or task processing. |
loadAudioSettings[1/2] | Set audio best practice parameters for optimal performance. |
loadAudioSettings[2/2] | Set audio best practice parameters for specific scenarios. |
destroy | Destroys the API instance and releases all resources. |
chat
Sends a message to the agent for processing.
You can use this method to send text and image messages to the agent, and the completion callback indicates the success or failure of the operation.
| Parameter | Type | Description |
|---|---|---|
agentUserId | String | The RTM user ID of the agent. Must be globally unique. |
message | ChatMessage | The message object to send, such as a TextMessage or an ImageMessage. See ChatMessage for details. |
completion | @escaping (ConversationalAIAPIError?) -> Void | Callback function invoked when the operation completes. |
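The completion contract described above can be sketched as follows. This is a minimal illustration, not the toolkit's implementation: `ChatSending`, `FakeSender`, `DemoTextMessage`, and `sendAndDescribe` are hypothetical names, and the `ConversationalAIAPIError` and `ChatMessage` declarations are simplified stand-ins so the snippet compiles without the SDK. Only the `chat(agentUserId:message:completion:)` shape is taken from the parameter table.

```swift
// Simplified stand-ins for the SDK types named in this reference (assumptions).
struct ConversationalAIAPIError: Error {
    let code: Int
    let message: String
}

protocol ChatMessage {}

// Hypothetical message type used only for this sketch.
struct DemoTextMessage: ChatMessage {
    let text: String
}

// Hypothetical protocol exposing only the documented `chat` signature.
protocol ChatSending {
    func chat(agentUserId: String,
              message: ChatMessage,
              completion: @escaping (ConversationalAIAPIError?) -> Void)
}

// Fake sender for exercising the call site; it fails when the agent ID is empty.
final class FakeSender: ChatSending {
    func chat(agentUserId: String,
              message: ChatMessage,
              completion: @escaping (ConversationalAIAPIError?) -> Void) {
        if agentUserId.isEmpty {
            completion(ConversationalAIAPIError(code: -1, message: "empty agentUserId"))
        } else {
            completion(nil)
        }
    }
}

// Sends a text message and reduces the completion result to one log line.
func sendAndDescribe(_ sender: ChatSending,
                     to agentUserId: String,
                     text: String,
                     report: @escaping (String) -> Void) {
    sender.chat(agentUserId: agentUserId, message: DemoTextMessage(text: text)) { error in
        if let error = error {
            report("chat failed: \(error.code) \(error.message)")
        } else {
            report("chat sent")
        }
    }
}

sendAndDescribe(FakeSender(), to: "agent_1", text: "Hello") { print($0) }  // prints "chat sent"
```

In an app, the real API instance would take the place of `FakeSender`; the branching on a nil or non-nil error stays the same.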
addHandler
Adds an event handler to receive callbacks.
You can register a delegate to receive session events, state changes, and other notifications.
| Parameter | Type | Description |
|---|---|---|
handler | ConversationalAIAPIEventHandler | An event handler that implements the ConversationalAIAPIEventHandler protocol. See ConversationalAIAPIEventHandler. |
removeHandler
Removes an event handler.
| Parameter | Type | Description |
|---|---|---|
handler | ConversationalAIAPIEventHandler | The event handler to remove. See ConversationalAIAPIEventHandler. |
subscribeMessage
Subscribe to channel messages.
Sets the channel parameters and registers the message subscription callbacks. Call this method whenever the channel changes, typically when the agent starts.
| Parameter | Type | Description |
|---|---|---|
channelName | String | The name of the channel to subscribe to. |
completion | (ConversationalAIAPIError?) -> Void | A callback that returns error information when subscription fails. See ConversationalAIAPIError. |
unsubscribeMessage
Unsubscribe from channel messages.
Call this method to stop receiving messages from the specified channel, for example when the connection to the agent is closed.
| Parameter | Type | Description |
|---|---|---|
channelName | String | The name of the channel to unsubscribe from. |
completion | (ConversationalAIAPIError?) -> Void | The callback after the unsubscription operation is completed. You can get error information through this callback. See ConversationalAIAPIError. |
interrupt
Interrupts the agent's current speech or task processing.
This method can be used to interrupt an agent that is currently speaking or processing a task.
If the completion callback receives an error, the interrupt message failed to send. If the error is nil, the message was sent successfully, but this does not guarantee that the agent was actually interrupted.
| Parameter | Type | Description |
|---|---|---|
agentUserId | String | The RTM user ID of the agent. Must be globally unique. |
completion | (ConversationalAIAPIError?) -> Void | The callback function when the operation is completed. You can get the result or error information of the operation through the parameters of the callback. See ConversationalAIAPIError. |
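Because a nil error only means that the interrupt request was sent, it can help to keep "sent" and "confirmed" as separate states. The sketch below is one way to do that, under stated assumptions: `InterruptTracker` and `InterruptOutcome` are hypothetical names, `ConversationalAIAPIError` is a simplified stand-in, and treating a later `onAgentInterrupted` callback as confirmation is an assumption made for logging purposes only, since that callback may not be synchronized with the agent's state.

```swift
// Simplified stand-in for the SDK error type (assumption).
struct ConversationalAIAPIError: Error {
    let code: Int
    let message: String
}

// Hypothetical outcome model for a single interrupt request.
enum InterruptOutcome: Equatable {
    case requestFailed(code: Int)  // the interrupt message could not be sent
    case requestSent               // sent, but the interruption is not confirmed
    case confirmed                 // an interrupt event was observed afterwards
}

final class InterruptTracker {
    private(set) var outcome: InterruptOutcome?

    // Call from the `interrupt` completion callback.
    func handleCompletion(_ error: ConversationalAIAPIError?) {
        if let error = error {
            outcome = .requestFailed(code: error.code)
        } else {
            outcome = .requestSent
        }
    }

    // Call from onAgentInterrupted; only upgrades a request that was sent.
    func handleInterruptEvent() {
        if outcome == InterruptOutcome.requestSent {
            outcome = .confirmed
        }
    }
}
```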
loadAudioSettings[1/2]
Set audio best practice parameters for optimal performance.
Sets the audio parameters needed for optimal performance in agent conversations. By default, the .aiClient audio scenario is used.
To enable audio best practices, you must call this method before each joinChannel call.
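One way to keep the "before each joinChannel call" rule from being forgotten is to route every join through a small wrapper. The following is a sketch under assumptions, not SDK code: `AudioSettingsLoading`, `CallRecorder`, and `joinWithAudioBestPractices` are hypothetical names, and the join itself is passed in as a closure so the snippet does not depend on any real RTC engine signature.

```swift
// Hypothetical stand-in exposing only the parameterless overload documented here.
protocol AudioSettingsLoading {
    func loadAudioSettings()
}

// Records the call order so the sketch can be checked without the SDK.
final class CallRecorder: AudioSettingsLoading {
    private(set) var calls: [String] = []

    func loadAudioSettings() {
        calls.append("loadAudioSettings")
    }

    // Placeholder for the app's real join call.
    func join(channel: String) {
        calls.append("joinChannel(\(channel))")
    }
}

// Reloads the audio settings, then performs the join supplied by the caller.
func joinWithAudioBestPractices(api: AudioSettingsLoading, join: () -> Void) {
    api.loadAudioSettings()
    join()
}

let recorder = CallRecorder()
joinWithAudioBestPractices(api: recorder) { recorder.join(channel: "demo") }
joinWithAudioBestPractices(api: recorder) { recorder.join(channel: "demo") }
// recorder.calls now alternates loadAudioSettings and joinChannel(demo).
```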
loadAudioSettings[2/2]
Set audio best practice parameters for specific scenarios.
This method allows you to configure the audio parameters required for optimal performance in your agent conversations.
If you need to enable audio best practices, you must call this method before each joinChannel call.
| Parameter | Type | Description |
|---|---|---|
scenario | AgoraAudioScenario | The audio scenario. See AgoraAudioScenario. If you enable the AI Avatar feature, set the scenario to default for better mixing results. |
destroy
Destroys the API instance and releases all resources.
Calling this method destroys the current API instance and releases all resources. After calling this method, the instance cannot be used again. Please call this method when you no longer need to use the API.
ConversationalAIAPIEventHandler class
| API | Description |
|---|---|
onMessageError | A callback triggered when an error occurs during message processing. |
onMessageReceiptUpdated | Image message information update callback. |
onAgentStateChanged | Callback when the agent status changes. |
onAgentInterrupted | The callback triggered when an interrupt event occurs. |
onAgentMetrics | A callback that is triggered when performance metrics are available. |
onAgentError | Callback when an error occurs in the AI module. |
onTranscriptUpdated | Transcription content update callback. |
onMessageError
This callback is triggered when an error occurs during message processing. For example, if sending a chat message fails, an error message is returned.
| Parameter | Type | Description |
|---|---|---|
agentUserId | String | RTM user ID of the agent. |
error | MessageError | MessageError object containing the error type and details. |
onMessageReceiptUpdated
Image message information update callback.
This callback is triggered when an image is processed in a session and provides image metadata.
| Parameter | Type | Description |
|---|---|---|
agentUserId | String | RTM user ID of the agent. |
messageReceipt | MessageReceipt | Message receipt containing the type, module, and image information. See MessageReceipt for more details. |
onAgentStateChanged
Callback when the agent status changes.
This callback is triggered when the agent state changes, such as switching from idle to silent, listening, thinking, or speaking. You can use this callback to update the user interface or track the flow of the conversation.
| Parameter | Type | Description |
|---|---|---|
agentUserId | String | The RTM user ID of the agent. |
event | StateChangeEvent | Agent state change event, including the state, turn ID, timestamp, and reason. See StateChangeEvent. |
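As an illustration of using this callback to update the user interface, the sketch below maps each agent state to a status label. The `AgentState` and `StateChangeEvent` declarations are local stand-ins that copy the names and raw values listed in this reference; `statusLabel(for:)`, `agentState(fromRawValue:)`, and the label strings are hypothetical.

```swift
// Stand-in mirroring the AgentState values listed in this reference (0 through 5).
enum AgentState: Int {
    case idle = 0, silent, listening, thinking, speaking, unknown
}

// Stand-in mirroring the StateChangeEvent fields (assumption: plain value type).
struct StateChangeEvent {
    let state: AgentState
    let turnId: Int
    let timestamp: Double  // documented as TimeInterval, in milliseconds
    let reason: String
}

// Falls back to .unknown for raw values that this enum does not recognize.
func agentState(fromRawValue raw: Int) -> AgentState {
    AgentState(rawValue: raw) ?? .unknown
}

// Maps an event to the text shown in a status indicator.
func statusLabel(for event: StateChangeEvent) -> String {
    switch event.state {
    case .idle:      return "Idle"
    case .silent:    return "Waiting"
    case .listening: return "Listening..."
    case .thinking:  return "Thinking..."
    case .speaking:  return "Speaking..."
    case .unknown:   return "Unknown"
    }
}
```

Inside an `onAgentStateChanged` implementation, the returned string would typically be assigned to a label on the main thread.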
onAgentInterrupted
The callback triggered when an interrupt event occurs.
This callback may not be synchronized with the agent's state. It is not recommended to process business logic in this callback.
| Parameter | Type | Description |
|---|---|---|
agentUserId | String | The RTM user ID of the agent. |
event | InterruptEvent | Interrupt event, including the turn ID and timestamp. See InterruptEvent. |
onAgentMetrics
A callback that is triggered when performance metrics are available.
This callback provides performance data, such as LLM inference latency and TTS speech synthesis latency, for monitoring system performance.
This metrics callback is not necessarily synchronized with the agent's state, so it is not recommended to process business logic in this callback.
| Parameter | Type | Description |
|---|---|---|
agentUserId | String | The RTM user ID of the agent. |
metrics | Metric | Performance metric, including the type, name, value, and timestamp. See Metric. |
onAgentError
Callback when an error occurs in the AI module.
This callback is called when an error occurs in a module component (such as LLM, TTS, etc.). It can be used for error monitoring, logging, and implementing service degradation strategies.
This callback is not necessarily synchronized with the state of the agent, so it is not recommended to process business logic in this callback.
| Parameter | Type | Description |
|---|---|---|
agentUserId | String | The RTM user ID of the agent. |
error | ModuleError | Module error, including type, error code, error message and timestamp. See ModuleError. |
onTranscriptUpdated
Transcript content update callback.
This callback is triggered when the speech transcript content in the session is updated.
| Parameter | Type | Description |
|---|---|---|
agentUserId | String | The RTM user ID of the agent. |
transcript | Transcript | Transcript data, including text content, status and metadata. See Transcript. |
Structures
ChatMessage
Defines a common interface for different types of chat messages.
| Parameter | Type | Description |
|---|---|---|
messageType | ChatMessageType | Message type. See ChatMessageType for details. |
ImageMessage
Used to send image content to an agent.
Images are passed as URLs, which are HTTP/HTTPS links pointing to the image file (recommended for large images). Example usage: ImageMessage(uuid: "img_123", url: "https://example.com/image.jpg")
| Parameter | Type | Description |
|---|---|---|
messageType | ChatMessageType | Type of the message. See ChatMessageType for more details. |
uuid | String | Unique identifier for the image message. The agent uses this UUID to identify the image. |
url | String? | HTTP/HTTPS link to the image file, which the agent uses to download and process the image. |
MessageReceipt
The MessageReceipt model is used to track message processing status and metadata.
| Parameter | Type | Description |
|---|---|---|
moduleType | ModuleType | Module type. See ModuleType for more details. |
messageType | ChatMessageType | Type of the message. See ChatMessageType for more details. |
message | String | Image information. |
turnId | Int | Conversation turn ID, used to identify a specific turn in the session. |
MessageError
A data class for processing and reporting message errors.
Contains the error type, error code, error description, and timestamp.
| Parameter | Type | Description |
|---|---|---|
type | ChatMessageType | Message error type. See ChatMessageType. |
code | Int | Error code to identify specific error conditions. |
message | String | Error description, providing detailed error explanation, usually a JSON string containing resource information. |
timestamp | TimeInterval | Timestamp of the error (milliseconds since January 1, 1970, UTC). |
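Two details in this table are easy to get wrong: the timestamp is in milliseconds, while Foundation's `Date` expects seconds, and the message is usually, but not always, a JSON string. The sketch below shows one way to handle both. The `MessageError` struct is a reduced stand-in, the helper names are hypothetical, and the `uuid` key in the usage line is an invented example rather than a documented field.

```swift
import Foundation

// Reduced stand-in with the fields this sketch needs (assumption).
struct MessageError {
    let code: Int
    let message: String
    let timestamp: TimeInterval  // milliseconds since January 1, 1970, UTC
}

// Converts the millisecond timestamp to a Date, which is based on seconds.
func errorDate(_ error: MessageError) -> Date {
    Date(timeIntervalSince1970: error.timestamp / 1000)
}

// Returns the message as a dictionary when it is a JSON object, or nil otherwise.
func errorDetails(_ error: MessageError) -> [String: Any]? {
    guard let data = error.message.data(using: .utf8) else { return nil }
    return (try? JSONSerialization.jsonObject(with: data)) as? [String: Any]
}

let sample = MessageError(code: 1,
                          message: "{\"uuid\":\"img_123\"}",  // hypothetical payload
                          timestamp: 1_700_000_000_000)
print(errorDate(sample), errorDetails(sample) ?? [:])
```

Returning nil for non-JSON messages lets the caller fall back to showing the raw string.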
TextMessage
Used to send natural language text content to an agent.
Supports priority control and interruptibility settings, enabling fine-grained control over how the AI processes and responds to text input.
| Parameter | Type | Description |
|---|---|---|
messageType | ChatMessageType | Message type. See ChatMessageType. |
priority | Priority | Message handling priority. See Priority. |
responseInterruptable | Bool | Whether the response to this message can be interrupted by a higher-priority message: true (default): The response can be interrupted. false: The response cannot be interrupted. |
text | String? | Text content of the message. |
StateChangeEvent
Indicates an agent state change event, including complete state information and timestamp.
Used to track session flow and update UI status indicators.
| Parameter | Type | Description |
|---|---|---|
state | AgentState | Current agent state. See AgentState. |
turnId | Int | Conversation turn ID, used to identify a specific turn in the session. |
timestamp | TimeInterval | Timestamp in milliseconds since Unix epoch (January 1, 1970 UTC). |
reason | String | The reason for the state change. |
InterruptEvent
Indicates a session interruption event.
It is usually triggered when the user actively interrupts the AI speech or the system detects a high-priority message. It is used to record the interruption behavior and perform corresponding processing.
| Parameter | Type | Description |
|---|---|---|
turnId | Int | The ID of the interrupted conversation turn. |
timestamp | TimeInterval | Timestamp in milliseconds since Unix epoch (January 1, 1970 UTC). |
Metric
Used to record and transmit system performance data.
Examples include LLM inference latency and TTS synthesis latency. This data can be used for performance monitoring, system optimization, and user experience improvement.
| Parameter | Type | Description |
|---|---|---|
type | ModuleType | Metric type. See ModuleType. |
name | String | The name of the specific performance metric. |
value | Double | Metric value, usually latency (milliseconds) or other quantitative metrics. |
timestamp | TimeInterval | Timestamp in milliseconds since Unix epoch (January 1, 1970 UTC). |
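For performance monitoring, metric values can be aggregated per module. The sketch below keeps a running average for each module type. It assumes that every recorded value is a latency in milliseconds, which the table describes as the usual but not the only case; `LatencyStats` is a hypothetical helper, and `ModuleType` and `Metric` are local stand-ins that mirror the names in this reference.

```swift
// Stand-in mirroring the ModuleType values listed in this reference.
enum ModuleType: Int {
    case llm = 0, mllm, tts, context, unknown
}

// Stand-in mirroring the Metric fields.
struct Metric {
    let type: ModuleType
    let name: String
    let value: Double      // assumed here to be a latency in milliseconds
    let timestamp: Double  // milliseconds since Unix epoch
}

// Running totals per module, from which an average can be derived.
struct LatencyStats {
    private var totals: [ModuleType: (sum: Double, count: Int)] = [:]

    mutating func record(_ metric: Metric) {
        let current = totals[metric.type] ?? (sum: 0, count: 0)
        totals[metric.type] = (sum: current.sum + metric.value, count: current.count + 1)
    }

    // Returns nil when no metric has been recorded for the module.
    func average(for type: ModuleType) -> Double? {
        guard let total = totals[type], total.count > 0 else { return nil }
        return total.sum / Double(total.count)
    }
}
```

Recording from `onAgentMetrics` and reading the averages elsewhere keeps business logic out of the callback itself.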
ModuleError
Used to process and report AI module related error information.
Contains error type, error code, error description, and timestamp to facilitate error monitoring, logging, and troubleshooting.
| Parameter | Type | Description |
|---|---|---|
type | ModuleType | Error type. See ModuleType. |
code | Int | Error code that identifies the specific error condition. |
message | String | Detailed description of the error. |
timestamp | TimeInterval | Timestamp in milliseconds since Unix epoch (January 1, 1970 UTC). |
Transcript
Used to represent a user-visible transcript message.
Complete data class for rendering transcripts at the UI level.
| Parameter | Type | Description |
|---|---|---|
turnId | Int | Unique identifier for the session turn. |
userId | String | The user identifier associated with this transcript. |
text | String | The actual transcript text content. |
status | TranscriptStatus | The current status of the transcription. See TranscriptStatus. |
type | TranscriptType | The current type of transcript (agent or user). See TranscriptType. |
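When rendering transcripts, successive updates for the same turn usually replace the previous text instead of adding a new row. The sketch below shows that pattern. It assumes that the pair of `turnId` and `type` identifies one row, which this reference does not state explicitly; `TranscriptList` is a hypothetical helper and the other declarations are local stand-ins for the SDK types.

```swift
// Stand-ins mirroring the enums and fields listed in this reference.
enum TranscriptStatus: Int {
    case inprogress = 0, end, interrupted
}

enum TranscriptType {
    case agent, user
}

struct Transcript {
    let turnId: Int
    let userId: String
    let text: String
    let status: TranscriptStatus
    let type: TranscriptType
}

// Ordered list of rows for the UI; an update replaces the row it matches.
struct TranscriptList {
    private(set) var items: [Transcript] = []

    mutating func apply(_ update: Transcript) {
        // Assumption: turnId plus type is enough to identify a row.
        if let index = items.firstIndex(where: {
            $0.turnId == update.turnId && $0.type == update.type
        }) {
            items[index] = update
        } else {
            items.append(update)
        }
    }
}
```

Calling `apply` from `onTranscriptUpdated` and reloading the affected row is then enough to show text as it streams in.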
ConversationalAIAPIConfig
Conversational AI API initialization configuration class.
Contains the configuration parameters required for Conversational AI API initialization, including rtcEngine for audio and video communication, rtmEngine for message communication, and transcript rendering mode settings.
| Parameter | Type | Description |
|---|---|---|
rtcEngine | AgoraRtcEngineKit? | Engine instance used for audio and video communication. See AgoraRtcEngineKit. |
rtmEngine | AgoraRtmClientKit? | Client instance for real-time messaging. See AgoraRtmClientKit. |
renderMode | TranscriptRenderMode | Transcript rendering mode. See TranscriptRenderMode. |
enableLog | Bool | Whether to enable verbose logging. |
ConversationalAIAPIError
Class for logging and communicating error information.
| Property | Type | Description |
|---|---|---|
type | ConversationalAIAPIErrorType | Error type. See ConversationalAIAPIErrorType. |
code | Int | Error code that identifies the specific error condition. |
message | String | Detailed description of the error. |
Enum classes
ChatMessageType
Message type enumeration.
Used to distinguish different message types within a session.
| Value | Description |
|---|---|
text | (0): Text message type. |
image | (1): Image message type. |
unknown | (2): Unknown message type. |
Priority
Controls the priority with which the agent processes messages.
| Value | Description |
|---|---|
interrupt | (0): High priority. The agent will immediately interrupt the current interaction and process the message. Suitable for urgent or time-sensitive messages. |
append | (1): Medium priority. The agent queues the message and processes it after the current interaction completes. Suitable for follow-up questions. |
ignore | (2): Low priority. The message is processed only if the agent is idle; if the agent is currently interacting, the message is discarded. Suitable for optional content. |
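The three priorities can be made concrete with a toy model. The sketch below simulates the behavior described in the table on the client side; it is not how the agent is implemented. `ToyAgentQueue` is a hypothetical type, the `Priority` enum is a local stand-in, and leaving queued messages untouched after an interrupt is an assumption, because the table does not say what happens to them.

```swift
// Stand-in mirroring the Priority values listed in this reference.
enum Priority: Int {
    case interrupt = 0, append, ignore
}

// Toy model: one message in progress plus a queue of waiting messages.
struct ToyAgentQueue {
    private(set) var current: String?
    private(set) var pending: [String] = []

    mutating func receive(_ text: String, priority: Priority) {
        // An idle agent handles any message immediately, whatever its priority.
        guard current != nil else {
            current = text
            return
        }
        switch priority {
        case .interrupt:
            current = text         // replaces the interaction in progress
        case .append:
            pending.append(text)   // waits until the current interaction ends
        case .ignore:
            break                  // discarded while the agent is busy
        }
    }

    // Marks the current interaction as finished and starts the next queued one.
    mutating func finishCurrent() {
        current = pending.isEmpty ? nil : pending.removeFirst()
    }
}
```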
AgentState
Represents the different states of the agent during the dialogue.
| Value | Description |
|---|---|
idle | (0): Idle state, the agent is not actively processing. |
silent | (1): Silent state, the agent remains silent but is ready to listen. |
listening | (2): Listening state, the agent is actively listening to user input. |
thinking | (3): Thinking state, the agent is processing user input and generating responses. |
speaking | (4): Speaking state, the agent is speaking or outputting audio content. |
unknown | (5): Unknown state, used for fallback processing of unrecognized states. |
ModuleType
Represents different types of AI modules for performance monitoring.
| Value | Description |
|---|---|
llm | (0): Large language model (LLM) inference. |
mllm | (1): Multimodal large language model (MLLM) inference. |
tts | (2): Text-to-speech synthesis. |
context | (3): Context module type. |
unknown | (4): Unknown module type. |
MessageType
Used to distinguish different types of messages in the session system.
| Value | Description |
|---|---|
metrics | Metrics message type. |
error | The error message type. |
assistant | AI agent transcript message type. |
user | User transcript message type. |
interrupt | The interrupt message type. |
state | Status message type. |
imageInfo | Image information message type. |
unknown | Unknown message type. |
TranscriptRenderMode
Transcript rendering mode.
| Value | Description |
|---|---|
words | (0): Word-by-word transcript rendering mode, updated every time a word is processed. |
text | (1): Sentence-by-sentence transcript rendering mode, updated when the complete sentence is ready. |
TranscriptType
Distinguishes the source of the transcribed text.
By identifying whether the transcribed text comes from an agent or a user, it helps manage the conversation flow and interface presentation.
| Value | Description |
|---|---|
agent | Transcript generated by the agent. Contains the responses and statements of the agent assistant, and is used to render the agent's voice in the conversational interface. |
user | Transcribed text from the user. Contains the text converted from the user's voice input, which is used to display the user's voice content in the conversation process. |
TranscriptStatus
Indicates the current state of the transcription in the conversation flow.
Used to track and manage the lifecycle status of transcripts.
| Value | Description |
|---|---|
inprogress | (0): Transcription in progress. This state is set while the transcript is being generated or played, indicating that the content is still being processed or streamed. |
end | (1): Transcription completed. This status is set when text generation ends normally, indicating the natural end of the transcript segment. |
interrupted | (2): Transcription interrupted. This status is set when text generation is stopped prematurely, for example when it is interrupted by a higher-priority message. |
ConversationalAIAPIErrorType
Used to distinguish different error types in conversational agent systems.
| Value | Description |
|---|---|
unknown | (0): Unknown error type. |
rtcError | (2): RTC (Real-Time Communication) related errors. |
rtmError | (3): RTM (Real-Time Messaging) related errors. |