
iOS toolkit API

The iOS toolkit API provides the following classes and methods.

ConversationalAIAPI class

| API | Description |
| --- | --- |
| addHandler | Adds an event handler to receive callbacks. |
| removeHandler | Removes an event handler. |
| subscribeMessage | Subscribes to channel messages. |
| unsubscribeMessage | Unsubscribes from channel messages. |
| interrupt | Interrupts the agent's current speech or task processing. |
| loadAudioSettings[1/2] | Sets audio best-practice parameters for optimal performance. |
| loadAudioSettings[2/2] | Sets audio best-practice parameters for a specific scenario. |
| destroy | Destroys the API instance and releases all resources. |

addHandler

Adds an event handler to receive event callbacks.

You can register a delegate to receive session events, state changes, and other notifications.


```swift
@objc func addHandler(handler: ConversationalAIAPIEventHandler)
```

| Parameter | Type | Description |
| --- | --- | --- |
| handler | ConversationalAIAPIEventHandler | Event handler; needs to implement the ConversationalAIAPIEventHandler protocol. See ConversationalAIAPIEventHandler. |
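
A minimal registration sketch. Here `api` is assumed to be an existing ConversationalAIAPI instance, and `MyEventHandler` is an illustrative class of your own that adopts the protocol:

```swift
class MyEventHandler: NSObject, ConversationalAIAPIEventHandler {
    func onAgentStateChanged(agentUserId: String, event: StateChangeEvent) {
        print("Agent \(agentUserId) is now \(event.state)")
    }
    func onAgentInterrupted(agentUserId: String, event: InterruptEvent) {}
    func onAgentMetrics(agentUserId: String, metrics: Metric) {}
    func onAgentError(agentUserId: String, error: ModuleError) {}
    func onTranscriptionUpdated(agentUserId: String, transcription: Transcription) {}
}

let handler = MyEventHandler()
api.addHandler(handler: handler)
```

Keep a reference to the handler for as long as you need callbacks, and pass the same instance to removeHandler when you are done.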

removeHandler

Removes an event handler.


```swift
@objc func removeHandler(handler: ConversationalAIAPIEventHandler)
```

| Parameter | Type | Description |
| --- | --- | --- |
| handler | ConversationalAIAPIEventHandler | The event handler to remove. See ConversationalAIAPIEventHandler. |

subscribeMessage

Subscribe to channel messages.

Sets channel parameters and registers message subscription callbacks. Call this method whenever the channel changes, typically when the agent starts.


```swift
@objc func subscribeMessage(channelName: String, completion: @escaping (ConversationalAIAPIError?) -> Void)
```

| Parameter | Type | Description |
| --- | --- | --- |
| channelName | String | The name of the channel to subscribe to. |
| completion | (ConversationalAIAPIError?) -> Void | A callback that returns error information when subscription fails. See ConversationalAIAPIError. |
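
A usage sketch, assuming `api` is an existing ConversationalAIAPI instance and `channelName` matches the channel the agent joined:

```swift
api.subscribeMessage(channelName: channelName) { error in
    if let error = error {
        // Subscription failed; inspect the error details.
        print("Subscribe failed: \(error.message) (code \(error.code))")
    } else {
        // A nil error indicates the subscription succeeded.
        print("Subscribed to messages on \(channelName)")
    }
}
```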

unsubscribeMessage

Unsubscribe from channel messages.

Call this method to stop receiving messages from the specified channel, for example after the connection with the agent ends.


```swift
@objc func unsubscribeMessage(channelName: String, completion: @escaping (ConversationalAIAPIError?) -> Void)
```

| Parameter | Type | Description |
| --- | --- | --- |
| channelName | String | The name of the channel to unsubscribe from. |
| completion | (ConversationalAIAPIError?) -> Void | The callback after the unsubscription operation completes. You can get error information through this callback. See ConversationalAIAPIError. |

interrupt

Interrupt the agent's current speech or task processing.

This method can be used to interrupt an agent that is currently speaking or processing a task.


```swift
@objc func interrupt(agentUserId: String, completion: @escaping (ConversationalAIAPIError?) -> Void)
```

info

If error is non-nil, the message failed to send. If error is nil, the message was sent successfully; however, this does not guarantee that the agent was actually interrupted.

| Parameter | Type | Description |
| --- | --- | --- |
| agentUserId | String | The RTM user ID of the agent; must be globally unique. |
| completion | (ConversationalAIAPIError?) -> Void | The callback when the operation completes. You can get the result or error information through its parameter. See ConversationalAIAPIError. |
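
For example ("agent_001" is a placeholder for your agent's RTM user ID):

```swift
api.interrupt(agentUserId: "agent_001") { error in
    if let error = error {
        // The interrupt message could not be sent.
        print("Interrupt failed: \(error.message)")
    } else {
        // The message was sent; per the note above, this does not
        // guarantee the agent was actually interrupted.
        print("Interrupt request sent")
    }
}
```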

loadAudioSettings[1/2]

Set audio best practice parameters for optimal performance.

Sets the audio parameters needed for optimal performance in agent conversations. By default, the .aiClient audio scenario is used.


```swift
@objc func loadAudioSettings()
```

info

To enable audio best practices, you must call this method before each joinChannel call.

Sample code:


```swift
// Set audio best practice parameters before joining the channel
api.loadAudioSettings() // Use default scenario

// Then join the channel
rtcEngine.joinChannel(byToken: token, channelId: channelName, info: nil, uid: userId)
```

loadAudioSettings[2/2]

Set audio best practice parameters for specific scenarios.

This method allows you to configure the audio parameters required for optimal performance in your agent conversations.


```swift
@objc func loadAudioSettings(scenario: AgoraAudioScenario)
```

info

If you need to enable audio best practices, you must call this method before each joinChannel call.

| Parameter | Type | Description |
| --- | --- | --- |
| scenario | AgoraAudioScenario | Audio scenario for optimal performance. |
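
A sketch that passes an explicit scenario instead of relying on the default. The .aiClient value shown is the default scenario mentioned earlier; substitute the scenario your app needs:

```swift
// Apply audio best-practice parameters for a specific scenario
api.loadAudioSettings(scenario: .aiClient)

// Then join the channel
rtcEngine.joinChannel(byToken: token, channelId: channelName, info: nil, uid: userId)
```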

destroy

Destroys the API instance and releases all resources.

Calling this method destroys the current API instance and releases all resources. After calling it, the instance cannot be used again. Call this method when you no longer need the API.


```swift
@objc func destroy()
```

ConversationalAIAPIEventHandler class

| API | Description |
| --- | --- |
| onAgentStateChanged | Callback when the agent state changes. |
| onAgentInterrupted | Callback triggered when an interrupt event occurs. |
| onAgentMetrics | Callback triggered when performance metrics are available. |
| onAgentError | Callback when an error occurs in an AI module. |
| onTranscriptionUpdated | Callback when transcription content is updated. |

onAgentStateChanged

Callback when the agent status changes.

This callback is triggered when the agent state changes, such as switching from idle to silent, listening, thinking, or speaking. You can use this callback to update the user interface or track the flow of the conversation.


```swift
@objc func onAgentStateChanged(agentUserId: String, event: StateChangeEvent)
```

| Parameter | Type | Description |
| --- | --- | --- |
| agentUserId | String | The RTM user ID of the agent. |
| event | StateChangeEvent | Agent state change event, including state, turn ID, timestamp, and reason. See StateChangeEvent. |
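
A sketch of a delegate implementation that maps the state to a UI label. Here `statusLabel` is illustrative, and dispatching to the main queue is a defensive assumption in case callbacks arrive on a background thread:

```swift
func onAgentStateChanged(agentUserId: String, event: StateChangeEvent) {
    DispatchQueue.main.async {
        switch event.state {
        case .listening: self.statusLabel.text = "Listening…"
        case .thinking:  self.statusLabel.text = "Thinking…"
        case .speaking:  self.statusLabel.text = "Speaking…"
        default:         self.statusLabel.text = ""
        }
    }
}
```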

onAgentInterrupted

The callback triggered when an interrupt event occurs.


```swift
@objc func onAgentInterrupted(agentUserId: String, event: InterruptEvent)
```

info

This callback is not necessarily synchronized with the agent's state, so it is not recommended to process business logic in this callback.

| Parameter | Type | Description |
| --- | --- | --- |
| agentUserId | String | The RTM user ID of the agent. |
| event | InterruptEvent | Interrupt event, including turn ID and timestamp. See InterruptEvent. |

onAgentMetrics

A callback that is triggered when performance metrics are available.

This callback provides performance data, such as LLM inference latency and TTS speech synthesis latency, for monitoring system performance.


```swift
@objc func onAgentMetrics(agentUserId: String, metrics: Metric)
```

info

This performance indicator callback is not necessarily synchronized with the agent's state, so it is not recommended to process business logic in this callback.

| Parameter | Type | Description |
| --- | --- | --- |
| agentUserId | String | The RTM user ID of the agent. |
| metrics | Metric | Performance metric, including type, value, and timestamp. See Metric. |

onAgentError

Callback when an error occurs in the AI module.

This callback is called when an error occurs in a module component (such as LLM, TTS, etc.). It can be used for error monitoring, logging, and implementing service degradation strategies.


```swift
@objc func onAgentError(agentUserId: String, error: ModuleError)
```

info

This callback is not necessarily synchronized with the state of the agent, so it is not recommended to process business logic in this callback.

| Parameter | Type | Description |
| --- | --- | --- |
| agentUserId | String | The RTM user ID of the agent. |
| error | ModuleError | Module error, including type, error code, error message, and timestamp. See ModuleError. |

onTranscriptionUpdated

Transcription content update callback.

This callback is triggered when the speech transcription content in the session is updated.


```swift
@objc func onTranscriptionUpdated(agentUserId: String, transcription: Transcription)
```

| Parameter | Type | Description |
| --- | --- | --- |
| agentUserId | String | The RTM user ID of the agent. |
| transcription | Transcription | Transcript data, including text content, status, and metadata. See Transcription. |
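
One way to consume this callback is to key transcripts by turn and user so that streamed updates replace earlier partial text. The `transcripts` dictionary and `reloadTranscriptView` method are illustrative:

```swift
var transcripts: [String: Transcription] = [:]

func onTranscriptionUpdated(agentUserId: String, transcription: Transcription) {
    DispatchQueue.main.async {
        // One entry per (turn, user); later updates overwrite partial text.
        let key = "\(transcription.turnId)-\(transcription.userId)"
        self.transcripts[key] = transcription
        self.reloadTranscriptView()
    }
}
```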

Structures

StateChangeEvent

Indicates an agent state change event, including complete state information and timestamp.

Used to track session flow and update UI status indicators.


```swift
@objc public class StateChangeEvent: NSObject {
    @objc public let state: AgentState
    @objc public let turnId: Int
    @objc public let timestamp: TimeInterval
    @objc public let reason: String

    @objc public init(state: AgentState, turnId: Int, timestamp: TimeInterval, reason: String)

    public override var description: String {
        return "StateChangeEvent(state: \(state), turnId: \(turnId), timestamp: \(timestamp), reason: \(reason))"
    }
}
```

| Parameter | Type | Description |
| --- | --- | --- |
| state | AgentState | Current agent state. See AgentState. |
| turnId | Int | Conversation turn ID, used to identify a specific turn. |
| timestamp | TimeInterval | Timestamp in milliseconds since the Unix epoch (January 1, 1970 UTC). |
| reason | String | The reason for the state change. |

InterruptEvent

Represents an agent interruption event.

It is usually triggered when the user interrupts the agent's speech or the system detects a high-priority message. Use it to record interruption behavior and respond accordingly.


```swift
@objc public class InterruptEvent: NSObject {
    @objc public let turnId: Int
    @objc public let timestamp: TimeInterval

    @objc public init(turnId: Int, timestamp: TimeInterval) {
        self.turnId = turnId
        self.timestamp = timestamp
    }

    public override var description: String {
        return "InterruptEvent(turnId: \(turnId), timestamp: \(timestamp))"
    }
}
```

| Parameter | Type | Description |
| --- | --- | --- |
| turnId | Int | The ID of the interrupted conversation turn. |
| timestamp | TimeInterval | Timestamp in milliseconds since the Unix epoch (January 1, 1970 UTC). |

Metric

Used to record and transmit system performance data.

For example, LLM inference delay, TTS synthesis delay, etc. This data can be used for performance monitoring, system optimization, and user experience improvement.


```swift
@objc public class Metric: NSObject {
    @objc public let type: ModuleType
    @objc public let name: String
    @objc public let value: Double
    @objc public let timestamp: TimeInterval

    @objc public init(type: ModuleType, name: String, value: Double, timestamp: TimeInterval)

    public override var description: String {
        return "Metric(type: \(type.stringValue), name: \(name), value: \(value), timestamp: \(timestamp))"
    }
}
```

| Parameter | Type | Description |
| --- | --- | --- |
| type | ModuleType | Metric type. See ModuleType. |
| name | String | The name of the specific performance item. |
| value | Double | Metric value, usually latency in milliseconds or another quantitative measure. |
| timestamp | TimeInterval | Timestamp in milliseconds since the Unix epoch (January 1, 1970 UTC). |

ModuleError

Used to process and report AI module related error information.

Contains error type, error code, error description, and timestamp to facilitate error monitoring, logging, and troubleshooting.


```swift
@objc public class ModuleError: NSObject {
    @objc public let type: ModuleType
    @objc public let code: Int
    @objc public let message: String
    @objc public let timestamp: TimeInterval

    @objc public init(type: ModuleType, code: Int, message: String, timestamp: TimeInterval) {
        self.type = type
        self.code = code
        self.message = message
        self.timestamp = timestamp
    }

    public override var description: String {
        return "ModuleError(type: \(type.stringValue), code: \(code), message: \(message), timestamp: \(timestamp))"
    }
}
```

| Parameter | Type | Description |
| --- | --- | --- |
| type | ModuleType | Error type. See ModuleType. |
| code | Int | Error code identifying the specific error condition. |
| message | String | Error description detailing the error. |
| timestamp | TimeInterval | Timestamp in milliseconds since the Unix epoch (January 1, 1970 UTC). |

Transcription

Used to represent a user-visible transcription message.

Complete data class for rendering transcripts at the UI level.


```swift
@objc public class Transcription: NSObject {
    @objc public let turnId: Int
    @objc public let userId: String
    @objc public let text: String
    @objc public var status: TranscriptionStatus
    @objc public var type: TranscriptionType
}
```

| Parameter | Type | Description |
| --- | --- | --- |
| turnId | Int | Unique identifier for the conversation turn. |
| userId | String | The user identifier associated with this transcript. |
| text | String | The actual transcript text content. |
| status | TranscriptionStatus | The current status of the transcription. See TranscriptionStatus. |
| type | TranscriptionType | The source type of the transcription (agent or user). See TranscriptionType. |

ConversationalAIAPIConfig

Conversational AI API initialization configuration class.

Contains the configuration parameters required for Conversational AI API initialization, including rtcEngine for audio and video communication, rtmEngine for message communication, and transcription rendering mode settings.


```swift
@objc public class ConversationalAIAPIConfig: NSObject {
    @objc public weak var rtcEngine: AgoraRtcEngineKit?
    @objc public weak var rtmEngine: AgoraRtmClientKit?
    @objc public var renderMode: TranscriptionRenderMode
    @objc public var enableLog: Bool
}
```

| Parameter | Type | Description |
| --- | --- | --- |
| rtcEngine | AgoraRtcEngineKit? | Engine instance used for audio and video communication. See AgoraRtcEngineKit. |
| rtmEngine | AgoraRtmClientKit? | Client instance used for real-time messaging. See AgoraRtmClientKit. |
| renderMode | TranscriptionRenderMode | Transcription rendering mode. See TranscriptionRenderMode. |
| enableLog | Bool | Whether to enable verbose logging: true: enable verbose logging; false: (default) disable logging. |
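
A construction sketch, assuming an initializer that accepts these fields (check the initializer exposed by your SDK version); `rtcEngine` and `rtmEngine` are previously created engine instances:

```swift
let config = ConversationalAIAPIConfig(
    rtcEngine: rtcEngine,   // existing AgoraRtcEngineKit instance
    rtmEngine: rtmEngine,   // existing AgoraRtmClientKit instance
    renderMode: .words,     // word-by-word transcription rendering
    enableLog: true         // enable verbose logging
)
```

Note that the engine references are weak; the config does not keep the engines alive, so retain them elsewhere in your app.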

ConversationalAIAPIError

Class for logging and communicating error information.


```swift
@objc public class ConversationalAIAPIError: NSObject {
    @objc public let type: ConversationalAIAPIErrorType
    @objc public let code: Int
    @objc public let message: String

    @objc public init(type: ConversationalAIAPIErrorType, code: Int, message: String) {
        self.type = type
        self.code = code
        self.message = message
    }

    public override var description: String {
        return "ConversationalAIAPIError(type: \(type), code: \(code), message: \(message))"
    }
}
```

| Property | Type | Description |
| --- | --- | --- |
| type | ConversationalAIAPIErrorType | Error type. See ConversationalAIAPIErrorType. |
| code | Int | Error code identifying the specific error condition. |
| message | String | Error description detailing the error. |

Enum classes

Priority

Controls the priority with which the agent processes messages.

| Value | Description |
| --- | --- |
| interrupt (0) | High priority. The agent immediately interrupts the current interaction and processes the message. Suitable for urgent or time-sensitive messages. |
| append (1) | Medium priority. The agent queues the message and processes it after the current interaction completes. Suitable for follow-up questions. |
| ignore (2) | Low priority. If the agent is currently interacting, the message is discarded; it is only processed when the agent is idle. Suitable for optional content. |

AgentState

Represents the different states of the agent during the dialogue.

| Value | Description |
| --- | --- |
| idle (0) | Idle state; the agent is not actively processing. |
| silent (1) | Silent state; the agent remains silent but is ready to listen. |
| listening (2) | Listening state; the agent is actively listening to user input. |
| thinking (3) | Thinking state; the agent is processing user input and generating a response. |
| speaking (4) | Speaking state; the agent is speaking or outputting audio content. |
| unknown (5) | Unknown state; used as a fallback for unrecognized states. |

ModuleType

Represents different types of AI modules for performance monitoring.

| Value | Description |
| --- | --- |
| llm (0) | Large language model inference. |
| mllm (1) | Multimodal large language model inference. |
| tts (2) | Text-to-speech synthesis. |
| unknown (3) | Unknown module type. |

MessageType

Used to distinguish different types of messages in the session system.

| Value | Description |
| --- | --- |
| metrics | Metrics message type. |
| error | Error message type. |
| user | User transcription message type. |
| interrupt | Interrupt message type. |
| state | State message type. |
| unknown | Unknown message type. |

TranscriptionRenderMode

Transcript rendering mode.

| Value | Description |
| --- | --- |
| words (0) | Word-by-word rendering; the transcript updates each time a word is processed. |
| text (1) | Sentence-by-sentence rendering; the transcript updates when a complete sentence is ready. |

TranscriptionType

Distinguish the source type of the transcribed text.

By identifying whether the transcribed text comes from an agent or a user, it helps manage the conversation flow and interface presentation.

| Value | Description |
| --- | --- |
| agent | Transcript generated by the agent. Usually contains the agent's responses and utterances; used to render the agent's speech in the conversation interface. |
| user | Transcript of the user's speech. Contains text converted from the user's voice input; used to display the user's speech in the conversation flow. |

TranscriptionStatus

Indicates the current state of the transcription in the conversation flow.

Used to track and manage the lifecycle status of transcripts.

| Value | Description |
| --- | --- |
| inprogress (0) | Transcription in progress. Set while the transcript is being generated or played, indicating that content is still being processed or streamed. |
| end (1) | Transcription completed. Set when text generation ends normally, marking the natural end of the transcription segment. |
| interrupted (2) | Transcription interrupted. Set when text generation is stopped prematurely, for example when a higher-priority message interrupts it. |
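
These statuses typically drive rendering decisions. A sketch of mapping status to label styling (the `label` and color choices are illustrative):

```swift
switch transcription.status {
case .inprogress:
    label.textColor = .secondaryLabel                    // still streaming
case .end:
    label.textColor = .label                             // finalized text
case .interrupted:
    label.text = transcription.text + " (interrupted)"   // mark cut-off turns
}
```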

ConversationalAIAPIErrorType

Used to distinguish different error types in conversational agent systems.

| Value | Description |
| --- | --- |
| unknown (0) | Unknown error type. |
| rtcError (2) | RTC (real-time communication) related error. |
| rtmError (3) | RTM (real-time messaging) related error. |