
iOS toolkit API

The iOS toolkit API provides the following classes and methods.

ConversationalAIAPI class

| API | Description |
| --- | --- |
| addHandler | Adds an event handler to receive callbacks. |
| removeHandler | Removes an event handler. |
| subscribeMessage | Subscribes to channel messages. |
| unsubscribeMessage | Unsubscribes from channel messages. |
| interrupt | Interrupts the agent's current speech or task processing. |
| loadAudioSettings[1/2] | Sets audio best-practice parameters for optimal performance. |
| loadAudioSettings[2/2] | Sets audio best-practice parameters for a specific scenario. |
| destroy | Destroys the API instance and releases all resources. |

addHandler

Adds an event handler to receive event callbacks.

You can register a delegate to receive session events, state changes, and other notifications.


```swift
@objc func addHandler(handler: ConversationalAIAPIEventHandler)
```

| Parameter | Type | Description |
| --- | --- | --- |
| handler | ConversationalAIAPIEventHandler | Event handler; needs to implement the ConversationalAIAPIEventHandler protocol. See ConversationalAIAPIEventHandler. |
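
A minimal registration sketch. Here `api` is assumed to be an existing ConversationalAIAPI instance, and `MyEventHandler` is an illustrative class of your own that adopts the protocol:

```swift
class MyEventHandler: NSObject, ConversationalAIAPIEventHandler {
    func onAgentStateChanged(agentUserId: String, event: StateChangeEvent) {
        print("Agent \(agentUserId) is now \(event.state)")
    }
    func onAgentInterrupted(agentUserId: String, event: InterruptEvent) {}
    func onAgentMetrics(agentUserId: String, metrics: Metric) {}
    func onAgentError(agentUserId: String, error: ModuleError) {}
    func onTranscriptionUpdated(agentUserId: String, transcription: Transcription) {}
}

let handler = MyEventHandler()
api.addHandler(handler: handler)
```

Keep a reference to the handler for as long as you need callbacks, and pass the same instance to removeHandler when you are done.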

removeHandler

Removes an event handler.


```swift
@objc func removeHandler(handler: ConversationalAIAPIEventHandler)
```

| Parameter | Type | Description |
| --- | --- | --- |
| handler | ConversationalAIAPIEventHandler | The event handler to remove. See ConversationalAIAPIEventHandler. |

subscribeMessage

Subscribe to channel messages.

Sets channel parameters and registers message subscription callbacks. Call this method whenever the channel changes, typically when the agent starts.


```swift
@objc func subscribeMessage(channelName: String, completion: @escaping (ConversationalAIAPIError?) -> Void)
```

| Parameter | Type | Description |
| --- | --- | --- |
| channelName | String | The name of the channel to subscribe to. |
| completion | (ConversationalAIAPIError?) -> Void | A callback that returns error information when subscription fails. See ConversationalAIAPIError. |
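
A usage sketch, assuming `api` is an existing ConversationalAIAPI instance and `channelName` matches the channel the agent joined:

```swift
api.subscribeMessage(channelName: channelName) { error in
    if let error = error {
        // Subscription failed; inspect the error details.
        print("Subscribe failed: \(error.message) (code \(error.code))")
    } else {
        // A nil error indicates the subscription succeeded.
        print("Subscribed to messages on \(channelName)")
    }
}
```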

unsubscribeMessage

Unsubscribe from channel messages.

Call this method to stop receiving messages from the specified channel, for example after the connection with the agent ends.


```swift
@objc func unsubscribeMessage(channelName: String, completion: @escaping (ConversationalAIAPIError?) -> Void)
```

| Parameter | Type | Description |
| --- | --- | --- |
| channelName | String | The name of the channel to unsubscribe from. |
| completion | (ConversationalAIAPIError?) -> Void | The callback after the unsubscription operation completes. You can get error information through this callback. See ConversationalAIAPIError. |

interrupt

Interrupt the agent's current speech or task processing.

This method can be used to interrupt an agent that is currently speaking or processing a task.


```swift
@objc func interrupt(agentUserId: String, completion: @escaping (ConversationalAIAPIError?) -> Void)
```

info

If error is non-nil, the message failed to send. If error is nil, the message was sent successfully; however, this does not guarantee that the agent was actually interrupted.

| Parameter | Type | Description |
| --- | --- | --- |
| agentUserId | String | The RTM user ID of the agent; must be globally unique. |
| completion | (ConversationalAIAPIError?) -> Void | The callback when the operation completes. You can get the result or error information through its parameter. See ConversationalAIAPIError. |
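
For example ("agent_001" is a placeholder for your agent's RTM user ID):

```swift
api.interrupt(agentUserId: "agent_001") { error in
    if let error = error {
        // The interrupt message could not be sent.
        print("Interrupt failed: \(error.message)")
    } else {
        // The message was sent; per the note above, this does not
        // guarantee the agent was actually interrupted.
        print("Interrupt request sent")
    }
}
```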

loadAudioSettings[1/2]

Set audio best practice parameters for optimal performance.

Sets the audio parameters needed for optimal performance in agent conversations. By default, the .aiClient audio scenario is used.


```swift
@objc func loadAudioSettings()
```

info

To enable audio best practices, you must call this method before each joinChannel call.

Sample code:


```swift
// Set audio best practice parameters before joining the channel
api.loadAudioSettings() // Use default scenario

// Then join the channel
rtcEngine.joinChannel(byToken: token, channelId: channelName, info: nil, uid: userId)
```

loadAudioSettings[2/2]

Set audio best practice parameters for specific scenarios.

This method allows you to configure the audio parameters required for optimal performance in your agent conversations.


```swift
@objc func loadAudioSettings(scenario: AgoraAudioScenario)
```

info

If you need to enable audio best practices, you must call this method before each joinChannel call.

| Parameter | Type | Description |
| --- | --- | --- |
| scenario | AgoraAudioScenario | Audio scenario for optimal performance. |
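
A sketch that passes an explicit scenario instead of relying on the default. The .aiClient value shown is the default scenario mentioned earlier; substitute the scenario your app needs:

```swift
// Apply audio best-practice parameters for a specific scenario
api.loadAudioSettings(scenario: .aiClient)

// Then join the channel
rtcEngine.joinChannel(byToken: token, channelId: channelName, info: nil, uid: userId)
```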

destroy

Destroys the API instance and releases all resources.

Calling this method destroys the current API instance and releases all resources. After calling it, the instance cannot be used again. Call this method when you no longer need the API.


```swift
@objc func destroy()
```

ConversationalAIAPIEventHandler class

| API | Description |
| --- | --- |
| onAgentStateChanged | Callback when the agent state changes. |
| onAgentInterrupted | Callback triggered when an interrupt event occurs. |
| onAgentMetrics | Callback triggered when performance metrics are available. |
| onAgentError | Callback when an error occurs in an AI module. |
| onTranscriptionUpdated | Callback when transcription content is updated. |

onAgentStateChanged

Callback when the agent status changes.

This callback is triggered when the agent state changes, such as switching from idle to silent, listening, thinking, or speaking. You can use this callback to update the user interface or track the flow of the conversation.


```swift
@objc func onAgentStateChanged(agentUserId: String, event: StateChangeEvent)
```

| Parameter | Type | Description |
| --- | --- | --- |
| agentUserId | String | The RTM user ID of the agent. |
| event | StateChangeEvent | Agent state change event, including state, turn ID, timestamp, and reason. See StateChangeEvent. |
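
A sketch of a delegate implementation that maps the state to a UI label. Here `statusLabel` is illustrative, and dispatching to the main queue is a defensive assumption in case callbacks arrive on a background thread:

```swift
func onAgentStateChanged(agentUserId: String, event: StateChangeEvent) {
    DispatchQueue.main.async {
        switch event.state {
        case .listening: self.statusLabel.text = "Listening…"
        case .thinking:  self.statusLabel.text = "Thinking…"
        case .speaking:  self.statusLabel.text = "Speaking…"
        default:         self.statusLabel.text = ""
        }
    }
}
```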

onAgentInterrupted

The callback triggered when an interrupt event occurs.


```swift
@objc func onAgentInterrupted(agentUserId: String, event: InterruptEvent)
```

info

This callback is not necessarily synchronized with the agent's state, so it is not recommended to process business logic in this callback.

| Parameter | Type | Description |
| --- | --- | --- |
| agentUserId | String | The RTM user ID of the agent. |
| event | InterruptEvent | Interrupt event, including turn ID and timestamp. See InterruptEvent. |

onAgentMetrics

A callback that is triggered when performance metrics are available.

This callback provides performance data, such as LLM inference latency and TTS speech synthesis latency, for monitoring system performance.


```swift
@objc func onAgentMetrics(agentUserId: String, metrics: Metric)
```

info

This performance indicator callback is not necessarily synchronized with the agent's state, so it is not recommended to process business logic in this callback.

| Parameter | Type | Description |
| --- | --- | --- |
| agentUserId | String | The RTM user ID of the agent. |
| metrics | Metric | Performance metric, including type, value, and timestamp. See Metric. |

onAgentError

Callback when an error occurs in the AI module.

This callback is called when an error occurs in a module component (such as LLM, TTS, etc.). It can be used for error monitoring, logging, and implementing service degradation strategies.


```swift
@objc func onAgentError(agentUserId: String, error: ModuleError)
```

info

This callback is not necessarily synchronized with the state of the agent, so it is not recommended to process business logic in this callback.

| Parameter | Type | Description |
| --- | --- | --- |
| agentUserId | String | The RTM user ID of the agent. |
| error | ModuleError | Module error, including type, error code, error message, and timestamp. See ModuleError. |

onTranscriptionUpdated

Transcription content update callback.

This callback is triggered when the speech transcription content in the session is updated.


```swift
@objc func onTranscriptionUpdated(agentUserId: String, transcription: Transcription)
```

| Parameter | Type | Description |
| --- | --- | --- |
| agentUserId | String | The RTM user ID of the agent. |
| transcription | Transcription | Transcript data, including text content, status, and metadata. See Transcription. |
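
One way to consume this callback is to key transcripts by turn and user so that streamed updates replace earlier partial text. The `transcripts` dictionary and `reloadTranscriptView` method are illustrative:

```swift
var transcripts: [String: Transcription] = [:]

func onTranscriptionUpdated(agentUserId: String, transcription: Transcription) {
    DispatchQueue.main.async {
        // One entry per (turn, user); later updates overwrite partial text.
        let key = "\(transcription.turnId)-\(transcription.userId)"
        self.transcripts[key] = transcription
        self.reloadTranscriptView()
    }
}
```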

Structures

StateChangeEvent

Indicates an agent state change event, including complete state information and timestamp.

Used to track session flow and update UI status indicators.


```swift
@objc public class StateChangeEvent: NSObject {
    @objc public let state: AgentState
    @objc public let turnId: Int
    @objc public let timestamp: TimeInterval
    @objc public let reason: String

    @objc public init(state: AgentState, turnId: Int, timestamp: TimeInterval, reason: String)

    public override var description: String {
        return "StateChangeEvent(state: \(state), turnId: \(turnId), timestamp: \(timestamp), reason: \(reason))"
    }
}
```

| Parameter | Type | Description |
| --- | --- | --- |
| state | AgentState | Current agent state. See AgentState. |
| turnId | Int | Conversation turn ID, used to identify a specific turn. |
| timestamp | TimeInterval | Timestamp in milliseconds since the Unix epoch (January 1, 1970 UTC). |
| reason | String | The reason for the state change. |

InterruptEvent

Represents an agent interruption event.

It is usually triggered when the user interrupts the agent's speech or the system detects a high-priority message. Use it to record interruption behavior and respond accordingly.


```swift
@objc public class InterruptEvent: NSObject {
    @objc public let turnId: Int
    @objc public let timestamp: TimeInterval

    @objc public init(turnId: Int, timestamp: TimeInterval) {
        self.turnId = turnId
        self.timestamp = timestamp
    }

    public override var description: String {
        return "InterruptEvent(turnId: \(turnId), timestamp: \(timestamp))"
    }
}
```

| Parameter | Type | Description |
| --- | --- | --- |
| turnId | Int | The ID of the interrupted conversation turn. |
| timestamp | TimeInterval | Timestamp in milliseconds since the Unix epoch (January 1, 1970 UTC). |

Metric

Used to record and transmit system performance data.

For example, LLM inference delay, TTS synthesis delay, etc. This data can be used for performance monitoring, system optimization, and user experience improvement.


```swift
@objc public class Metric: NSObject {
    @objc public let type: ModuleType
    @objc public let name: String
    @objc public let value: Double
    @objc public let timestamp: TimeInterval

    @objc public init(type: ModuleType, name: String, value: Double, timestamp: TimeInterval)

    public override var description: String {
        return "Metric(type: \(type.stringValue), name: \(name), value: \(value), timestamp: \(timestamp))"
    }
}
```

| Parameter | Type | Description |
| --- | --- | --- |
| type | ModuleType | Metric type. See ModuleType. |
| name | String | The name of the specific performance item. |
| value | Double | Metric value, usually latency in milliseconds or another quantitative measure. |
| timestamp | TimeInterval | Timestamp in milliseconds since the Unix epoch (January 1, 1970 UTC). |

ModuleError

Used to process and report AI module related error information.

Contains error type, error code, error description, and timestamp to facilitate error monitoring, logging, and troubleshooting.


```swift
@objc public class ModuleError: NSObject {
    @objc public let type: ModuleType
    @objc public let code: Int
    @objc public let message: String
    @objc public let timestamp: TimeInterval

    @objc public init(type: ModuleType, code: Int, message: String, timestamp: TimeInterval) {
        self.type = type
        self.code = code
        self.message = message
        self.timestamp = timestamp
    }

    public override var description: String {
        return "ModuleError(type: \(type.stringValue), code: \(code), message: \(message), timestamp: \(timestamp))"
    }
}
```

| Parameter | Type | Description |
| --- | --- | --- |
| type | ModuleType | Error type. See ModuleType. |
| code | Int | Error code identifying the specific error condition. |
| message | String | Error description detailing the error. |
| timestamp | TimeInterval | Timestamp in milliseconds since the Unix epoch (January 1, 1970 UTC). |

Transcription

Used to represent a user-visible transcription message.

Complete data class for rendering transcripts at the UI level.


```swift
@objc public class Transcription: NSObject {
    @objc public let turnId: Int
    @objc public let userId: String
    @objc public let text: String
    @objc public var status: TranscriptionStatus
    @objc public var type: TranscriptionType
}
```

| Parameter | Type | Description |
| --- | --- | --- |
| turnId | Int | Unique identifier for the conversation turn. |
| userId | String | The user identifier associated with this transcript. |
| text | String | The actual transcript text content. |
| status | TranscriptionStatus | The current status of the transcription. See TranscriptionStatus. |
| type | TranscriptionType | The source type of the transcription (agent or user). See TranscriptionType. |

ConversationalAIAPIConfig

Conversational AI API initialization configuration class.

Contains the configuration parameters required for Conversational AI API initialization, including rtcEngine for audio and video communication, rtmEngine for message communication, and transcription rendering mode settings.


```swift
@objc public class ConversationalAIAPIConfig: NSObject {
    @objc public weak var rtcEngine: AgoraRtcEngineKit?
    @objc public weak var rtmEngine: AgoraRtmClientKit?
    @objc public var renderMode: TranscriptionRenderMode
    @objc public var enableLog: Bool
}
```

| Parameter | Type | Description |
| --- | --- | --- |
| rtcEngine | AgoraRtcEngineKit? | Engine instance used for audio and video communication. See AgoraRtcEngineKit. |
| rtmEngine | AgoraRtmClientKit? | Client instance used for real-time messaging. See AgoraRtmClientKit. |
| renderMode | TranscriptionRenderMode | Transcription rendering mode. See TranscriptionRenderMode. |
| enableLog | Bool | Whether to enable verbose logging: true: enable verbose logging; false: (default) disable logging. |
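
A construction sketch, assuming an initializer that accepts these fields (check the initializer exposed by your SDK version); `rtcEngine` and `rtmEngine` are previously created engine instances:

```swift
let config = ConversationalAIAPIConfig(
    rtcEngine: rtcEngine,   // existing AgoraRtcEngineKit instance
    rtmEngine: rtmEngine,   // existing AgoraRtmClientKit instance
    renderMode: .words,     // word-by-word transcription rendering
    enableLog: true         // enable verbose logging
)
```

Note that the engine references are weak; the config does not keep the engines alive, so retain them elsewhere in your app.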

ConversationalAIAPIError

Class for logging and communicating error information.


```swift
@objc public class ConversationalAIAPIError: NSObject {
    @objc public let type: ConversationalAIAPIErrorType
    @objc public let code: Int
    @objc public let message: String

    @objc public init(type: ConversationalAIAPIErrorType, code: Int, message: String) {
        self.type = type
        self.code = code
        self.message = message
    }

    public override var description: String {
        return "ConversationalAIAPIError(type: \(type), code: \(code), message: \(message))"
    }
}
```

| Property | Type | Description |
| --- | --- | --- |
| type | ConversationalAIAPIErrorType | Error type. See ConversationalAIAPIErrorType. |
| code | Int | Error code identifying the specific error condition. |
| message | String | Error description detailing the error. |

Enum classes

Priority

Controls the priority with which the agent processes messages.

| Value | Description |
| --- | --- |
| interrupt (0) | High priority. The agent immediately interrupts the current interaction and processes the message. Suitable for urgent or time-sensitive messages. |
| append (1) | Medium priority. The agent queues the message and processes it after the current interaction completes. Suitable for follow-up questions. |
| ignore (2) | Low priority. If the agent is currently interacting, the message is discarded; it is only processed when the agent is idle. Suitable for optional content. |

AgentState

Represents the different states of the agent during the dialogue.

| Value | Description |
| --- | --- |
| idle (0) | Idle state; the agent is not actively processing. |
| silent (1) | Silent state; the agent remains silent but is ready to listen. |
| listening (2) | Listening state; the agent is actively listening to user input. |
| thinking (3) | Thinking state; the agent is processing user input and generating a response. |
| speaking (4) | Speaking state; the agent is speaking or outputting audio content. |
| unknown (5) | Unknown state; used as a fallback for unrecognized states. |

ModuleType

Represents different types of AI modules for performance monitoring.

| Value | Description |
| --- | --- |
| llm (0) | Large language model inference. |
| mllm (1) | Multimodal large language model inference. |
| tts (2) | Text-to-speech synthesis. |
| unknown (3) | Unknown module type. |

MessageType

Used to distinguish different types of messages in the session system.

| Value | Description |
| --- | --- |
| metrics | Metrics message type. |
| error | Error message type. |
| user | User transcription message type. |
| interrupt | Interrupt message type. |
| state | State message type. |
| unknown | Unknown message type. |

TranscriptionRenderMode

Transcript rendering mode.

| Value | Description |
| --- | --- |
| words (0) | Word-by-word rendering; the transcript updates each time a word is processed. |
| text (1) | Sentence-by-sentence rendering; the transcript updates when a complete sentence is ready. |

TranscriptionType

Distinguish the source type of the transcribed text.

By identifying whether the transcribed text comes from an agent or a user, it helps manage the conversation flow and interface presentation.

| Value | Description |
| --- | --- |
| agent | Transcript generated by the agent. Usually contains the agent's responses and utterances; used to render the agent's speech in the conversation interface. |
| user | Transcript of the user's speech. Contains text converted from the user's voice input; used to display the user's speech in the conversation flow. |

TranscriptionStatus

Indicates the current state of the transcription in the conversation flow.

Used to track and manage the lifecycle status of transcripts.

| Value | Description |
| --- | --- |
| inprogress (0) | Transcription in progress. Set while the transcript is being generated or played, indicating that content is still being processed or streamed. |
| end (1) | Transcription completed. Set when text generation ends normally, marking the natural end of the transcription segment. |
| interrupted (2) | Transcription interrupted. Set when text generation is stopped prematurely, for example when a higher-priority message interrupts it. |
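
These statuses typically drive rendering decisions. A sketch of mapping status to label styling (the `label` and color choices are illustrative):

```swift
switch transcription.status {
case .inprogress:
    label.textColor = .secondaryLabel                    // still streaming
case .end:
    label.textColor = .label                             // finalized text
case .interrupted:
    label.text = transcription.text + " (interrupted)"   // mark cut-off turns
}
```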

ConversationalAIAPIErrorType

Used to distinguish different error types in conversational agent systems.

| Value | Description |
| --- | --- |
| unknown (0) | Unknown error type. |
| rtcError (2) | RTC (real-time communication) related error. |
| rtmError (3) | RTM (real-time messaging) related error. |