
TypeScript SDK API reference

Full API reference for the Agora Conversational AI TypeScript SDK.

AgoraClient

AgoraClient extends the Fern-generated base client with regional domain pool support and three authentication modes.


import { AgoraClient, Area } from 'agora-agent-sdk';

Constructor


new AgoraClient(options: AgoraClient.Options)

The authentication mode is resolved automatically from the options you provide.

| Option | Type | Required | Description |
| --- | --- | --- | --- |
| area | Area | Yes | Region for API routing |
| appId | string | Yes | Agora App ID |
| appCertificate | string | Yes | Agora App Certificate. Keep this secret and never expose it client-side |
| customerId | string | No | Customer ID for Basic Auth |
| customerSecret | string | No | Customer Secret for Basic Auth |
| authToken | string | No | Pre-built agora token=<value> string |
| timeout | number | No | Request timeout in milliseconds |
| maxRetries | number | No | Maximum retry attempts |
| fetch | typeof fetch | No | Custom fetch implementation for unsupported runtimes |

The mode is resolved as follows:

| Options provided | Resolved authMode |
| --- | --- |
| customerId + customerSecret | "basic" |
| authToken | "token" |
| Neither | "app-credentials" |

See Authentication for details on each mode.
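
The resolution table above can be read as a small decision function. The sketch below is not the SDK's implementation: AuthOptions and resolveAuthMode are hypothetical names, and checking for Basic Auth before authToken is an assumption based only on the order of the table rows.

```typescript
// Hypothetical stand-ins for the documented option names; not the SDK's own types.
type AuthMode = "app-credentials" | "token" | "basic";

interface AuthOptions {
  customerId?: string;
  customerSecret?: string;
  authToken?: string;
}

// Mirrors the table: Basic Auth needs both customer fields, then authToken,
// otherwise fall back to app credentials. The precedence order is an assumption.
function resolveAuthMode(options: AuthOptions): AuthMode {
  if (options.customerId && options.customerSecret) return "basic";
  if (options.authToken) return "token";
  return "app-credentials";
}

console.log(resolveAuthMode({ customerId: "id", customerSecret: "secret" })); // basic
console.log(resolveAuthMode({ authToken: "agora token=abc" })); // token
console.log(resolveAuthMode({})); // app-credentials
```

On a real client, read the resolved value from client.authMode rather than recomputing it.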

Properties

The following read-only properties are available on any AgoraClient instance.

| Property | Type | Description |
| --- | --- | --- |
| appId | string | The Agora App ID |
| appCertificate | string | The Agora App Certificate |
| authMode | AgoraAuthMode | The resolved authentication mode |
| pool | Pool | The underlying domain pool instance used for regional routing |

Methods

The following methods are available in addition to the Fern-generated sub-client methods.

nextRegion()

Cycles to the next region prefix in the domain pool. Call this after a request failure to try a different regional endpoint.


client.nextRegion();
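
One way to use nextRegion() is inside a small retry loop. The sketch below assumes only that the client exposes nextRegion(); RegionSwitcher and withRegionFailover are hypothetical names, and the fake client exists only so the example runs without network access.

```typescript
// Minimal shape of the client used here; only nextRegion() comes from this reference.
interface RegionSwitcher {
  nextRegion(): void;
}

// Hypothetical helper: run `request`, and on failure rotate the region and retry.
async function withRegionFailover<T>(
  client: RegionSwitcher,
  request: () => Promise<T>,
  maxAttempts = 3,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await request();
    } catch (err) {
      lastError = err;
      // Skip the rotation after the final attempt; there is nothing left to retry.
      if (attempt < maxAttempts) client.nextRegion();
    }
  }
  throw lastError;
}

// Usage with a fake client and a request that fails twice before succeeding.
let regionSwitches = 0;
let calls = 0;
const fakeClient: RegionSwitcher = { nextRegion: () => { regionSwitches++; } };

withRegionFailover(fakeClient, async () => {
  calls++;
  if (calls < 3) throw new Error("regional endpoint unavailable");
  return "ok";
}).then((result) => console.log(result, regionSwitches)); // ok 2
```

The client's own maxRetries option may already retry requests, so a wrapper like this is only worth adding if you need region rotation between attempts.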

selectBestDomain(signal?)

Triggers a manual DNS resolution check to select the best domain suffix. The check runs automatically every 30 seconds; call it manually after a network change.


await client.selectBestDomain();

| Parameter | Type | Description |
| --- | --- | --- |
| signal | AbortSignal | Optional abort signal to cancel the DNS check |

getCurrentURL()

Returns the full API URL currently in use as a string.


const url = client.getCurrentURL();
// Example: 'https://api-us-west-1.agora.io/api/conversational-ai-agent'

Sub-clients

AgoraClient exposes Fern-generated sub-clients for direct REST API access. You typically do not need these when using the agentkit layer.

| Property | Description |
| --- | --- |
| client.agents | Start, stop, update, speak, interrupt, get history, list agents |
| client.telephony | Telephony operations |
| client.phoneNumbers | Phone number management |

Sub-clients are lazily initialized on first access. For most use cases, prefer the AgentSession API over calling client.agents directly.

For full method signatures and request parameters, see the REST API reference.

Agent

Agent is an immutable configuration object. Each builder method returns a new Agent instance — the original is never modified. Define one Agent at startup and call createSession() on it for each user conversation.


import { Agent } from 'agora-agent-sdk';

Constructor


new Agent(options?: AgentOptions)

All options are optional. Use the builder methods to set vendor configuration after construction.

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| name | string | undefined | Agent name, used as the default session name |
| instructions | string | undefined | LLM system prompt |
| greeting | string | undefined | First message spoken when the session starts |
| failureMessage | string | undefined | Message spoken when an LLM call fails |
| maxHistory | number | undefined | Maximum conversation turns kept in LLM context |
| turnDetection | TurnDetectionConfig | undefined | Voice activity detection settings |
| interruption | InterruptionConfig | undefined | Unified interruption control settings |
| sal | SalConfig | undefined | Selective Attention Locking configuration |
| avatar | AvatarConfig | undefined | Avatar configuration |
| advancedFeatures | AdvancedFeatures | undefined | Enable MLLM mode, AI-VAD, and other advanced features |
| parameters | SessionParams | undefined | Session parameters including silence and farewell config |
| geofence | GeofenceConfig | undefined | Regional access restriction |
| labels | Labels | undefined | Custom key-value labels returned in notification callbacks |
| rtc | RtcConfig | undefined | RTC media encryption |
| fillerWords | FillerWordsConfig | undefined | Filler words played while waiting for the LLM response |

Builder methods

All builder methods return a new Agent instance. The original is never modified.
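
The copy-on-write behavior can be illustrated with a heavily reduced stand-in. MiniAgent and MiniAgentOptions below are hypothetical and model only two builder methods; they show the pattern, not the SDK's actual Agent class.

```typescript
// Hypothetical, reduced stand-in for Agent, used only to show the immutable builder pattern.
interface MiniAgentOptions {
  readonly name?: string;
  readonly instructions?: string;
  readonly greeting?: string;
}

class MiniAgent {
  readonly config: MiniAgentOptions;

  constructor(config: MiniAgentOptions = {}) {
    this.config = config;
  }

  // Each builder copies the existing config into a fresh instance.
  withInstructions(instructions: string): MiniAgent {
    return new MiniAgent({ ...this.config, instructions });
  }

  withGreeting(greeting: string): MiniAgent {
    return new MiniAgent({ ...this.config, greeting });
  }
}

const base = new MiniAgent({ name: "support" });
const english = base.withGreeting("Hello!");
const spanish = base.withGreeting("¡Hola!");

console.log(base.config.greeting); // undefined
console.log(english.config.greeting); // Hello!
console.log(spanish.config.name); // support
```

Because the base instance never changes, one shared definition can be specialized per conversation without the variants affecting each other.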

withLlm(vendor)

Sets the LLM vendor. Pass an instance of OpenAI, AzureOpenAI, Anthropic, or Gemini.


withLlm(vendor: BaseLLM): Agent<TTSSampleRate>

withTts(vendor)

Sets the TTS vendor. The sample rate type is captured and tracked for avatar compatibility.


withTts<SR extends number>(vendor: BaseTTS<SR>): Agent<SR>

withStt(vendor)

Sets the STT vendor. Pass an instance of any STT vendor class.


withStt(vendor: BaseSTT): Agent<TTSSampleRate>

withMllm(vendor)

Sets the MLLM vendor for multimodal mode. Pass OpenAIRealtime, GeminiLive, or VertexAI. Requires advancedFeatures: { enable_mllm: true } in the constructor. Calling withMllm() automatically sets mllm.enable = true.


withMllm(vendor: BaseMLLM): Agent<TTSSampleRate>

withAvatar(vendor)

Sets the avatar vendor. The this parameter in the signature constrains the receiver type, so the compiler rejects the call unless the agent's TTS sample rate matches the rate the avatar requires.


withAvatar<RequiredSR extends number>(
  this: Agent<RequiredSR>,
  vendor: BaseAvatar<RequiredSR>
): Agent<RequiredSR>

withTurnDetection(config)

Configures cascading-flow turn detection. Use config.start_of_speech and config.end_of_speech for SOS/EOS detection. Use withInterruption() for interruption behavior and MLLM vendor turnDetection for MLLM turn detection.


withTurnDetection(config: TurnDetectionConfig): Agent<TTSSampleRate>

withInterruption(config)

Configures unified interruption behavior using the top-level interruption object. Use this for start_of_speech and keywords interruption modes.


withInterruption(config: InterruptionConfig): Agent<TTSSampleRate>

withInstructions(text)

Overrides the LLM system prompt on a new Agent instance.


withInstructions(instructions: string): Agent<TTSSampleRate>

withGreeting(text)

Overrides the greeting message on a new Agent instance.


withGreeting(greeting: string): Agent<TTSSampleRate>

withName(name)

Overrides the agent name on a new Agent instance.


withName(name: string): Agent<TTSSampleRate>

Other builder methods

The following methods follow the same pattern — each returns a new Agent instance with the updated configuration.

| Method | Parameter type | Description |
| --- | --- | --- |
| withSal(config) | SalConfig | Set Selective Attention Locking configuration |
| withAdvancedFeatures(features) | AdvancedFeatures | Set advanced features |
| withTools(enabled) | boolean | Enable or disable MCP tool invocation |
| withParameters(parameters) | SessionParams | Set session parameters |
| withFailureMessage(message) | string | Set the message spoken when the LLM fails |
| withMaxHistory(n) | number | Set the maximum conversation history length |
| withGeofence(geofence) | GeofenceConfig | Set geofence configuration |
| withLabels(labels) | Labels | Set custom labels |
| withRtc(rtc) | RtcConfig | Set RTC configuration |
| withFillerWords(fillerWords) | FillerWordsConfig | Set filler words configuration |

createSession(client, options)

Creates an AgentSession bound to a specific client and channel. Does not start the agent — call session.start() to join the channel.


createSession(
  client: AgoraClient,
  options: SessionOptions,
): AgentSession

SessionOptions fields:

| Option | Type | Required | Description |
| --- | --- | --- | --- |
| channel | string | Yes | Channel name to join |
| agentUid | string | Yes | The agent's RTC UID |
| remoteUids | string[] | Yes | Remote user UIDs the agent listens and responds to |
| name | string | No | Session name. Defaults to agent name or agent-{timestamp} |
| token | string | No | Pre-built RTC+RTM token. Omit to auto-generate from app credentials |
| expiresIn | number | No | Token lifetime in seconds. Only applies when the token is auto-generated. Valid range: 1–86400. Use ExpiresIn helpers for clarity |
| idleTimeout | number | No | Seconds before the agent auto-exits when no audio is detected. 0 disables the timeout |
| enableStringUid | boolean | No | Use string UIDs instead of numeric UIDs |
| preset | string \| AgentPreset[] | No | Session-level preset IDs to use as the base ASR/LLM/TTS configuration. Accepts a comma-separated string or an array of AgentPresets.* values |
| pipelineId | string | No | Published AI Studio pipeline ID to use as the base configuration |
| debug | boolean | No | Log API requests to the console |

preset is session-scoped because the underlying Agora start/join API applies presets per session, not per reusable Agent definition.

When you omit credentials for supported reseller-backed vendor models, AgentKit infers the matching session preset automatically:

  • Deepgram STT: nova-2, nova-3
  • OpenAI LLM: gpt-4o-mini, gpt-4.1-mini, gpt-5-nano, gpt-5-mini
  • OpenAI TTS: tts-1
  • MiniMax TTS: speech-2.6-turbo, speech-2.8-turbo

If you provide your own vendor API key for those same models, AgentKit keeps the request in BYOK mode and does not infer a preset.
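
The rule above boils down to a three-way decision per vendor. The sketch below encodes the model lists from this section; the vendor keys ("openai-llm" and so on), CredentialMode, and credentialMode are hypothetical labels for illustration, not SDK identifiers, and the sketch ignores extra conditions such as custom endpoints.

```typescript
// Models listed in this reference as eligible for inferred reseller presets.
// The keys are illustrative labels, not SDK identifiers.
const RESELLER_MODELS: Record<string, readonly string[]> = {
  "deepgram-stt": ["nova-2", "nova-3"],
  "openai-llm": ["gpt-4o-mini", "gpt-4.1-mini", "gpt-5-nano", "gpt-5-mini"],
  "openai-tts": ["tts-1"],
  "minimax-tts": ["speech-2.6-turbo", "speech-2.8-turbo"],
};

type CredentialMode = "byok" | "inferred-preset" | "missing-credentials";

// Hypothetical decision helper: a key always means BYOK; without a key,
// only the listed models can fall back to an inferred preset.
function credentialMode(vendor: string, model: string, apiKey?: string): CredentialMode {
  if (apiKey) return "byok";
  const models = RESELLER_MODELS[vendor] ?? [];
  return models.indexOf(model) !== -1 ? "inferred-preset" : "missing-credentials";
}

console.log(credentialMode("openai-llm", "gpt-4o-mini")); // inferred-preset
console.log(credentialMode("openai-llm", "gpt-4o-mini", "sk-example")); // byok
console.log(credentialMode("deepgram-stt", "enhanced")); // missing-credentials
```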

Properties

Read-only properties available on any Agent instance.

| Property | Type | Description |
| --- | --- | --- |
| name | string \| undefined | Agent name |
| instructions | string \| undefined | LLM system prompt |
| greeting | string \| undefined | Greeting message |
| failureMessage | string \| undefined | Message spoken when LLM fails |
| maxHistory | number \| undefined | Maximum conversation history length |
| llm | LlmConfig \| undefined | LLM configuration |
| tts | TtsConfig \| undefined | TTS configuration |
| stt | SttConfig \| undefined | STT configuration |
| mllm | MllmConfig \| undefined | MLLM configuration |
| avatar | AvatarConfig \| undefined | Avatar configuration |
| turnDetection | TurnDetectionConfig \| undefined | Turn detection configuration |
| interruption | InterruptionConfig \| undefined | Interruption configuration |
| sal | SalConfig \| undefined | SAL configuration |
| advancedFeatures | AdvancedFeatures \| undefined | Advanced features |
| parameters | SessionParams \| undefined | Session parameters |
| geofence | GeofenceConfig \| undefined | Geofence configuration |
| labels | Labels \| undefined | Custom labels |
| rtc | RtcConfig \| undefined | RTC configuration |
| fillerWords | FillerWordsConfig \| undefined | Filler words configuration |
| config | AgentOptions | Full read-only configuration snapshot |

toProperties()

Low-level method to convert the agent configuration to the Fern request format. Used internally by AgentSession.start(). You typically do not need to call this directly unless building custom request bodies.

AgentSession

AgentSession manages the full lifecycle of a running agent. Obtain an AgentSession by calling agent.createSession() — do not call the constructor directly.


import { AgentSession } from 'agora-agent-sdk';

State machine

A session progresses through the following states:


idle ──► starting ──► running ──► stopping ──► stopped
starting / running / stopping ──► error

| Transition | Trigger |
| --- | --- |
| idle → starting | start() called |
| starting → running | API responds with agent ID |
| starting → error | API request fails |
| running → stopping | stop() called |
| stopping → stopped | API confirms agent stopped |
| stopping → error | Stop request fails and agent was not already stopped |
| running → error | Unrecoverable error during interaction |

start() can also be called from stopped or error state to restart the session.
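
The transitions above, including the restart paths, can be captured in a lookup table. This is a sketch for reasoning about the state machine, not SDK code: TRANSITIONS, canTransition, and canStart are hypothetical names.

```typescript
type SessionStatus = "idle" | "starting" | "running" | "stopping" | "stopped" | "error";

// Transition table transcribed from this section, including restart from stopped or error.
const TRANSITIONS: Record<SessionStatus, readonly SessionStatus[]> = {
  idle: ["starting"],
  starting: ["running", "error"],
  running: ["stopping", "error"],
  stopping: ["stopped", "error"],
  stopped: ["starting"],
  error: ["starting"],
};

function canTransition(from: SessionStatus, to: SessionStatus): boolean {
  return TRANSITIONS[from].indexOf(to) !== -1;
}

// start() is documented as valid only from idle, stopped, or error.
function canStart(status: SessionStatus): boolean {
  return canTransition(status, "starting");
}

console.log(canStart("idle")); // true
console.log(canStart("running")); // false
console.log(canTransition("stopping", "stopped")); // true
```

A guard like canStart(session.status as SessionStatus) could be used before calling start() to avoid the documented throw in the starting, running, and stopping states.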

Methods

The following methods are available on an AgentSession instance.

start()

Starts the agent session. Generates tokens if not provided, sends the start request, and returns the agent ID. Resolves explicit preset values and also infers reseller presets from supported vendor configs when credentials are omitted.


start(): Promise<string>

  • Transitions: idle / stopped / error → starting → running
  • Throws if called in starting, running, or stopping state
  • Throws if avatar configuration has a TTS sample rate mismatch

stop()

Stops the agent session and removes the agent from the channel. If the agent has already stopped — for example due to idle timeout — resolves silently rather than throwing a 404 error.


stop(): Promise<void>

  • Transitions: running → stopping → stopped
  • Throws if called outside running state

say(text, options?)

Instructs the agent to speak the given text.


say(text: string, options?: SayOptions): Promise<void>

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| text | string | Yes | The text for the agent to speak |
| options.priority | SpeakPriority | No | Message priority |
| options.interruptable | boolean | No | Whether this message can be interrupted by the user |

  • Only valid in running state
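
The priority option takes a SpeakPriority value ("INTERRUPT", "APPEND", or "IGNORE"; see Types and enums). The sketch below is a hypothetical client-side model of those documented semantics. The real decision is made by the agent service, and what happens to already-queued messages on "INTERRUPT" is an assumption here (they are kept).

```typescript
type SpeakPriority = "INTERRUPT" | "APPEND" | "IGNORE";

interface SpeechState {
  current: string | null; // utterance being spoken, if any
  queue: string[]; // utterances waiting their turn
}

// Hypothetical model: returns the next state after a say() call with the given priority.
function applySay(state: SpeechState, text: string, priority: SpeakPriority): SpeechState {
  // Nothing is being spoken, so every priority delivers the message right away.
  if (state.current === null) {
    return { current: text, queue: state.queue };
  }
  switch (priority) {
    case "INTERRUPT":
      // Replace the current speech; keeping the queue is an assumption.
      return { current: text, queue: state.queue };
    case "APPEND":
      return { current: state.current, queue: state.queue.concat(text) };
    case "IGNORE":
      return state;
  }
}

let state: SpeechState = { current: "Welcome", queue: [] };
state = applySay(state, "One moment", "APPEND");
state = applySay(state, "Ignored", "IGNORE");
state = applySay(state, "Urgent", "INTERRUPT");
console.log(JSON.stringify(state)); // {"current":"Urgent","queue":["One moment"]}
```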

interrupt()

Interrupts the agent's current speech.


interrupt(): Promise<void>

  • Only valid in running state

update(config)

Updates the agent configuration mid-session without restarting. Accepts a partial configuration object in REST API format.


update(config: AgentConfigUpdate): Promise<void>

  • Only valid in running state

getHistory()

Fetches the conversation history for this session. Requires a valid agent ID — start() must have been called successfully.


getHistory(): Promise<ConversationHistory>

getTurns()

Fetches turn-by-turn analytics for this session, including start/end events and latency metrics. Requires a valid agent ID — start() must have been called successfully.


getTurns(): Promise<ConversationTurns>

getInfo()

Fetches current agent metadata from the API. Requires a valid agent ID.


getInfo(): Promise<SessionInfo>

on(event, handler)

Subscribes to a session event. Register handlers before calling start() to avoid missing the started event.


on<T>(event: AgentSessionEvent, handler: AgentSessionEventHandler<T>): void
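
Why registration order matters can be seen with a minimal emitter that, like most emitters, does not replay past events to late subscribers. MiniSession below is a hypothetical stand-in with a synchronous start(); whether the SDK replays events is not stated in this reference, so the sketch only shows the failure mode the advice guards against.

```typescript
type Handler<T> = (data: T) => void;

// Hypothetical minimal emitter; it does not replay past events to late subscribers.
class MiniSession {
  private handlers: Handler<{ agentId: string }>[] = [];

  on(_event: "started", handler: Handler<{ agentId: string }>): void {
    this.handlers.push(handler);
  }

  start(): string {
    const agentId = "agent-123"; // placeholder ID for the sketch
    for (const handler of this.handlers) handler({ agentId });
    return agentId;
  }
}

const early: string[] = [];
const late: string[] = [];
const session = new MiniSession();

session.on("started", (e) => early.push(e.agentId)); // registered before start()
session.start();
session.on("started", (e) => late.push(e.agentId)); // registered too late

console.log(early.length, late.length); // 1 0
```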

off(event, handler)

Unsubscribes a previously registered event handler.


off<T>(event: AgentSessionEvent, handler: AgentSessionEventHandler<T>): void

Events

The session emits the following events. See AgentSessionEvent and AgentSessionEventHandler for type details.

| Event | Payload type | Description |
| --- | --- | --- |
| "started" | { agentId: string } | Agent successfully joined the channel |
| "stopped" | { agentId: string } | Agent left the channel |
| "error" | Error | An unrecoverable error occurred |

Properties

The following read-only properties are available on any AgentSession instance.

| Property | Type | Description |
| --- | --- | --- |
| status | string | Current session state. One of "idle", "starting", "running", "stopping", "stopped", "error" |
| id | string \| null | Agent ID, populated after start() resolves |
| agent | Agent | The agent configuration this session was created from |
| appId | string | The Agora App ID for this session |
| raw | AgentsClient | Direct access to the Fern-generated AgentsClient for advanced operations |

Using session.raw

Use session.raw to call REST API endpoints not yet exposed by the agentkit layer. You must pass appid and agentId manually.


await session.raw.someNewEndpoint({
  appid: session.appId,
  agentId: session.id!,
});

Presets and BYOK

preset lives on the session because Agora applies presets when the agent joins a channel.

AgentKit supports both explicit presets and BYOK:

  • Pass preset directly on agent.createSession(...) when you want to choose the base reseller configuration yourself.
  • Provide vendor credentials for preset-capable models when you want full BYOK behavior.
  • Omit credentials for supported reseller models when you want AgentKit to infer the matching preset automatically.

Supported inferred preset models:

  • Deepgram STT: nova-2, nova-3
  • OpenAI LLM: gpt-4o-mini, gpt-4.1-mini, gpt-5-nano, gpt-5-mini
  • OpenAI TTS: tts-1
  • MiniMax TTS: speech-2.6-turbo, speech-2.8-turbo

Vendors

All vendor classes are imported from agora-agent-sdk. Pass vendor instances to the Agent builder methods.

LLM vendors

Use with withLlm().

OpenAI


new OpenAI(options: OpenAIOptions)

| Option | Type | Required | Description |
| --- | --- | --- | --- |
| apiKey | string | Usually | OpenAI API key |
| model | string | Yes | Model name, for example 'gpt-4o-mini' |
| url | string | No | API endpoint URL. Default: https://api.openai.com/v1/chat/completions |
| maxHistory | number | No | Maximum conversation history to cache |
| systemMessages | Record<string, unknown>[] | No | Additional system messages |
| greetingMessage | string | No | Agent greeting message |
| failureMessage | string | No | Message spoken when the LLM call fails |
| inputModalities | string[] | No | Input modalities. Default: ["text"] |
| outputModalities | string[] | No | Output modalities |
| params | Record<string, unknown> | No | Additional LLM parameters passed to the model |
| headers | Record<string, string> | No | Custom HTTP headers forwarded to the LLM provider |
| greetingConfigs | LlmGreetingConfigs | No | Greeting playback configuration |
| templateVariables | Record<string, string> | No | Template variables for messages |

apiKey is optional for the following reseller preset models: gpt-4o-mini, gpt-4.1-mini, gpt-5-nano, gpt-5-mini. If apiKey is omitted for one of those models, AgentKit infers the matching session preset. This no-key branch is only available with the default OpenAI endpoint and without a custom vendor hint. If apiKey is provided, AgentKit uses standard BYOK behavior instead.

AzureOpenAI


new AzureOpenAI(options: AzureOpenAIOptions)

| Option | Type | Required | Description |
| --- | --- | --- | --- |
| apiKey | string | Yes | Azure OpenAI API key |
| model | string | Yes | Model or deployment name |
| resourceName | string | Yes | Azure resource name |
| deploymentName | string | Yes | Deployment name in Azure |
| apiVersion | string | No | Azure API version. Default: '2023-05-15' |
| maxHistory | number | No | Maximum conversation history to cache |
| systemMessages | Record<string, unknown>[] | No | Additional system messages |
| greetingMessage | string | No | Agent greeting message |
| failureMessage | string | No | Message spoken when the LLM call fails |
| inputModalities | string[] | No | Input modalities. Default: ["text"] |
| outputModalities | string[] | No | Output modalities |
| params | Record<string, unknown> | No | Additional LLM parameters |
| headers | Record<string, string> | No | Custom HTTP headers forwarded to the LLM provider |
| greetingConfigs | LlmGreetingConfigs | No | Greeting playback configuration |
| templateVariables | Record<string, string> | No | Template variables for messages |

Anthropic


new Anthropic(options: AnthropicOptions)

| Option | Type | Required | Description |
| --- | --- | --- | --- |
| apiKey | string | Yes | Anthropic API key |
| model | string | Yes | Model name, for example 'claude-3-5-sonnet-20241022' |
| url | string | No | API endpoint URL. Default: https://api.anthropic.com/v1/messages |
| maxHistory | number | No | Maximum conversation history to cache |
| systemMessages | Record<string, unknown>[] | No | Additional system messages |
| greetingMessage | string | No | Agent greeting message |
| failureMessage | string | No | Message spoken when the LLM call fails |
| inputModalities | string[] | No | Input modalities. Default: ["text"] |
| outputModalities | string[] | No | Output modalities |
| params | Record<string, unknown> | No | Additional LLM parameters |
| headers | Record<string, string> | No | Custom HTTP headers forwarded to the LLM provider |
| greetingConfigs | LlmGreetingConfigs | No | Greeting playback configuration |
| templateVariables | Record<string, string> | No | Template variables for messages |

Gemini


new Gemini(options: GeminiOptions)

| Option | Type | Required | Description |
| --- | --- | --- | --- |
| apiKey | string | Yes | Google API key |
| model | string | Yes | Model name, for example 'gemini-pro' |
| url | string | No | API endpoint URL. Default: https://generativelanguage.googleapis.com/v1beta/models |
| maxHistory | number | No | Maximum conversation history to cache |
| systemMessages | Record<string, unknown>[] | No | Additional system messages |
| greetingMessage | string | No | Agent greeting message |
| failureMessage | string | No | Message spoken when the LLM call fails |
| inputModalities | string[] | No | Input modalities. Default: ["text"] |
| outputModalities | string[] | No | Output modalities |
| params | Record<string, unknown> | No | Additional LLM parameters |
| headers | Record<string, string> | No | Custom HTTP headers forwarded to the LLM provider |
| greetingConfigs | LlmGreetingConfigs | No | Greeting playback configuration |
| templateVariables | Record<string, string> | No | Template variables for messages |

TTS vendors

Use with withTts(). The sampleRate option determines avatar compatibility — see withAvatar().

ElevenLabsTTS


new ElevenLabsTTS<SR extends ElevenLabsSampleRate>(options: ElevenLabsTTSOptions<SR>)

| Option | Type | Required | Description |
| --- | --- | --- | --- |
| key | string | Yes | ElevenLabs API key |
| modelId | string | Yes | Model ID, for example 'eleven_flash_v2_5' |
| voiceId | string | Yes | Voice ID |
| sampleRate | 16000 \| 22050 \| 24000 \| 44100 | No | Audio sample rate in Hz |
| baseUrl | string | No | WebSocket base URL |
| skipPatterns | number[] | No | Skip patterns for bracketed content |

MicrosoftTTS


new MicrosoftTTS<SR extends MicrosoftSampleRate>(options: MicrosoftTTSOptions<SR>)

| Option | Type | Required | Description |
| --- | --- | --- | --- |
| key | string | Yes | Azure Speech API key |
| region | string | Yes | Azure region, for example 'eastus' |
| voiceName | string | Yes | Voice name, for example 'en-US-JennyNeural' |
| sampleRate | 16000 \| 24000 \| 48000 | No | Audio sample rate in Hz |
| skipPatterns | number[] | No | Skip patterns for bracketed content |

OpenAITTS

Fixed at 24,000 Hz — no configurable sample rate.


new OpenAITTS(options: OpenAITTSOptions)

| Option | Type | Required | Description |
| --- | --- | --- | --- |
| apiKey | string | Usually | OpenAI API key |
| voice | string | Yes | Voice name: 'alloy', 'echo', 'fable', 'onyx', 'nova', or 'shimmer' |
| model | string | No | Model name, for example 'tts-1' or 'tts-1-hd' |
| responseFormat | string | No | Audio format, for example 'pcm' |
| speed | number | No | Speech speed multiplier |
| skipPatterns | number[] | No | Skip patterns for bracketed content |

apiKey is optional only for the reseller-backed tts-1 preset path. If omitted with model: 'tts-1' or no explicit model, AgentKit infers openai_tts_1. If provided, the request stays in BYOK mode.

CartesiaTTS


new CartesiaTTS<SR extends CartesiaSampleRate>(options: CartesiaTTSOptions<SR>)

| Option | Type | Required | Description |
| --- | --- | --- | --- |
| apiKey | string | Yes | Cartesia API key |
| voiceId | string | Yes | Voice ID (serialized as {"mode": "id", "id": "..."}) |
| modelId | string | No | Model ID |
| sampleRate | 8000 \| 16000 \| 22050 \| 24000 \| 44100 \| 48000 | No | Audio sample rate in Hz |
| skipPatterns | number[] | No | Skip patterns for bracketed content |

Other TTS vendors

| Class | Key parameters |
| --- | --- |
| GoogleTTS | key, voiceName, languageCode? |
| AmazonTTS | accessKey, secretKey, region, voiceId |
| DeepgramTTS | apiKey, model, baseUrl?, sampleRate?, params? |
| HumeAITTS | key, configId? |
| RimeTTS | key, speaker, modelId?, lang?, samplingRate?, speedAlpha? |
| FishAudioTTS | key, referenceId |
| MiniMaxTTS | key?, groupId?, model, voiceId?, url? |
| MurfTTS | key, voiceId, style? |
| SarvamTTS | key, speaker, targetLanguageCode |

For MiniMaxTTS, key is optional only for reseller-backed models: speech-2.6-turbo, speech-2.8-turbo. If key is omitted for one of those models, AgentKit infers the matching session preset. In that preset-backed path, groupId, voiceId, and url are optional overrides rather than required fields. If key is provided, AgentKit uses BYOK.

STT vendors

Use with withStt().

DeepgramSTT


new DeepgramSTT(options?: DeepgramSTTOptions)

All options are optional.

| Option | Type | Description |
| --- | --- | --- |
| apiKey | string | Deepgram API key. Optional for nova-2 and nova-3 reseller preset usage |
| model | string | Model name, for example 'nova-2' or 'enhanced' |
| language | string | Language code, for example 'en-US' |
| smartFormat | boolean | Enable smart formatting |
| punctuation | boolean | Enable punctuation |
| additionalParams | Record<string, unknown> | Additional vendor parameters |

If apiKey is omitted for nova-2 or nova-3, AgentKit infers the matching Deepgram reseller preset. For all other Deepgram models, the TypeScript types require apiKey.

Other STT vendors

| Class | Key parameters |
| --- | --- |
| SpeechmaticsSTT | apiKey, language |
| MicrosoftSTT | key, region, language? |
| OpenAISTT | apiKey, model?, language? |
| GoogleSTT | apiKey, language? |
| AmazonSTT | accessKey, secretKey, region, language? |
| AssemblyAISTT | apiKey, language? |
| AresSTT | language? |
| SarvamSTT | apiKey, language |

MLLM vendors

Use with withMllm() for multimodal end-to-end audio processing without separate STT or TTS steps. Requires advancedFeatures: { enable_mllm: true } in the Agent constructor.

OpenAIRealtime


new OpenAIRealtime(options: OpenAIRealtimeOptions)

| Option | Type | Required | Description |
| --- | --- | --- | --- |
| apiKey | string | Yes | OpenAI API key |
| model | string | No | Model name, for example 'gpt-4o-realtime-preview' |
| url | string | No | WebSocket URL |
| greetingMessage | string | No | Agent greeting message |
| failureMessage | string | No | Message played when the model call fails |
| maxHistory | number | No | Maximum conversation history length |
| predefinedTools | string[] | No | Predefined tools, for example ['_publish_message'] |
| inputModalities | string[] | No | Input modalities, for example ['audio'] |
| outputModalities | string[] | No | Output modalities, for example ['text', 'audio'] |
| messages | Record<string, unknown>[] | No | Conversation messages for short-term memory |
| params | Record<string, unknown> | No | Additional MLLM parameters |
| turnDetection | MllmTurnDetectionConfig | No | MLLM turn detection configuration; overrides top-level turnDetection |

GeminiLive


new GeminiLive(options: GeminiLiveOptions)

| Option | Type | Required | Description |
| --- | --- | --- | --- |
| apiKey | string | Yes | Google API key |
| model | string | Yes | Model name, for example 'gemini-live-2.5-flash' |
| url | string | No | WebSocket URL |
| instructions | string | No | System instructions for the model |
| voice | string | No | Voice name, for example 'Aoede' or 'Charon' |
| greetingMessage | string | No | Agent greeting message |
| failureMessage | string | No | Message played when the model call fails |
| maxHistory | number | No | Maximum conversation history length |
| predefinedTools | string[] | No | Predefined tools, for example ['_publish_message'] |
| inputModalities | string[] | No | Input modalities |
| outputModalities | string[] | No | Output modalities |
| messages | Record<string, unknown>[] | No | Conversation messages for short-term memory |
| additionalParams | Record<string, unknown> | No | Additional parameters |
| turnDetection | MllmTurnDetectionConfig | No | MLLM turn detection configuration; overrides top-level turnDetection |

VertexAI


new VertexAI(options: VertexAIOptions)

| Option | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | Model name, for example 'gemini-live-2.5-flash-preview-native-audio-09-2025' |
| projectId | string | Yes | Google Cloud project ID |
| location | string | Yes | Google Cloud location or region |
| adcCredentialsString | string | Yes | Application Default Credentials JSON string |
| url | string | No | WebSocket URL |
| instructions | string | No | System instructions for the model |
| voice | string | No | Voice name, for example 'Aoede' or 'Charon' |
| greetingMessage | string | No | Agent greeting message |
| failureMessage | string | No | Message played when the model call fails |
| maxHistory | number | No | Maximum conversation history length |
| predefinedTools | string[] | No | Predefined tools, for example ['_publish_message'] |
| inputModalities | string[] | No | Input modalities |
| outputModalities | string[] | No | Output modalities |
| messages | Record<string, unknown>[] | No | Conversation messages for short-term memory |
| additionalParams | Record<string, unknown> | No | Additional parameters |
| turnDetection | MllmTurnDetectionConfig | No | MLLM turn detection configuration; overrides top-level turnDetection |

Avatar vendors

Use with withAvatar(). Each avatar vendor requires a specific TTS sample rate enforced at compile time and runtime.

HeyGenAvatar

Requires TTS at 24,000 Hz.


new HeyGenAvatar(options: HeyGenAvatarOptions)

| Option | Type | Required | Description |
| --- | --- | --- | --- |
| apiKey | string | Yes | HeyGen API key |
| quality | 'low' \| 'medium' \| 'high' | Yes | Video quality: 360p, 480p, or 720p |
| agoraUid | string | Yes | RTC UID for the avatar stream |
| agoraToken | string | No | RTC token for avatar authentication |
| avatarId | string | No | HeyGen avatar ID |
| disableIdleTimeout | boolean | No | Disable idle timeout. Default: false |
| activityIdleTimeout | number | No | Idle timeout in seconds. Default: 120 |
| enable | boolean | No | Enable or disable the avatar. Default: true |

AkoolAvatar

Requires TTS at 16,000 Hz.


new AkoolAvatar(options: AkoolAvatarOptions)

| Option | Type | Required | Description |
| --- | --- | --- | --- |
| apiKey | string | Yes | Akool API key |
| avatarId | string | No | Akool avatar ID |
| enable | boolean | No | Enable or disable the avatar. Default: true |
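
The runtime half of the sample-rate requirement can be pictured as a simple guard. The rates below (24,000 Hz for HeyGen, 16,000 Hz for Akool) come from this section; REQUIRED_TTS_SAMPLE_RATE, the lowercase vendor keys, assertAvatarSampleRate, and the error text are hypothetical, and the SDK's own check may differ.

```typescript
// Required TTS sample rates as listed in this reference; the keys are illustrative labels.
const REQUIRED_TTS_SAMPLE_RATE = { heygen: 24000, akool: 16000 } as const;
type AvatarVendor = keyof typeof REQUIRED_TTS_SAMPLE_RATE;

// Hypothetical runtime guard; the SDK's own check and error message may differ.
function assertAvatarSampleRate(vendor: AvatarVendor, ttsSampleRate: number): void {
  const required = REQUIRED_TTS_SAMPLE_RATE[vendor];
  if (ttsSampleRate !== required) {
    throw new Error(`${vendor} avatar requires TTS at ${required} Hz, got ${ttsSampleRate} Hz`);
  }
}

assertAvatarSampleRate("heygen", 24000); // passes silently

try {
  assertAvatarSampleRate("akool", 24000);
} catch (err) {
  console.log((err as Error).message); // akool avatar requires TTS at 16000 Hz, got 24000 Hz
}
```

Because OpenAITTS is fixed at 24,000 Hz, it would pass this check for HeyGen but not for Akool.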

Token utilities

Helper functions and classes for generating and managing tokens. Use these when you need control over token lifetime, or when generating tokens outside of a session.


import { generateConvoAIToken, ExpiresIn } from 'agora-agent-sdk';

generateConvoAIToken(options)

Generates a Conversational AI token combining RTC and RTM privileges. This is the same token the SDK generates automatically in app-credentials mode. Use this when you need a token outside of a session, or when passing a pre-built token to SessionOptions.token.


generateConvoAIToken(options: GenerateConvoAITokenOptions): string

| Option | Type | Required | Description |
| --- | --- | --- | --- |
| appId | string | Yes | Agora App ID |
| appCertificate | string | Yes | Agora App Certificate |
| channelName | string | Yes | The channel the token grants access to |
| account | string | Yes | The UID this token is issued for, as a string |
| tokenExpire | number | No | Token lifetime in seconds. Default: 86400. Valid range: 1–86400 |

Returns: string — the generated token.


const token = generateConvoAIToken({
  appId: 'your-app-id',
  appCertificate: 'your-app-certificate',
  channelName: 'support-room-123',
  account: '1',
  tokenExpire: ExpiresIn.hours(12),
});

ExpiresIn

Helper for specifying token lifetimes. Use with SessionOptions.expiresIn or generateConvoAIToken. Values are validated and capped at the Agora maximum of 86400 seconds (24 hours).

| Value or method | Returns | Description |
| --- | --- | --- |
| ExpiresIn.DAY | 86400 | 24 hours; the Agora maximum and default |
| ExpiresIn.hours(n) | number | n hours in seconds. Throws if n ≤ 0; caps at 24 h with a warning |
| ExpiresIn.minutes(n) | number | n minutes in seconds. Throws if n ≤ 0; caps at 24 h with a warning |
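
The documented validate-and-cap behavior can be approximated as follows. This is a hypothetical re-implementation for illustration only: ExpiresInSketch and toSeconds are not SDK names, the error type and warning text are assumptions, and the sketch does not round fractional results.

```typescript
const MAX_TOKEN_SECONDS = 86400; // 24 hours, the documented maximum

// Hypothetical re-implementation of the documented behavior; not the SDK source.
function toSeconds(value: number, unitSeconds: number, unit: string): number {
  // Rejects zero, negatives, and NaN.
  if (!(value > 0)) {
    throw new RangeError(`${unit} must be greater than 0`);
  }
  const seconds = value * unitSeconds;
  if (seconds > MAX_TOKEN_SECONDS) {
    console.warn(`Requested lifetime exceeds 24 h; capping at ${MAX_TOKEN_SECONDS} seconds`);
    return MAX_TOKEN_SECONDS;
  }
  return seconds;
}

const ExpiresInSketch = {
  DAY: MAX_TOKEN_SECONDS,
  hours: (n: number): number => toSeconds(n, 3600, "hours"),
  minutes: (n: number): number => toSeconds(n, 60, "minutes"),
};

console.log(ExpiresInSketch.hours(12)); // 43200
console.log(ExpiresInSketch.minutes(30)); // 1800
console.log(ExpiresInSketch.hours(48)); // 86400, after a warning
```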

Types and enums

Shared types and enums used across AgoraClient, Agent, AgentSession, and vendor classes.

Area

Region used for API routing. Pass to AgoraClient via the area option.


import { Area } from 'agora-agent-sdk';

| Value | Region |
| --- | --- |
| Area.US | United States |
| Area.EU | Europe |
| Area.AP | Asia-Pacific |
| Area.CN | China mainland |

AgoraAuthMode

The resolved authentication mode on an AgoraClient instance. Read via client.authMode.


type AgoraAuthMode = "app-credentials" | "token" | "basic";

| Value | Description |
| --- | --- |
| "app-credentials" | App ID and App Certificate provided. SDK auto-generates tokens |
| "token" | Pre-built authToken provided |
| "basic" | customerId and customerSecret provided |

AgentSessionEvent

Union type of all valid event names for session.on() and session.off().


type AgentSessionEvent = "started" | "stopped" | "error";

AgentSessionEventHandler

Generic handler type for session event callbacks.


type AgentSessionEventHandler<T> = (data: T) => void;

| Event | T |
| --- | --- |
| "started" | { agentId: string } |
| "stopped" | { agentId: string } |
| "error" | Error |

SpeakPriority

Controls how the agent handles a say() call relative to its current activity.


type SpeakPriority = "INTERRUPT" | "APPEND" | "IGNORE";

| Value | Description |
| --- | --- |
| "INTERRUPT" | Agent immediately stops current speech and delivers the message |
| "APPEND" | Message is queued and delivered after current speech ends |
| "IGNORE" | Message is discarded if the agent is currently speaking |

AgoraError

Thrown when the API returns a 4xx or 5xx response. Catch this to inspect the status code and response body.


import { AgoraError } from 'agora-agent-sdk';

try {
  const agentId = await session.start();
} catch (err) {
  if (err instanceof AgoraError) {
    console.error('Status:', err.statusCode);
    console.error('Message:', err.message);
    console.error('Body:', err.body);
  }
}

| Property | Type | Description |
| --- | --- | --- |
| statusCode | number | HTTP status code returned by the API |
| message | string | Human-readable error message |
| body | unknown | Raw response body from the API |
| rawResponse | Response | The full HTTP response object |
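
Once statusCode is available, a caller can decide how to react. The helper below is a hypothetical triage sketch: it relies only on the documented statusCode property (checked by shape so the example runs without the SDK), and the policy that 5xx and 429 responses are worth retrying is an assumption, not documented SDK behavior.

```typescript
// Hypothetical triage helper: only `statusCode` comes from the AgoraError reference.
type ErrorAction = "retry" | "fix-request" | "rethrow";

function triage(err: unknown): ErrorAction {
  if (typeof err === "object" && err !== null && "statusCode" in err) {
    const status = (err as { statusCode: unknown }).statusCode;
    if (typeof status === "number") {
      // Assumption: 5xx and 429 are transient, other 4xx point at the request itself.
      if (status >= 500 || status === 429) return "retry";
      if (status >= 400) return "fix-request";
    }
  }
  // Anything without a usable status code is not an API error this helper understands.
  return "rethrow";
}

console.log(triage({ statusCode: 503 })); // retry
console.log(triage({ statusCode: 404 })); // fix-request
console.log(triage(new Error("network down"))); // rethrow
```

In application code, an err instanceof AgoraError check would normally come first; the shape check here only keeps the sketch self-contained.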