TypeScript SDK API reference

Full API reference for the Agora Conversational AI TypeScript SDK.

AgoraClient

AgoraClient extends the Fern-generated base client with regional domain pool support and three authentication modes.


import { AgoraClient, Area } from 'agora-agent-sdk';

Constructor


new AgoraClient(options: AgoraClient.Options)

The authentication mode is resolved automatically from the options you provide.

| Option | Type | Required | Description |
| --- | --- | --- | --- |
| area | Area | Yes | Region for API routing |
| appId | string | Yes | Agora App ID |
| appCertificate | string | Yes | Agora App Certificate. Keep this secret and never expose it client-side |
| customerId | string | No | Customer ID for Basic Auth |
| customerSecret | string | No | Customer Secret for Basic Auth |
| authToken | string | No | Pre-built agora token=<value> string |
| timeout | number | No | Request timeout in milliseconds |
| maxRetries | number | No | Maximum retry attempts |
| fetch | typeof fetch | No | Custom fetch implementation for unsupported runtimes |

The options map to authentication modes as follows:

| Options provided | Resolved authMode |
| --- | --- |
| customerId + customerSecret | "basic" |
| authToken | "token" |
| Neither | "app-credentials" |

See Authentication for details on each mode.
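
The mapping in the table above can be sketched as a small standalone function. This is a hypothetical illustration, not the SDK's implementation: the names AuthOptions and resolveAuthMode are invented here, and the precedence when both Basic Auth credentials and an authToken are supplied is an assumption (Basic Auth is checked first, following the table order).

```typescript
// Hypothetical sketch of the documented mapping; not the SDK's source code.
type AuthMode = "app-credentials" | "token" | "basic";

interface AuthOptions {
  customerId?: string;
  customerSecret?: string;
  authToken?: string;
}

// Assumption: Basic Auth wins if both credential styles are present.
function resolveAuthMode(options: AuthOptions): AuthMode {
  if (options.customerId && options.customerSecret) return "basic";
  if (options.authToken) return "token";
  return "app-credentials";
}
```

On a real client, read the resolved value from client.authMode instead of recomputing it.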

Properties

The following read-only properties are available on any AgoraClient instance.

| Property | Type | Description |
| --- | --- | --- |
| appId | string | The Agora App ID |
| appCertificate | string | The Agora App Certificate |
| authMode | AgoraAuthMode | The resolved authentication mode |
| pool | Pool | The underlying domain pool instance used for regional routing |

Methods

The following methods are available in addition to the Fern-generated sub-client methods.

stopAgent(agentSessionId)

Stops a running agent session by ID without requiring a reference to the AgentSession object. Use this in stateless server architectures where the stop handler runs in a different process from the one that called start().


await client.stopAgent(agentSessionId);

| Parameter | Type | Description |
| --- | --- | --- |
| agentSessionId | string | The agent ID returned by session.start() |

nextRegion()

Cycles to the next region prefix in the domain pool. Call this after a request failure to try a different regional endpoint.


client.nextRegion();
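
One way to use nextRegion() after a failure is a small retry wrapper. The sketch below is an assumption-laden illustration: RegionCycler and withRegionFailover are hypothetical names, and the wrapper only relies on the nextRegion() method described above. It does not replace the client's own maxRetries handling.

```typescript
// Hypothetical minimal shape: only the method this sketch needs.
interface RegionCycler {
  nextRegion(): void;
}

// Retries a request, rotating to the next regional endpoint between attempts.
async function withRegionFailover<T>(
  client: RegionCycler,
  request: () => Promise<T>,
  maxAttempts = 3,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await request();
    } catch (err) {
      lastError = err;
      // Rotate only if another attempt will follow.
      if (attempt < maxAttempts) client.nextRegion();
    }
  }
  throw lastError;
}
```

Pass an AgoraClient instance as the first argument and wrap the failing call in the request callback.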

selectBestDomain(signal?)

Triggers a manual DNS resolution check to select the best domain suffix. The check runs automatically every 30 seconds, but you can call it manually after a network change.


await client.selectBestDomain();

| Parameter | Type | Description |
| --- | --- | --- |
| signal | AbortSignal | Optional abort signal to cancel the DNS check |

getCurrentURL()

Returns the full API URL currently in use.


const url = client.getCurrentURL();
// Example: 'https://api-us-west-1.agora.io/api/conversational-ai-agent'

Sub-clients

AgoraClient exposes Fern-generated sub-clients for direct REST API access. You typically do not need these when using the agentkit layer.

| Property | Description |
| --- | --- |
| client.agents | Start, stop, update, speak, interrupt, get, list agents |
| client.telephony | Telephony operations |
| client.phoneNumbers | Phone number management |

For full method signatures and request parameters, see the REST API reference.

Agent

Agent is an immutable configuration object. Each builder method returns a new Agent instance — the original is never modified. Define one Agent at startup and call createSession() on it for each user conversation.


import { Agent } from 'agora-agent-sdk';

Constructor


new Agent(options?: AgentOptions)

All options are optional. Use the builder methods to set vendor configuration after construction.

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| name | string | undefined | Agent name, used as the default session name |
| instructions | string | undefined | LLM system prompt |
| greeting | string | undefined | First message spoken when the session starts |
| failureMessage | string | undefined | Message spoken when an LLM call fails |
| maxHistory | number | undefined | Maximum conversation turns kept in LLM context |
| turnDetection | TurnDetectionConfig | undefined | Voice activity detection settings |
| sal | SalConfig | undefined | Selective Attention Locking configuration |
| avatar | AvatarConfig | undefined | Avatar configuration |
| advancedFeatures | AdvancedFeatures | undefined | Enable MLLM mode, AI-VAD, and other advanced features |
| parameters | SessionParams | undefined | Session parameters including silence and farewell config |
| geofence | GeofenceConfig | undefined | Regional access restriction |
| labels | Labels | undefined | Custom key-value labels returned in notification callbacks |
| rtc | RtcConfig | undefined | RTC media encryption |
| fillerWords | FillerWordsConfig | undefined | Filler words played while waiting for the LLM response |

Builder methods

All builder methods return a new Agent instance. The original is never modified.
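
The copy-on-write behavior described here can be illustrated with a miniature class. This is a hypothetical stand-in (MiniAgent and MiniOptions are invented names), showing the general pattern and not the SDK's internals.

```typescript
// Hypothetical stand-in for the immutable builder pattern.
interface MiniOptions {
  readonly name?: string;
  readonly greeting?: string;
}

class MiniAgent {
  constructor(readonly options: MiniOptions = {}) {}

  // Each builder copies the options into a new instance instead of mutating.
  withName(name: string): MiniAgent {
    return new MiniAgent({ ...this.options, name });
  }

  withGreeting(greeting: string): MiniAgent {
    return new MiniAgent({ ...this.options, greeting });
  }
}

const base = new MiniAgent({ name: "support" });
const greeter = base.withGreeting("Hello!");
// base still has no greeting; greeter carries both name and greeting.
```

Because the original is never modified, a single base configuration can be shared safely and specialized per conversation.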

withLlm(vendor)

Sets the LLM vendor. Pass an instance of OpenAI, AzureOpenAI, Anthropic, or Gemini.


withLlm(vendor: BaseLLM): Agent<TTSSampleRate>

withTts(vendor)

Sets the TTS vendor. The sample rate type is captured and tracked for avatar compatibility.


withTts<SR extends number>(vendor: BaseTTS<SR>): Agent<SR>

withStt(vendor)

Sets the STT vendor. Pass an instance of any STT vendor class.


withStt(vendor: BaseSTT): Agent<TTSSampleRate>

withMllm(vendor)

Sets the MLLM vendor for multimodal mode. Pass OpenAIRealtime or VertexAI. Requires advancedFeatures: { enable_mllm: true } in the constructor.


withMllm(vendor: BaseMLLM): Agent<TTSSampleRate>

withAvatar(vendor)

Sets the avatar vendor. The this constraint enforces at compile time that the agent's TTS sample rate matches the avatar's required rate.


withAvatar<RequiredSR extends number>(
  this: Agent<RequiredSR>,
  vendor: BaseAvatar<RequiredSR>
): Agent<RequiredSR>

withTurnDetection(config)

Configures voice activity detection. Use config.start_of_speech and config.end_of_speech for the SOS/EOS model.


withTurnDetection(config: TurnDetectionConfig): Agent<TTSSampleRate>

withInstructions(text)

Overrides the LLM system prompt on a new Agent instance.


withInstructions(instructions: string): Agent<TTSSampleRate>

withGreeting(text)

Overrides the greeting message on a new Agent instance.


withGreeting(greeting: string): Agent<TTSSampleRate>

withName(name)

Overrides the agent name on a new Agent instance.


withName(name: string): Agent<TTSSampleRate>

Other builder methods

The following methods follow the same pattern — each returns a new Agent instance with the updated configuration.

| Method | Parameter type | Description |
| --- | --- | --- |
| withSal(config) | SalConfig | Set Selective Attention Locking configuration |
| withAdvancedFeatures(features) | AdvancedFeatures | Set advanced features |
| withParameters(parameters) | SessionParams | Set session parameters |
| withFailureMessage(message) | string | Set the message spoken when the LLM fails |
| withMaxHistory(n) | number | Set the maximum conversation history length |
| withGeofence(geofence) | GeofenceConfig | Set geofence configuration |
| withLabels(labels) | Labels | Set custom labels |
| withRtc(rtc) | RtcConfig | Set RTC configuration |
| withFillerWords(fillerWords) | FillerWordsConfig | Set filler words configuration |

createSession(client, options)

Creates an AgentSession bound to a specific client and channel. Does not start the agent — call session.start() to join the channel.


createSession(
  client: AgoraClient,
  options: SessionOptions,
): AgentSession

SessionOptions fields:

| Option | Type | Required | Description |
| --- | --- | --- | --- |
| channel | string | Yes | Channel name to join |
| agentUid | string | Yes | The agent's RTC UID |
| remoteUids | string[] | Yes | Remote user UIDs the agent listens and responds to |
| name | string | No | Session name. Defaults to agent name or agent-{timestamp} |
| token | string | No | Pre-built RTC+RTM token. Omit to auto-generate from app credentials |
| expiresIn | number | No | Token lifetime in seconds. Only applies when the token is auto-generated. Valid range: 1–86400. Use ExpiresIn helpers for clarity |
| idleTimeout | number | No | Seconds before the agent auto-exits when no audio is detected. 0 disables the timeout |
| enableStringUid | boolean | No | Use string UIDs instead of numeric UIDs |
| preset | string \| AgentPreset[] | No | |
| pipelineId | string | No | Published AI Studio pipeline ID to use as the base configuration |
| debug | boolean | No | Log API requests to the console |

Properties

Read-only properties available on any Agent instance.

| Property | Type | Description |
| --- | --- | --- |
| name | string \| undefined | Agent name |
| instructions | string \| undefined | LLM system prompt |
| greeting | string \| undefined | Greeting message |
| failureMessage | string \| undefined | Message spoken when LLM fails |
| maxHistory | number \| undefined | Maximum conversation history length |
| llm | LlmConfig \| undefined | LLM configuration |
| tts | TtsConfig \| undefined | TTS configuration |
| stt | SttConfig \| undefined | STT configuration |
| mllm | MllmConfig \| undefined | MLLM configuration |
| avatar | AvatarConfig \| undefined | Avatar configuration |
| turnDetection | TurnDetectionConfig \| undefined | Turn detection configuration |
| sal | SalConfig \| undefined | SAL configuration |
| advancedFeatures | AdvancedFeatures \| undefined | Advanced features |
| parameters | SessionParams \| undefined | Session parameters |
| geofence | GeofenceConfig \| undefined | Geofence configuration |
| labels | Labels \| undefined | Custom labels |
| rtc | RtcConfig \| undefined | RTC configuration |
| fillerWords | FillerWordsConfig \| undefined | Filler words configuration |
| config | AgentOptions | Full read-only configuration snapshot |

AgentSession

AgentSession manages the full lifecycle of a running agent. Obtain an AgentSession by calling agent.createSession() — do not call the constructor directly.


import { AgentSession } from 'agora-agent-sdk';

State machine

A session progresses through the following states:


idle ──► starting ──► running ──► stopping ──► stopped
            │            │            │
            └────────────┴────────────┴──► error

| Transition | Trigger |
| --- | --- |
| idle → starting | start() called |
| starting → running | API responds with agent ID |
| starting → error | API request fails |
| running → stopping | stop() called |
| stopping → stopped | API confirms agent stopped |
| stopping → error | Stop request fails and agent was not already stopped |
| running → error | Unrecoverable error during interaction |

start() can also be called from stopped or error state to restart the session.
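
The transitions listed above, including the restart paths, can be written down as a lookup table. This is a hypothetical sketch for reasoning about valid state changes; SessionState, transitions, and canTransition are invented names, not SDK exports.

```typescript
// Hypothetical sketch of the documented transitions; not the SDK's source.
type SessionState = "idle" | "starting" | "running" | "stopping" | "stopped" | "error";

const transitions: Record<SessionState, readonly SessionState[]> = {
  idle: ["starting"],
  starting: ["running", "error"],
  running: ["stopping", "error"],
  stopping: ["stopped", "error"],
  stopped: ["starting"], // start() may restart a stopped session
  error: ["starting"],   // start() may restart after an error
};

// Returns true when the documented state machine allows moving from one state to another.
function canTransition(from: SessionState, to: SessionState): boolean {
  return transitions[from].includes(to);
}
```

A guard like this can help application code decide whether calling start() or stop() is meaningful for the current session.status value.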

Methods

The following methods are available on an AgentSession instance.

start()

Starts the agent session. Generates tokens if not provided, sends the start request, and returns the agent ID.


start(): Promise<string>

  • Transitions: idle / stopped / error → starting → running
  • Throws if called in starting, running, or stopping state
  • Throws if avatar configuration has a TTS sample rate mismatch

stop()

Stops the agent session and removes the agent from the channel. If the agent has already stopped — for example due to idle timeout — resolves silently rather than throwing a 404 error.


stop(): Promise<void>

  • Transitions: running → stopping → stopped
  • Throws if called outside running state

say(text, options?)

Instructs the agent to speak the given text.


say(text: string, options?: SayOptions): Promise<void>

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| text | string | Yes | The text for the agent to speak |
| options.priority | SpeakPriority | No | Message priority |
| options.interruptable | boolean | No | Whether this message can be interrupted by the user |

  • Only valid in running state

interrupt()

Interrupts the agent's current speech.


interrupt(): Promise<void>

  • Only valid in running state

update(config)

Updates the agent configuration mid-session without restarting. Accepts a partial configuration object in REST API format.


update(config: AgentConfigUpdate): Promise<void>

  • Only valid in running state

getHistory()

Fetches the conversation history for this session. Requires a valid agent ID — start() must have been called successfully.


getHistory(): Promise<ConversationHistory>

getInfo()

Fetches current agent metadata from the API. Requires a valid agent ID.


getInfo(): Promise<SessionInfo>

on(event, handler)

Subscribes to a session event. Register handlers before calling start() to avoid missing the started event.


on<T>(event: AgentSessionEvent, handler: AgentSessionEventHandler<T>): void

off(event, handler)

Unsubscribes a previously registered event handler.


off<T>(event: AgentSessionEvent, handler: AgentSessionEventHandler<T>): void
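
The on()/off() pair follows the usual event-emitter contract, which a miniature emitter can illustrate. The sketch below is hypothetical (MiniEmitter is not part of the SDK) and assumes typical semantics: events are delivered only to handlers registered at emit time, and off() removes a handler only when given the same function reference that was passed to on().

```typescript
// Hypothetical miniature emitter; the SDK's internals may differ.
type Handler = (data: unknown) => void;

class MiniEmitter {
  private handlers = new Map<string, Set<Handler>>();

  on(event: string, handler: Handler): void {
    const set = this.handlers.get(event) ?? new Set<Handler>();
    set.add(handler);
    this.handlers.set(event, set);
  }

  off(event: string, handler: Handler): void {
    this.handlers.get(event)?.delete(handler);
  }

  emit(event: string, data: unknown): void {
    this.handlers.get(event)?.forEach((handler) => handler(data));
  }
}

const emitter = new MiniEmitter();
const seen: unknown[] = [];
const onStarted: Handler = (data) => seen.push(data);

emitter.on("started", onStarted);
emitter.emit("started", { agentId: "agent-1" }); // delivered to onStarted

emitter.off("started", onStarted);               // same reference removes it
emitter.emit("started", { agentId: "agent-2" }); // no longer delivered
```

Keeping the handler in a named constant, as onStarted is here, makes it possible to unsubscribe later.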

Events

The session emits the following events. See AgentSessionEvent and AgentSessionEventHandler for type details.

| Event | Payload type | Description |
| --- | --- | --- |
| "started" | { agentId: string } | Agent successfully joined the channel |
| "stopped" | { agentId: string } | Agent left the channel |
| "error" | Error | An unrecoverable error occurred |

Properties

The following read-only properties are available on any AgentSession instance.

| Property | Type | Description |
| --- | --- | --- |
| status | string | Current session state. One of "idle", "starting", "running", "stopping", "stopped", "error" |
| id | string \| null | Agent ID, populated after start() resolves |
| agent | Agent | The agent configuration this session was created from |
| appId | string | The Agora App ID for this session |
| raw | AgentsClient | Direct access to the Fern-generated AgentsClient for advanced operations |

Using session.raw

Use session.raw to call REST API endpoints not yet exposed by the agentkit layer. You must pass appid and agentId manually.


await session.raw.someNewEndpoint({
  appid: session.appId,
  agentId: session.id!,
});

Vendors

All vendor classes are imported from agora-agent-sdk. Pass vendor instances to the Agent builder methods.

LLM vendors

Use with withLlm().

OpenAI


new OpenAI(options: OpenAIOptions)

| Option | Type | Required | Description |
| --- | --- | --- | --- |
| apiKey | string | Yes | OpenAI API key |
| model | string | Yes | Model name, for example 'gpt-4o-mini' |
| url | string | No | API endpoint URL. Default: https://api.openai.com/v1/chat/completions |
| maxHistory | number | No | Maximum conversation history to cache |
| systemMessages | Record<string, unknown>[] | No | Additional system messages |
| greetingMessage | string | No | Agent greeting message |
| failureMessage | string | No | Message spoken when the LLM call fails |
| inputModalities | string[] | No | Input modalities. Default: ["text"] |
| params | Record<string, unknown> | No | Additional LLM parameters passed to the model |

AzureOpenAI


new AzureOpenAI(options: AzureOpenAIOptions)

| Option | Type | Required | Description |
| --- | --- | --- | --- |
| apiKey | string | Yes | Azure OpenAI API key |
| model | string | Yes | Model or deployment name |
| resourceName | string | Yes | Azure resource name |
| deploymentName | string | Yes | Deployment name in Azure |
| apiVersion | string | No | Azure API version. Default: '2023-05-15' |
| maxHistory | number | No | Maximum conversation history to cache |
| systemMessages | Record<string, unknown>[] | No | Additional system messages |
| greetingMessage | string | No | Agent greeting message |
| failureMessage | string | No | Message spoken when the LLM call fails |
| inputModalities | string[] | No | Input modalities. Default: ["text"] |
| params | Record<string, unknown> | No | Additional LLM parameters |

Anthropic


new Anthropic(options: AnthropicOptions)

| Option | Type | Required | Description |
| --- | --- | --- | --- |
| apiKey | string | Yes | Anthropic API key |
| model | string | Yes | Model name, for example 'claude-3-5-sonnet-20241022' |
| url | string | No | API endpoint URL. Default: https://api.anthropic.com/v1/messages |
| maxHistory | number | No | Maximum conversation history to cache |
| systemMessages | Record<string, unknown>[] | No | Additional system messages |
| greetingMessage | string | No | Agent greeting message |
| failureMessage | string | No | Message spoken when the LLM call fails |
| inputModalities | string[] | No | Input modalities. Default: ["text"] |
| params | Record<string, unknown> | No | Additional LLM parameters |

Gemini


new Gemini(options: GeminiOptions)

| Option | Type | Required | Description |
| --- | --- | --- | --- |
| apiKey | string | Yes | Google API key |
| model | string | Yes | Model name, for example 'gemini-pro' |
| url | string | No | API endpoint URL. Default: https://generativelanguage.googleapis.com/v1beta/models |
| maxHistory | number | No | Maximum conversation history to cache |
| systemMessages | Record<string, unknown>[] | No | Additional system messages |
| greetingMessage | string | No | Agent greeting message |
| failureMessage | string | No | Message spoken when the LLM call fails |
| inputModalities | string[] | No | Input modalities. Default: ["text"] |
| params | Record<string, unknown> | No | Additional LLM parameters |

TTS vendors

Use with withTts(). The sampleRate option determines avatar compatibility — see withAvatar().

ElevenLabsTTS


new ElevenLabsTTS<SR extends ElevenLabsSampleRate>(options: ElevenLabsTTSOptions<SR>)

| Option | Type | Required | Description |
| --- | --- | --- | --- |
| key | string | Yes | ElevenLabs API key |
| modelId | string | Yes | Model ID, for example 'eleven_flash_v2_5' |
| voiceId | string | Yes | Voice ID |
| sampleRate | 16000 \| 22050 \| 24000 \| 44100 | No | Audio sample rate in Hz |
| baseUrl | string | No | WebSocket base URL |
| skipPatterns | number[] | No | Skip patterns for bracketed content |

MicrosoftTTS


new MicrosoftTTS<SR extends MicrosoftSampleRate>(options: MicrosoftTTSOptions<SR>)

| Option | Type | Required | Description |
| --- | --- | --- | --- |
| key | string | Yes | Azure Speech API key |
| region | string | Yes | Azure region, for example 'eastus' |
| voiceName | string | Yes | Voice name, for example 'en-US-JennyNeural' |
| sampleRate | 16000 \| 24000 \| 48000 | No | Audio sample rate in Hz |
| skipPatterns | number[] | No | Skip patterns for bracketed content |

OpenAITTS

Fixed at 24,000 Hz — no configurable sample rate.


new OpenAITTS(options: OpenAITTSOptions)

| Option | Type | Required | Description |
| --- | --- | --- | --- |
| key | string | Yes | OpenAI API key |
| voice | string | Yes | Voice name: 'alloy', 'echo', 'fable', 'onyx', 'nova', or 'shimmer' |
| model | string | No | Model name, for example 'tts-1' or 'tts-1-hd' |
| skipPatterns | number[] | No | Skip patterns for bracketed content |

CartesiaTTS


new CartesiaTTS<SR extends CartesiaSampleRate>(options: CartesiaTTSOptions<SR>)

| Option | Type | Required | Description |
| --- | --- | --- | --- |
| key | string | Yes | Cartesia API key |
| voiceId | string | Yes | Voice ID |
| modelId | string | No | Model ID |
| sampleRate | 8000 \| 16000 \| 22050 \| 24000 \| 44100 \| 48000 | No | Audio sample rate in Hz |
| skipPatterns | number[] | No | Skip patterns for bracketed content |

Other TTS vendors

| Class | Key parameters |
| --- | --- |
| GoogleTTS | key, voiceName, languageCode? |
| AmazonTTS | accessKey, secretKey, region, voiceId |
| HumeAITTS | key, configId? |
| RimeTTS | key, speaker, modelId? |
| FishAudioTTS | key, referenceId |
| MiniMaxTTS | key, groupId, model, voiceId, url |
| MurfTTS | key, voiceId, style? |
| SarvamTTS | key, speaker, targetLanguageCode |

STT vendors

Use with withStt().

DeepgramSTT


new DeepgramSTT(options?: DeepgramSTTOptions)

All options are optional.

| Option | Type | Description |
| --- | --- | --- |
| apiKey | string | Deepgram API key |
| model | string | Model name, for example 'nova-2' or 'enhanced' |
| language | string | Language code, for example 'en-US' |
| smartFormat | boolean | Enable smart formatting |
| punctuation | boolean | Enable punctuation |
| additionalParams | Record<string, unknown> | Additional vendor parameters |

Other STT vendors

| Class | Key parameters |
| --- | --- |
| SpeechmaticsSTT | apiKey, language |
| MicrosoftSTT | key, region, language? |
| OpenAISTT | apiKey, model?, language? |
| GoogleSTT | apiKey, language? |
| AmazonSTT | accessKey, secretKey, region, language? |
| AssemblyAISTT | apiKey, language? |
| AresSTT | language? |
| SarvamSTT | apiKey, language |

MLLM vendors

Use with withMllm() for multimodal end-to-end audio processing without separate STT or TTS steps. Requires advancedFeatures: { enable_mllm: true } in the Agent constructor.

OpenAIRealtime


new OpenAIRealtime(options: OpenAIRealtimeOptions)

| Option | Type | Required | Description |
| --- | --- | --- | --- |
| apiKey | string | Yes | OpenAI API key |
| model | string | No | Model name, for example 'gpt-4o-realtime-preview' |
| url | string | No | WebSocket URL |
| greetingMessage | string | No | Agent greeting message |
| inputModalities | string[] | No | Input modalities, for example ['audio'] |
| outputModalities | string[] | No | Output modalities, for example ['text', 'audio'] |
| messages | Record<string, unknown>[] | No | Conversation messages for short-term memory |
| params | Record<string, unknown> | No | Additional MLLM parameters |

VertexAI


new VertexAI(options: VertexAIOptions)

| Option | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | Model name, for example 'gemini-live-2.5-flash-preview-native-audio-09-2025' |
| projectId | string | Yes | Google Cloud project ID |
| location | string | Yes | Google Cloud location or region |
| adcCredentialsString | string | Yes | Application Default Credentials JSON string |
| instructions | string | No | System instructions for the model |
| voice | string | No | Voice name, for example 'Aoede' or 'Charon' |
| greetingMessage | string | No | Agent greeting message |
| inputModalities | string[] | No | Input modalities |
| outputModalities | string[] | No | Output modalities |
| messages | Record<string, unknown>[] | No | Conversation messages for short-term memory |
| additionalParams | Record<string, unknown> | No | Additional parameters |

Avatar vendors

Use with withAvatar(). Each avatar vendor requires a specific TTS sample rate enforced at compile time and runtime.

HeyGenAvatar

Requires TTS at 24,000 Hz.


new HeyGenAvatar(options: HeyGenAvatarOptions)

| Option | Type | Required | Description |
| --- | --- | --- | --- |
| apiKey | string | Yes | HeyGen API key |
| quality | 'low' \| 'medium' \| 'high' | Yes | Video quality: 360p, 480p, or 720p |
| agoraUid | string | Yes | RTC UID for the avatar stream |
| agoraToken | string | No | RTC token for avatar authentication |
| avatarId | string | No | HeyGen avatar ID |
| disableIdleTimeout | boolean | No | Disable idle timeout. Default: false |
| activityIdleTimeout | number | No | Idle timeout in seconds. Default: 120 |
| enable | boolean | No | Enable or disable the avatar. Default: true |

AkoolAvatar

Requires TTS at 16,000 Hz.


new AkoolAvatar(options: AkoolAvatarOptions)

| Option | Type | Required | Description |
| --- | --- | --- | --- |
| apiKey | string | Yes | Akool API key |
| avatarId | string | No | Akool avatar ID |
| enable | boolean | No | Enable or disable the avatar. Default: true |

Token utilities

Helper functions and classes for generating and managing tokens. Use these when you need control over token lifetime, or when generating tokens outside of a session.


import { generateConvoAIToken, ExpiresIn } from 'agora-agent-sdk';

generateConvoAIToken(options)

Generates a Conversational AI token combining RTC and RTM privileges. This is the same token the SDK generates automatically in app-credentials mode. Use this when you need a token outside of a session, or when passing a pre-built token to SessionOptions.token.


generateConvoAIToken(options: GenerateConvoAITokenOptions): string

| Option | Type | Required | Description |
| --- | --- | --- | --- |
| appId | string | Yes | Agora App ID |
| appCertificate | string | Yes | Agora App Certificate |
| channelName | string | Yes | The channel the token grants access to |
| account | string | Yes | The UID this token is issued for, as a string |
| tokenExpire | number | No | Token lifetime in seconds. Default: 86400. Valid range: 1–86400 |

Returns: string — the generated token.


const token = generateConvoAIToken({
  appId: 'your-app-id',
  appCertificate: 'your-app-certificate',
  channelName: 'support-room-123',
  account: '1',
  tokenExpire: ExpiresIn.hours(12),
});

ExpiresIn

Helper for specifying token lifetimes. Use with SessionOptions.expiresIn or generateConvoAIToken. Values are validated and capped at the Agora maximum of 86400 seconds (24 hours).

| Value or method | Returns | Description |
| --- | --- | --- |
| ExpiresIn.DAY | 86400 | 24 hours; the Agora maximum and default |
| ExpiresIn.hours(n) | number | n hours in seconds. Throws if n ≤ 0, caps at 24 h with a warning |
| ExpiresIn.minutes(n) | number | n minutes in seconds. Throws if n ≤ 0, caps at 24 h with a warning |
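
The validation and capping rules described for ExpiresIn can be sketched as follows. This is a hypothetical re-creation for illustration only: ExpiresInSketch and toSeconds are invented names, the warning text is made up, and the thrown error type is an assumption. The behavior shown (throw for n ≤ 0, cap at 86400 with a warning) follows the descriptions in this section.

```typescript
// Hypothetical sketch of the documented rules; not the SDK's implementation.
const MAX_SECONDS = 86400; // 24 hours, the maximum stated in this section

function toSeconds(n: number, unitSeconds: number, unit: string): number {
  // Rejects zero, negative values, and NaN.
  if (!(n > 0)) {
    throw new RangeError(`${unit} must be greater than 0, got ${n}`);
  }
  const seconds = n * unitSeconds;
  if (seconds > MAX_SECONDS) {
    console.warn(`Requested ${seconds}s exceeds ${MAX_SECONDS}s; capping.`);
    return MAX_SECONDS;
  }
  return seconds;
}

const ExpiresInSketch = {
  DAY: MAX_SECONDS,
  hours: (n: number) => toSeconds(n, 3600, "hours"),
  minutes: (n: number) => toSeconds(n, 60, "minutes"),
};

const twelveHours = ExpiresInSketch.hours(12);    // 43200
const halfAnHour = ExpiresInSketch.minutes(30);   // 1800
const cappedAtOneDay = ExpiresInSketch.hours(48); // 86400, after a warning
```

In application code, use the real ExpiresIn helper so that the limits stay in sync with the SDK.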

Types and enums

Shared types and enums used across AgoraClient, Agent, AgentSession, and vendor classes.

Area

Region used for API routing. Pass to AgoraClient via the area option.


import { Area } from 'agora-agent-sdk';

| Value | Region |
| --- | --- |
| Area.US | United States |
| Area.EU | Europe |
| Area.AP | Asia-Pacific |
| Area.CN | China mainland |

AgoraAuthMode

The resolved authentication mode on an AgoraClient instance. Read via client.authMode.


type AgoraAuthMode = "app-credentials" | "token" | "basic";

| Value | Description |
| --- | --- |
| "app-credentials" | App ID and App Certificate provided. SDK auto-generates tokens |
| "token" | Pre-built authToken provided |
| "basic" | customerId and customerSecret provided |

AgentSessionEvent

Union type of all valid event names for session.on() and session.off().


type AgentSessionEvent = "started" | "stopped" | "error";

AgentSessionEventHandler

Generic handler type for session event callbacks.


type AgentSessionEventHandler<T> = (data: T) => void;

| Event | T |
| --- | --- |
| "started" | { agentId: string } |
| "stopped" | { agentId: string } |
| "error" | Error |

SpeakPriority

Controls how the agent handles a say() call relative to its current activity.


type SpeakPriority = "INTERRUPT" | "APPEND" | "IGNORE";

| Value | Description |
| --- | --- |
| "INTERRUPT" | Agent immediately stops current speech and delivers the message |
| "APPEND" | Message is queued and delivered after current speech ends |
| "IGNORE" | Message is discarded if the agent is currently speaking |
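
To make the three priorities concrete, the sketch below models them against a simplified speech state. It is a hypothetical model, not the agent's actual scheduler: SpeechState and applySay are invented names, and keeping the queue untouched on "INTERRUPT" is an assumption that the descriptions above do not settle.

```typescript
// Hypothetical model of the three priorities; not the agent's real scheduler.
type SpeakPriority = "INTERRUPT" | "APPEND" | "IGNORE";

interface SpeechState {
  current: string | null; // message being spoken, if any
  queue: string[];        // messages waiting to be spoken
}

function applySay(state: SpeechState, text: string, priority: SpeakPriority): SpeechState {
  // When the agent is silent, every priority delivers the message right away.
  if (state.current === null) {
    return { current: text, queue: state.queue };
  }
  switch (priority) {
    case "INTERRUPT":
      // Assumption: queued messages are left untouched.
      return { current: text, queue: state.queue };
    case "APPEND":
      return { current: state.current, queue: [...state.queue, text] };
    case "IGNORE":
      return state; // discarded because the agent is speaking
  }
}
```

The priority itself is passed through options.priority when calling say().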

AgoraError

Thrown when the API returns a 4xx or 5xx response. Catch this to inspect the status code and response body.


import { AgoraError } from 'agora-agent-sdk';

try {
  const agentId = await session.start();
} catch (err) {
  if (err instanceof AgoraError) {
    console.error('Status:', err.statusCode);
    console.error('Message:', err.message);
    console.error('Body:', err.body);
  }
}

| Property | Type | Description |
| --- | --- | --- |
| statusCode | number | HTTP status code returned by the API |
| message | string | Human-readable error message |
| body | unknown | Raw response body from the API |
| rawResponse | Response | The full HTTP response object |