TypeScript SDK API reference
Full API reference for the Agora Conversational AI TypeScript SDK.
AgoraClient
AgoraClient extends the Fern-generated base client with regional domain pool support and three authentication modes.
Constructor
The authentication mode is resolved automatically from the options you provide.
| Option | Type | Required | Description |
|---|---|---|---|
| area | Area | Yes | Region for API routing |
| appId | string | Yes | Agora App ID |
| appCertificate | string | Yes | Agora App Certificate. Keep this secret and never expose it client-side |
| customerId | string | No | Customer ID for Basic Auth |
| customerSecret | string | No | Customer Secret for Basic Auth |
| authToken | string | No | Pre-built agora token=<value> string |
| timeout | number | No | Request timeout in milliseconds |
| maxRetries | number | No | Maximum retry attempts |
| fetch | typeof fetch | No | Custom fetch implementation for unsupported runtimes |
The resolved mode depends on which credential options are present:
| Options provided | Resolved authMode |
|---|---|
| customerId + customerSecret | "basic" |
| authToken | "token" |
| Neither | "app-credentials" |
See Authentication for details on each mode.
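The resolution rules in the table above can be sketched as a small function. This is an illustration only, not the SDK's implementation: the names `resolveAuthModeSketch` and `CredentialOptionsSketch` are hypothetical, and the order in which the credentials are checked is an assumption, because this reference does not say what happens when both Basic Auth credentials and an authToken are supplied.

```typescript
// Hypothetical stand-ins for the credential-related constructor options.
type AuthModeSketch = "basic" | "token" | "app-credentials";

interface CredentialOptionsSketch {
  customerId?: string;
  customerSecret?: string;
  authToken?: string;
}

// Assumption: Basic Auth is checked first, then the pre-built token.
// The reference does not document precedence when both are provided.
function resolveAuthModeSketch(options: CredentialOptionsSketch): AuthModeSketch {
  if (options.customerId && options.customerSecret) {
    return "basic";
  }
  if (options.authToken) {
    return "token";
  }
  // Neither credential set is present: fall back to App ID + App Certificate.
  return "app-credentials";
}
```

On a real client, read the resolved value from client.authMode rather than recomputing it.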
Properties
The following read-only properties are available on any AgoraClient instance.
| Property | Type | Description |
|---|---|---|
| appId | string | The Agora App ID |
| appCertificate | string | The Agora App Certificate |
| authMode | AgoraAuthMode | The resolved authentication mode |
| pool | Pool | The underlying domain pool instance used for regional routing |
Methods
The following methods are available in addition to the Fern-generated sub-client methods.
stopAgent(agentSessionId)
Stops a running agent session by ID without requiring a reference to the AgentSession object. Use this in stateless server architectures where the stop handler runs in a different process from the one that called start().
| Parameter | Type | Description |
|---|---|---|
| agentSessionId | string | The agent ID returned by session.start() |
nextRegion()
Cycles to the next region prefix in the domain pool. Call this after a request failure to try a different regional endpoint.
selectBestDomain(signal?)
Triggers a manual DNS resolution check to select the best domain suffix. The SDK runs this check automatically every 30 seconds, but you can also call it manually after a network change.
| Parameter | Type | Description |
|---|---|---|
| signal | AbortSignal | Optional abort signal to cancel the DNS check |
getCurrentURL()
Returns the full API URL currently in use.
Sub-clients
AgoraClient exposes Fern-generated sub-clients for direct REST API access. You typically do not need these when using the agentkit layer.
| Property | Description |
|---|---|
| client.agents | Start, stop, update, speak, interrupt, get, list agents |
| client.telephony | Telephony operations |
| client.phoneNumbers | Phone number management |
For full method signatures and request parameters, see the REST API reference.
Agent
Agent is an immutable configuration object. Each builder method returns a new Agent instance — the original is never modified. Define one Agent at startup and call createSession() on it for each user conversation.
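The immutable builder behavior described above can be illustrated with a minimal sketch. `ImmutableAgentSketch` and `AgentConfigSketch` are hypothetical stand-ins that model only three options; they are not the SDK's Agent class, and the internal use of Object.freeze is an assumption made for the illustration.

```typescript
// Hypothetical, reduced configuration shape for illustration.
interface AgentConfigSketch {
  readonly name?: string;
  readonly instructions?: string;
  readonly greeting?: string;
}

class ImmutableAgentSketch {
  readonly config: Readonly<AgentConfigSketch>;

  constructor(config: AgentConfigSketch = {}) {
    // Copy and freeze so later mutation of the caller's object has no effect.
    this.config = Object.freeze({ ...config });
  }

  // Each builder copies the current config and returns a new instance.
  withInstructions(text: string): ImmutableAgentSketch {
    return new ImmutableAgentSketch({ ...this.config, instructions: text });
  }

  withGreeting(text: string): ImmutableAgentSketch {
    return new ImmutableAgentSketch({ ...this.config, greeting: text });
  }
}

const baseAgent = new ImmutableAgentSketch({ name: "support" });
const greetingAgent = baseAgent.withGreeting("Hello!");
// baseAgent is unchanged; greetingAgent carries the name plus the new greeting.
```

Because the base instance is never modified, it is safe to share one configuration across concurrent conversations and derive per-user variants from it.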
Constructor
All options are optional. Use the builder methods to set vendor configuration after construction.
| Option | Type | Default | Description |
|---|---|---|---|
| name | string | undefined | Agent name, used as the default session name |
| instructions | string | undefined | LLM system prompt |
| greeting | string | undefined | First message spoken when the session starts |
| failureMessage | string | undefined | Message spoken when an LLM call fails |
| maxHistory | number | undefined | Maximum conversation turns kept in LLM context |
| turnDetection | TurnDetectionConfig | undefined | Voice activity detection settings |
| sal | SalConfig | undefined | Selective Attention Locking configuration |
| avatar | AvatarConfig | undefined | Avatar configuration |
| advancedFeatures | AdvancedFeatures | undefined | Enable MLLM mode, AI-VAD, and other advanced features |
| parameters | SessionParams | undefined | Session parameters including silence and farewell config |
| geofence | GeofenceConfig | undefined | Regional access restriction |
| labels | Labels | undefined | Custom key-value labels returned in notification callbacks |
| rtc | RtcConfig | undefined | RTC media encryption |
| fillerWords | FillerWordsConfig | undefined | Filler words played while waiting for the LLM response |
Builder methods
All builder methods return a new Agent instance. The original is never modified.
withLlm(vendor)
Sets the LLM vendor. Pass an instance of OpenAI, AzureOpenAI, Anthropic, or Gemini.
withTts(vendor)
Sets the TTS vendor. The sample rate type is captured and tracked for avatar compatibility.
withStt(vendor)
Sets the STT vendor. Pass an instance of any STT vendor class.
withMllm(vendor)
Sets the MLLM vendor for multimodal mode. Pass OpenAIRealtime or VertexAI. Requires advancedFeatures: { enable_mllm: true } in the constructor.
withAvatar(vendor)
Sets the avatar vendor. A type constraint on the method's this parameter enforces at compile time that the agent's TTS sample rate matches the avatar's required rate.
withTurnDetection(config)
Configures voice activity detection. Use config.start_of_speech and config.end_of_speech for the SOS/EOS model.
withInstructions(text)
Overrides the LLM system prompt on a new Agent instance.
withGreeting(text)
Overrides the greeting message on a new Agent instance.
withName(name)
Overrides the agent name on a new Agent instance.
Other builder methods
The following methods follow the same pattern — each returns a new Agent instance with the updated configuration.
| Method | Parameter type | Description |
|---|---|---|
| withSal(config) | SalConfig | Set Selective Attention Locking configuration |
| withAdvancedFeatures(features) | AdvancedFeatures | Set advanced features |
| withParameters(parameters) | SessionParams | Set session parameters |
| withFailureMessage(message) | string | Set the message spoken when the LLM fails |
| withMaxHistory(n) | number | Set the maximum conversation history length |
| withGeofence(geofence) | GeofenceConfig | Set geofence configuration |
| withLabels(labels) | Labels | Set custom labels |
| withRtc(rtc) | RtcConfig | Set RTC configuration |
| withFillerWords(fillerWords) | FillerWordsConfig | Set filler words configuration |
createSession(client, options)
Creates an AgentSession bound to a specific client and channel. Does not start the agent — call session.start() to join the channel.
SessionOptions fields:
| Option | Type | Required | Description |
|---|---|---|---|
| channel | string | Yes | Channel name to join |
| agentUid | string | Yes | The agent's RTC UID |
| remoteUids | string[] | Yes | Remote user UIDs the agent listens and responds to |
| name | string | No | Session name. Defaults to agent name or agent-{timestamp} |
| token | string | No | Pre-built RTC+RTM token. Omit to auto-generate from app credentials |
| expiresIn | number | No | Token lifetime in seconds. Only applies when the token is auto-generated. Valid range: 1–86400. Use ExpiresIn helpers for clarity |
| idleTimeout | number | No | Seconds before the agent auto-exits when no audio is detected. 0 disables the timeout |
| enableStringUid | boolean | No | Use string UIDs instead of numeric UIDs |
| preset | string \| AgentPreset[] | No | |
| pipelineId | string | No | Published AI Studio pipeline ID to use as the base configuration |
| debug | boolean | No | Log API requests to the console |
Properties
Read-only properties available on any Agent instance.
| Property | Type | Description |
|---|---|---|
| name | string \| undefined | Agent name |
| instructions | string \| undefined | LLM system prompt |
| greeting | string \| undefined | Greeting message |
| failureMessage | string \| undefined | Message spoken when LLM fails |
| maxHistory | number \| undefined | Maximum conversation history length |
| llm | LlmConfig \| undefined | LLM configuration |
| tts | TtsConfig \| undefined | TTS configuration |
| stt | SttConfig \| undefined | STT configuration |
| mllm | MllmConfig \| undefined | MLLM configuration |
| avatar | AvatarConfig \| undefined | Avatar configuration |
| turnDetection | TurnDetectionConfig \| undefined | Turn detection configuration |
| sal | SalConfig \| undefined | SAL configuration |
| advancedFeatures | AdvancedFeatures \| undefined | Advanced features |
| parameters | SessionParams \| undefined | Session parameters |
| geofence | GeofenceConfig \| undefined | Geofence configuration |
| labels | Labels \| undefined | Custom labels |
| rtc | RtcConfig \| undefined | RTC configuration |
| fillerWords | FillerWordsConfig \| undefined | Filler words configuration |
| config | AgentOptions | Full read-only configuration snapshot |
AgentSession
AgentSession manages the full lifecycle of a running agent. Obtain an AgentSession by calling agent.createSession() — do not call the constructor directly.
State machine
A session progresses through the following states:
| Transition | Trigger |
|---|---|
| idle → starting | start() called |
| starting → running | API responds with agent ID |
| starting → error | API request fails |
| running → stopping | stop() called |
| stopping → stopped | API confirms agent stopped |
| stopping → error | Stop request fails and agent was not already stopped |
| running → error | Unrecoverable error during interaction |
start() can also be called from stopped or error state to restart the session.
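The transitions listed above, including the restart from stopped or error, can be written down as a lookup table. This is a sketch for reasoning about the lifecycle, not the SDK's internal code; `SessionStatusSketch`, `allowedTransitions`, and `canTransition` are hypothetical names.

```typescript
type SessionStatusSketch =
  | "idle"
  | "starting"
  | "running"
  | "stopping"
  | "stopped"
  | "error";

// Each state maps to the states it may move to, per the transition table.
const allowedTransitions: Record<SessionStatusSketch, readonly SessionStatusSketch[]> = {
  idle: ["starting"],
  starting: ["running", "error"],
  running: ["stopping", "error"],
  stopping: ["stopped", "error"],
  stopped: ["starting"], // start() may be called again to restart
  error: ["starting"],   // start() may be called again to restart
};

function canTransition(from: SessionStatusSketch, to: SessionStatusSketch): boolean {
  return allowedTransitions[from].includes(to);
}
```

A guard like this can be useful in application code that mirrors session.status, for example to disable a "Stop" button unless the session is running.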
Methods
The following methods are available on an AgentSession instance.
start()
Starts the agent session. Generates tokens if not provided, sends the start request, and returns the agent ID.
- Transitions: idle/stopped/error → starting → running
- Throws if called in starting, running, or stopping state
- Throws if avatar configuration has a TTS sample rate mismatch
stop()
Stops the agent session and removes the agent from the channel. If the agent has already stopped — for example due to idle timeout — resolves silently rather than throwing a 404 error.
- Transitions: running → stopping → stopped
- Throws if called outside running state
say(text, options?)
Instructs the agent to speak the given text.
| Parameter | Type | Required | Description |
|---|---|---|---|
| text | string | Yes | The text for the agent to speak |
| options.priority | SpeakPriority | No | Message priority |
| options.interruptable | boolean | No | Whether this message can be interrupted by the user |
- Only valid in running state
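The effect of options.priority can be pictured with a toy model of the three SpeakPriority values described later in this reference. The model below is a simplification and not the service's actual scheduling logic: `SpeechStateSketch` and `applySay` are hypothetical, and the choice to leave queued messages in place on "INTERRUPT" is an assumption that the reference does not confirm.

```typescript
type SpeakPrioritySketch = "INTERRUPT" | "APPEND" | "IGNORE";

interface SpeechStateSketch {
  current: string | null; // message being spoken right now, if any
  queue: string[];        // messages waiting for current speech to end
}

// Returns the next state after a say() call with the given priority.
function applySay(
  state: SpeechStateSketch,
  text: string,
  priority: SpeakPrioritySketch,
): SpeechStateSketch {
  if (state.current === null) {
    // Nothing is being spoken, so every priority delivers immediately.
    return { current: text, queue: state.queue };
  }
  switch (priority) {
    case "INTERRUPT":
      // Assumption: only the current speech is replaced; the queue is kept.
      return { current: text, queue: state.queue };
    case "APPEND":
      return { current: state.current, queue: [...state.queue, text] };
    case "IGNORE":
      return state; // discarded because the agent is speaking
  }
}
```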
interrupt()
Interrupts the agent's current speech.
- Only valid in running state
update(config)
Updates the agent configuration mid-session without restarting. Accepts a partial configuration object in REST API format.
- Only valid in running state
getHistory()
Fetches the conversation history for this session. Requires a valid agent ID — start() must have been called successfully.
getInfo()
Fetches current agent metadata from the API. Requires a valid agent ID.
on(event, handler)
Subscribes to a session event. Register handlers before calling start() to avoid missing the started event.
off(event, handler)
Unsubscribes a previously registered event handler.
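The on() and off() pattern, and the reason for registering handlers before start(), can be illustrated with a minimal typed emitter. `SessionEventsSketch` is a hypothetical stand-in, not the AgentSession class; it assumes that events are delivered only to handlers registered at the time they fire and are not replayed to later subscribers.

```typescript
// Event names and payloads as listed in this reference.
type SessionEventMapSketch = {
  started: { agentId: string };
  stopped: { agentId: string };
  error: Error;
};

type EventNameSketch = keyof SessionEventMapSketch;
type AnyHandlerSketch = (payload: any) => void;

class SessionEventsSketch {
  private readonly handlers = new Map<EventNameSketch, Set<AnyHandlerSketch>>();

  on<K extends EventNameSketch>(
    event: K,
    handler: (payload: SessionEventMapSketch[K]) => void,
  ): void {
    const registered = this.handlers.get(event) ?? new Set<AnyHandlerSketch>();
    registered.add(handler);
    this.handlers.set(event, registered);
  }

  off<K extends EventNameSketch>(
    event: K,
    handler: (payload: SessionEventMapSketch[K]) => void,
  ): void {
    this.handlers.get(event)?.delete(handler);
  }

  // Assumption: no replay. Handlers added after emit() never see the event.
  emit<K extends EventNameSketch>(event: K, payload: SessionEventMapSketch[K]): void {
    this.handlers.get(event)?.forEach((handler) => handler(payload));
  }
}

const events = new SessionEventsSketch();
const received: string[] = [];
const onStarted = (payload: { agentId: string }): void => {
  received.push(payload.agentId);
};

events.on("started", onStarted);                // registered before the event fires
events.emit("started", { agentId: "agent-1" }); // delivered to onStarted
events.off("started", onStarted);               // same function reference removes it
events.emit("started", { agentId: "agent-2" }); // not delivered
```

Note that off() needs the same function reference that was passed to on(), so keep handlers in a variable rather than passing inline arrow functions if you plan to unsubscribe.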
Events
The session emits the following events. See AgentSessionEvent and AgentSessionEventHandler for type details.
| Event | Payload type | Description |
|---|---|---|
| "started" | { agentId: string } | Agent successfully joined the channel |
| "stopped" | { agentId: string } | Agent left the channel |
| "error" | Error | An unrecoverable error occurred |
Properties
The following read-only properties are available on any AgentSession instance.
| Property | Type | Description |
|---|---|---|
| status | string | Current session state. One of "idle", "starting", "running", "stopping", "stopped", "error" |
| id | string \| null | Agent ID, populated after start() resolves |
| agent | Agent | The agent configuration this session was created from |
| appId | string | The Agora App ID for this session |
| raw | AgentsClient | Direct access to the Fern-generated AgentsClient for advanced operations |
Using session.raw
Use session.raw to call REST API endpoints not yet exposed by the agentkit layer. You must pass appid and agentId manually.
Vendors
All vendor classes are imported from agora-agent-sdk. Pass vendor instances to the Agent builder methods.
LLM vendors
Use with withLlm().
OpenAI
| Option | Type | Required | Description |
|---|---|---|---|
| apiKey | string | Yes | OpenAI API key |
| model | string | Yes | Model name, for example 'gpt-4o-mini' |
| url | string | No | API endpoint URL. Default: https://api.openai.com/v1/chat/completions |
| maxHistory | number | No | Maximum conversation history to cache |
| systemMessages | Record<string, unknown>[] | No | Additional system messages |
| greetingMessage | string | No | Agent greeting message |
| failureMessage | string | No | Message spoken when the LLM call fails |
| inputModalities | string[] | No | Input modalities. Default: ["text"] |
| params | Record<string, unknown> | No | Additional LLM parameters passed to the model |
AzureOpenAI
| Option | Type | Required | Description |
|---|---|---|---|
| apiKey | string | Yes | Azure OpenAI API key |
| model | string | Yes | Model or deployment name |
| resourceName | string | Yes | Azure resource name |
| deploymentName | string | Yes | Deployment name in Azure |
| apiVersion | string | No | Azure API version. Default: '2023-05-15' |
| maxHistory | number | No | Maximum conversation history to cache |
| systemMessages | Record<string, unknown>[] | No | Additional system messages |
| greetingMessage | string | No | Agent greeting message |
| failureMessage | string | No | Message spoken when the LLM call fails |
| inputModalities | string[] | No | Input modalities. Default: ["text"] |
| params | Record<string, unknown> | No | Additional LLM parameters |
Anthropic
| Option | Type | Required | Description |
|---|---|---|---|
| apiKey | string | Yes | Anthropic API key |
| model | string | Yes | Model name, for example 'claude-3-5-sonnet-20241022' |
| url | string | No | API endpoint URL. Default: https://api.anthropic.com/v1/messages |
| maxHistory | number | No | Maximum conversation history to cache |
| systemMessages | Record<string, unknown>[] | No | Additional system messages |
| greetingMessage | string | No | Agent greeting message |
| failureMessage | string | No | Message spoken when the LLM call fails |
| inputModalities | string[] | No | Input modalities. Default: ["text"] |
| params | Record<string, unknown> | No | Additional LLM parameters |
Gemini
| Option | Type | Required | Description |
|---|---|---|---|
| apiKey | string | Yes | Google API key |
| model | string | Yes | Model name, for example 'gemini-pro' |
| url | string | No | API endpoint URL. Default: https://generativelanguage.googleapis.com/v1beta/models |
| maxHistory | number | No | Maximum conversation history to cache |
| systemMessages | Record<string, unknown>[] | No | Additional system messages |
| greetingMessage | string | No | Agent greeting message |
| failureMessage | string | No | Message spoken when the LLM call fails |
| inputModalities | string[] | No | Input modalities. Default: ["text"] |
| params | Record<string, unknown> | No | Additional LLM parameters |
TTS vendors
Use with withTts(). The sampleRate option determines avatar compatibility — see withAvatar().
ElevenLabsTTS
| Option | Type | Required | Description |
|---|---|---|---|
| key | string | Yes | ElevenLabs API key |
| modelId | string | Yes | Model ID, for example 'eleven_flash_v2_5' |
| voiceId | string | Yes | Voice ID |
| sampleRate | 16000 \| 22050 \| 24000 \| 44100 | No | Audio sample rate in Hz |
| baseUrl | string | No | WebSocket base URL |
| skipPatterns | number[] | No | Skip patterns for bracketed content |
MicrosoftTTS
| Option | Type | Required | Description |
|---|---|---|---|
| key | string | Yes | Azure Speech API key |
| region | string | Yes | Azure region, for example 'eastus' |
| voiceName | string | Yes | Voice name, for example 'en-US-JennyNeural' |
| sampleRate | 16000 \| 24000 \| 48000 | No | Audio sample rate in Hz |
| skipPatterns | number[] | No | Skip patterns for bracketed content |
OpenAITTS
Fixed at 24,000 Hz — no configurable sample rate.
| Option | Type | Required | Description |
|---|---|---|---|
| key | string | Yes | OpenAI API key |
| voice | string | Yes | Voice name: 'alloy', 'echo', 'fable', 'onyx', 'nova', or 'shimmer' |
| model | string | No | Model name, for example 'tts-1' or 'tts-1-hd' |
| skipPatterns | number[] | No | Skip patterns for bracketed content |
CartesiaTTS
| Option | Type | Required | Description |
|---|---|---|---|
| key | string | Yes | Cartesia API key |
| voiceId | string | Yes | Voice ID |
| modelId | string | No | Model ID |
| sampleRate | 8000 \| 16000 \| 22050 \| 24000 \| 44100 \| 48000 | No | Audio sample rate in Hz |
| skipPatterns | number[] | No | Skip patterns for bracketed content |
Other TTS vendors
| Class | Key parameters |
|---|---|
| GoogleTTS | key, voiceName, languageCode? |
| AmazonTTS | accessKey, secretKey, region, voiceId |
| HumeAITTS | key, configId? |
| RimeTTS | key, speaker, modelId? |
| FishAudioTTS | key, referenceId |
| MiniMaxTTS | key, groupId, model, voiceId, url |
| MurfTTS | key, voiceId, style? |
| SarvamTTS | key, speaker, targetLanguageCode |
STT vendors
Use with withStt().
DeepgramSTT
All options are optional.
| Option | Type | Description |
|---|---|---|
| apiKey | string | Deepgram API key |
| model | string | Model name, for example 'nova-2' or 'enhanced' |
| language | string | Language code, for example 'en-US' |
| smartFormat | boolean | Enable smart formatting |
| punctuation | boolean | Enable punctuation |
| additionalParams | Record<string, unknown> | Additional vendor parameters |
Other STT vendors
| Class | Key parameters |
|---|---|
| SpeechmaticsSTT | apiKey, language |
| MicrosoftSTT | key, region, language? |
| OpenAISTT | apiKey, model?, language? |
| GoogleSTT | apiKey, language? |
| AmazonSTT | accessKey, secretKey, region, language? |
| AssemblyAISTT | apiKey, language? |
| AresSTT | language? |
| SarvamSTT | apiKey, language |
MLLM vendors
Use with withMllm() for multimodal end-to-end audio processing without separate STT or TTS steps. Requires advancedFeatures: { enable_mllm: true } in the Agent constructor.
OpenAIRealtime
| Option | Type | Required | Description |
|---|---|---|---|
| apiKey | string | Yes | OpenAI API key |
| model | string | No | Model name, for example 'gpt-4o-realtime-preview' |
| url | string | No | WebSocket URL |
| greetingMessage | string | No | Agent greeting message |
| inputModalities | string[] | No | Input modalities, for example ['audio'] |
| outputModalities | string[] | No | Output modalities, for example ['text', 'audio'] |
| messages | Record<string, unknown>[] | No | Conversation messages for short-term memory |
| params | Record<string, unknown> | No | Additional MLLM parameters |
VertexAI
| Option | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model name, for example 'gemini-live-2.5-flash-preview-native-audio-09-2025' |
| projectId | string | Yes | Google Cloud project ID |
| location | string | Yes | Google Cloud location or region |
| adcCredentialsString | string | Yes | Application Default Credentials JSON string |
| instructions | string | No | System instructions for the model |
| voice | string | No | Voice name, for example 'Aoede' or 'Charon' |
| greetingMessage | string | No | Agent greeting message |
| inputModalities | string[] | No | Input modalities |
| outputModalities | string[] | No | Output modalities |
| messages | Record<string, unknown>[] | No | Conversation messages for short-term memory |
| additionalParams | Record<string, unknown> | No | Additional parameters |
Avatar vendors
Use with withAvatar(). Each avatar vendor requires a specific TTS sample rate enforced at compile time and runtime.
HeyGenAvatar
Requires TTS at 24,000 Hz.
| Option | Type | Required | Description |
|---|---|---|---|
| apiKey | string | Yes | HeyGen API key |
| quality | 'low' \| 'medium' \| 'high' | Yes | Video quality: 360p, 480p, or 720p |
| agoraUid | string | Yes | RTC UID for the avatar stream |
| agoraToken | string | No | RTC token for avatar authentication |
| avatarId | string | No | HeyGen avatar ID |
| disableIdleTimeout | boolean | No | Disable idle timeout. Default: false |
| activityIdleTimeout | number | No | Idle timeout in seconds. Default: 120 |
| enable | boolean | No | Enable or disable the avatar. Default: true |
AkoolAvatar
Requires TTS at 16,000 Hz.
| Option | Type | Required | Description |
|---|---|---|---|
| apiKey | string | Yes | Akool API key |
| avatarId | string | No | Akool avatar ID |
| enable | boolean | No | Enable or disable the avatar. Default: true |
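The runtime side of the sample rate requirement can be sketched as a simple lookup and check, using the rates stated above for HeyGenAvatar (24,000 Hz) and AkoolAvatar (16,000 Hz). The helper `assertAvatarCompatible` and the vendor keys "heygen" and "akool" are hypothetical names chosen for this illustration; the SDK's own validation and error messages may differ.

```typescript
// Required TTS sample rates, taken from the vendor notes in this section.
const requiredAvatarSampleRate = {
  heygen: 24000,
  akool: 16000,
} as const;

type AvatarVendorSketch = keyof typeof requiredAvatarSampleRate;

// Hypothetical helper: throws when the TTS rate does not match the avatar's requirement.
function assertAvatarCompatible(avatar: AvatarVendorSketch, ttsSampleRate: number): void {
  const required = requiredAvatarSampleRate[avatar];
  if (ttsSampleRate !== required) {
    throw new Error(
      `${avatar} avatar requires TTS at ${required} Hz, got ${ttsSampleRate} Hz`,
    );
  }
}

assertAvatarCompatible("heygen", 24000); // passes silently
```

In practice this means choosing the TTS sampleRate first, for example 24000 for ElevenLabsTTS with HeyGenAvatar, since OpenAITTS is fixed at 24,000 Hz and therefore cannot be paired with an avatar that requires 16,000 Hz.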
Token utilities
Helper functions and classes for generating and managing tokens. Use these when you need control over token lifetime, or when generating tokens outside of a session.
generateConvoAIToken(options)
Generates a Conversational AI token combining RTC and RTM privileges. This is the same token the SDK generates automatically in app-credentials mode. Use this when you need a token outside of a session, or when passing a pre-built token to SessionOptions.token.
| Option | Type | Required | Description |
|---|---|---|---|
| appId | string | Yes | Agora App ID |
| appCertificate | string | Yes | Agora App Certificate |
| channelName | string | Yes | The channel the token grants access to |
| account | string | Yes | The UID this token is issued for, as a string |
| tokenExpire | number | No | Token lifetime in seconds. Default: 86400. Valid range: 1–86400 |
Returns: string — the generated token.
ExpiresIn
Helper for specifying token lifetimes. Use with SessionOptions.expiresIn or generateConvoAIToken. Values are validated and capped at the Agora maximum of 86400 seconds (24 hours).
| Value or method | Returns | Description |
|---|---|---|
| ExpiresIn.DAY | 86400 | 24 hours, the Agora maximum and default |
| ExpiresIn.hours(n) | number | n hours in seconds. Throws if n ≤ 0, caps at 24 h with a warning |
| ExpiresIn.minutes(n) | number | n minutes in seconds. Throws if n ≤ 0, caps at 24 h with a warning |
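The validation and capping rules in the table above can be sketched as follows. `ExpiresInSketch` and `clampLifetime` are hypothetical stand-ins written from the documented behavior; the real helper's error type, warning text, and handling of fractional values are not specified here and are assumptions.

```typescript
const MAX_TOKEN_SECONDS = 86400; // Agora maximum: 24 hours

// Throws for non-positive input and caps anything above 24 hours.
function clampLifetime(seconds: number): number {
  if (!(seconds > 0)) {
    // Also rejects NaN, since NaN > 0 is false.
    throw new RangeError("Token lifetime must be greater than 0");
  }
  if (seconds > MAX_TOKEN_SECONDS) {
    console.warn(`Token lifetime ${seconds}s exceeds 24 h; capping at ${MAX_TOKEN_SECONDS}s`);
    return MAX_TOKEN_SECONDS;
  }
  return seconds;
}

const ExpiresInSketch = {
  DAY: MAX_TOKEN_SECONDS,
  hours: (n: number): number => clampLifetime(n * 3600),
  minutes: (n: number): number => clampLifetime(n * 60),
};

const oneHour = ExpiresInSketch.hours(1);        // 3600
const halfHour = ExpiresInSketch.minutes(30);    // 1800
const cappedTwoDays = ExpiresInSketch.hours(48); // 86400, with a console warning
```

Named helpers such as hours(1) make the unit explicit at the call site, which avoids the common mistake of passing milliseconds where seconds are expected.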
Types and enums
Shared types and enums used across AgoraClient, Agent, AgentSession, and vendor classes.
Area
Region used for API routing. Pass to AgoraClient via the area option.
| Value | Region |
|---|---|
| Area.US | United States |
| Area.EU | Europe |
| Area.AP | Asia-Pacific |
| Area.CN | China mainland |
AgoraAuthMode
The resolved authentication mode on an AgoraClient instance. Read via client.authMode.
| Value | Description |
|---|---|
| "app-credentials" | App ID and App Certificate provided. SDK auto-generates tokens |
| "token" | Pre-built authToken provided |
| "basic" | customerId and customerSecret provided |
AgentSessionEvent
Union type of all valid event names for session.on() and session.off().
AgentSessionEventHandler
Generic handler type for session event callbacks.
| Event | Payload type (T) |
|---|---|
| "started" | { agentId: string } |
| "stopped" | { agentId: string } |
| "error" | Error |
SpeakPriority
Controls how the agent handles a say() call relative to its current activity.
| Value | Description |
|---|---|
| "INTERRUPT" | Agent immediately stops current speech and delivers the message |
| "APPEND" | Message is queued and delivered after current speech ends |
| "IGNORE" | Message is discarded if the agent is currently speaking |
AgoraError
Thrown when the API returns a 4xx or 5xx response. Catch this to inspect the status code and response body.
| Property | Type | Description |
|---|---|---|
| statusCode | number | HTTP status code returned by the API |
| message | string | Human-readable error message |
| body | unknown | Raw response body from the API |
| rawResponse | Response | The full HTTP response object |
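A typical way to inspect such an error is an instanceof check followed by a branch on statusCode. The sketch below uses a local stand-in class, `AgoraErrorSketch`, with three of the properties listed above so that it runs without the SDK; in real code you would catch the AgoraError class exported by the SDK instead. The helper `describeFailure` and its messages are hypothetical.

```typescript
// Stand-in that mirrors statusCode, message, and body from the table above.
class AgoraErrorSketch extends Error {
  constructor(
    message: string,
    readonly statusCode: number,
    readonly body: unknown,
  ) {
    super(message);
    this.name = "AgoraErrorSketch";
    // Keeps instanceof working when compiled to older JavaScript targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Hypothetical helper: turns a caught value into a short description.
function describeFailure(err: unknown): string {
  if (err instanceof AgoraErrorSketch) {
    return err.statusCode >= 500
      ? `server error (${err.statusCode})`
      : `request rejected (${err.statusCode})`;
  }
  return "unexpected error";
}

try {
  throw new AgoraErrorSketch("Agent not found", 404, { reason: "not_found" });
} catch (err) {
  console.log(describeFailure(err)); // request rejected (404)
}
```

Typing the caught value as unknown and narrowing it keeps non-API failures, such as network errors, on a separate path from 4xx and 5xx responses.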