Go SDK API reference
Full API reference for the Agora Conversational AI Go SDK.
client.NewClient
The entry point for the SDK. Creates a new API client with the given request options. All sub-clients share the same configuration.
Constructor
Pass one or more request options to configure authentication, regional routing, and transport behavior.
When using the agentkit layer, your App ID and App Certificate are passed on AgentSessionOptions — not on the client. See agentkit.NewAgentSession.
Request options
Request options can be set at client creation time (applied to all requests) or passed to individual method calls.
option.WithToken
Sets a bearer token for authentication. This is the recommended option for new integrations and matches the onboarding guides.
option.WithBasicAuth
Sets Authorization: Basic <base64>. Use your Agora Customer ID and Customer Secret from Agora Console.
Use this only when you explicitly want Basic Auth. It is still supported by the client, but token auth is the preferred path in the narrative docs and quick starts.
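To make the header format concrete, the sketch below builds a Basic value by hand with the standard library, following the usual HTTP Basic scheme of base64-encoding `id:secret`. The helper name `basicAuthHeader` and the credential strings are hypothetical and are not part of the SDK; in practice option.WithBasicAuth sets the header for you.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// basicAuthHeader is a hypothetical helper (not an SDK function) that builds
// the value of the Authorization header for HTTP Basic authentication.
func basicAuthHeader(customerID, customerSecret string) string {
	credentials := customerID + ":" + customerSecret
	return "Basic " + base64.StdEncoding.EncodeToString([]byte(credentials))
}

func main() {
	// Placeholder credentials for illustration only.
	fmt.Println(basicAuthHeader("id", "secret"))
}
```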
option.WithArea
Enables regional routing with automatic DNS-based domain selection.
option.WithBaseURL
Overrides the default API endpoint. Useful for testing.
option.WithHTTPClient
Provides a custom *http.Client. Recommended for production to set timeouts.
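A minimal sketch of a client with explicit timeouts, using only the standard library. The durations and the function name `newHTTPClient` are illustrative assumptions, not values recommended by the SDK.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"time"
)

// newHTTPClient returns an *http.Client with explicit timeouts.
// The durations are assumed values chosen for illustration.
func newHTTPClient() *http.Client {
	transport := &http.Transport{
		// Limit how long connection setup and the TLS handshake may take.
		DialContext:         (&net.Dialer{Timeout: 5 * time.Second}).DialContext,
		TLSHandshakeTimeout: 5 * time.Second,
		// Limit how long to wait for response headers after the request is sent.
		ResponseHeaderTimeout: 10 * time.Second,
	}
	return &http.Client{
		Timeout:   30 * time.Second, // upper bound for the whole request
		Transport: transport,
	}
}

func main() {
	c := newHTTPClient()
	fmt.Println(c.Timeout)
}
```

The resulting client is the kind of value you would hand to option.WithHTTPClient.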
option.WithMaxAttempts
Sets the maximum number of retry attempts. Default: 2. Retries use exponential backoff for status codes 408, 429, and 5xx.
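The retry rule can be pictured with the following self-contained sketch. It shows which status codes count as retryable and how a doubling delay grows. The base delay, the cap, and the function names are assumptions made for illustration; the SDK's actual timing and any jitter it applies are not documented here.

```go
package main

import (
	"fmt"
	"time"
)

// shouldRetry reports whether a status code is in the retryable set named
// above: 408, 429, and any 5xx.
func shouldRetry(status int) bool {
	return status == 408 || status == 429 || (status >= 500 && status <= 599)
}

// backoff returns the delay before retry number n (0-based). The delay
// doubles on each retry and never exceeds limit. Both values are assumed.
func backoff(n int, base, limit time.Duration) time.Duration {
	d := base
	for i := 0; i < n; i++ {
		d *= 2
		if d >= limit {
			return limit
		}
	}
	return d
}

func main() {
	for _, code := range []int{200, 408, 429, 503} {
		fmt.Println(code, shouldRetry(code))
	}
	for n := 0; n < 4; n++ {
		fmt.Println(n, backoff(n, 500*time.Millisecond, 4*time.Second))
	}
}
```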
option.WithHTTPHeader
Adds custom HTTP headers to every request.
option.WithBodyProperties
Adds extra properties to the JSON request body.
option.WithQueryParameters
Adds query parameters to the request URL.
option.WithPool
Uses a pre-configured Pool for regional routing.
Sub-clients
The client returned by client.NewClient exposes Fern-generated sub-clients for direct REST API access. You typically do not need these when using the agentkit layer.
| Field | Type | Description |
|---|---|---|
c.Agents | *agents.Client | Agent lifecycle (start, stop, speak, interrupt, update, get, getHistory) |
c.Telephony | *telephony.Client | Telephony operations (call, hangup) |
c.PhoneNumbers | *phonenumbers.Client | Phone number management |
All sub-client methods take context.Context as their first argument. See the generated reference for full method signatures.
Environments
The root Agora package exposes the default API endpoint.
Pointer helpers
The root Agora package provides helper functions for creating pointers to literal values. These are required for optional fields in Fern-generated request structs, which use pointer types to distinguish between "not set" and "set to zero value".
| Function | Signature | Example |
|---|---|---|
Agora.Bool | func(bool) *bool | Enable: Agora.Bool(true) |
Agora.Int | func(int) *int | IdleTimeout: Agora.Int(120) |
Agora.String | func(string) *string | APIKey: Agora.String("<key>") |
Agora.Float64 | func(float64) *float64 | Threshold: Agora.Float64(0.5) |
Agora.Float32 | func(float32) *float32 | — |
Agora.Int8 / Int16 / Int32 / Int64 | func(intN) *intN | — |
Agora.Uint / Uint8 / Uint16 / Uint32 / Uint64 | func(uintN) *uintN | — |
Agora.UUID | func(uuid.UUID) *uuid.UUID | — |
Agora.Time | func(time.Time) *time.Time | — |
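The sketch below shows why pointer fields matter, using a local generic helper in place of the SDK functions. The `request` struct, its JSON tags, and the `ptr` helper are hypothetical stand-ins; only the nil versus zero-value behavior is the point.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ptr mirrors the idea behind helpers such as Agora.Bool and Agora.Int:
// it returns a pointer to a copy of v. It is a local stand-in, not the SDK.
func ptr[T any](v T) *T { return &v }

// request is a hypothetical struct with optional pointer fields.
type request struct {
	Enable      *bool `json:"enable,omitempty"`
	IdleTimeout *int  `json:"idle_timeout,omitempty"`
}

// encode marshals r to JSON and returns an empty string on failure.
func encode(r request) string {
	b, err := json.Marshal(r)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	fmt.Println(encode(request{}))                    // nil fields are omitted
	fmt.Println(encode(request{Enable: ptr(false)}))  // explicit false is sent
	fmt.Println(encode(request{IdleTimeout: ptr(0)})) // explicit zero is sent
}
```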
agentkit.NewAgent
Agent is an immutable configuration object. Each vendor chaining method returns a new *Agent — the original is never modified. Define one agent at startup and create sessions from it for each user conversation.
Constructor
Pass AgentOption functions to configure the agent's name, instructions, greeting, and other properties.
AgentOption functions
AgentOption functions are passed to NewAgent. Each one returns an AgentOption, which has the signature func(*Agent).
| Function | Parameter type | Description |
|---|---|---|
WithName(name) | string | Agent name identifier |
WithInstructions(instructions) | string | LLM system prompt |
WithGreeting(greeting) | string | First message the agent speaks |
WithFailureMessage(msg) | string | Message spoken when the LLM fails |
WithMaxHistory(n) | int | Maximum conversation turns to retain |
WithTurnDetectionConfig(td) | *TurnDetectionConfig | Cascading-flow turn detection configuration. Use Config.StartOfSpeech and Config.EndOfSpeech for SOS/EOS detection. Use interruption config for interruption behavior and MLLM vendor TurnDetection for MLLM turn detection |
WithInterruptionConfig(interruption) | *InterruptionConfig | Unified interruption control using the top-level interruption object |
WithSalConfig(sal) | *SalConfig | Speech analytics configuration |
WithAdvancedFeatures(af) | *AdvancedFeatures | Advanced feature flags, for example EnableMllm, EnableAivad |
WithTools(enabled) | bool | Enable or disable MCP tool invocation |
WithParameters(params) | *SessionParams | Additional session parameters |
WithGeofence(gf) | *GeofenceConfig | Regional access restriction |
WithLabels(labels) | map[string]string | Custom key-value labels returned in notification callbacks |
WithRtc(rtc) | *RtcConfig | RTC media encryption |
WithFillerWords(fw) | *FillerWordsConfig | Filler words played while waiting for the LLM response |
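The option functions in the table follow the functional options pattern. The sketch below reproduces that pattern with local stand-in types so it runs on its own; the lower-case names (`agent`, `agentOption`, `newAgent`, and so on) are hypothetical and do not come from the SDK.

```go
package main

import "fmt"

// agent is a local stand-in for the SDK's Agent type.
type agent struct {
	name       string
	greeting   string
	maxHistory *int
}

// agentOption has the func(*Agent) shape described above.
type agentOption func(*agent)

func withName(name string) agentOption  { return func(a *agent) { a.name = name } }
func withGreeting(g string) agentOption { return func(a *agent) { a.greeting = g } }
func withMaxHistory(n int) agentOption  { return func(a *agent) { a.maxHistory = &n } }

// newAgent applies each option in order to a fresh agent.
func newAgent(opts ...agentOption) *agent {
	a := &agent{}
	for _, opt := range opts {
		opt(a)
	}
	return a
}

func main() {
	a := newAgent(withName("support"), withGreeting("Hello!"), withMaxHistory(20))
	fmt.Println(a.name, a.greeting, *a.maxHistory)
}
```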
Vendor chaining methods
Vendor methods are called on the *Agent returned by NewAgent. Each method returns a new *Agent — the original is never modified.
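One way to get this "returns a new value, never mutates" behavior is to copy the receiver before changing it. The sketch below illustrates that idea with a local stand-in type; it is an assumption about the general technique, not a description of the SDK's internals.

```go
package main

import "fmt"

// agent is a local stand-in for the SDK's Agent type.
type agent struct {
	llm map[string]interface{}
	tts map[string]interface{}
}

// clone returns a shallow copy so the receiver is never mutated.
func (a *agent) clone() *agent {
	c := *a
	return &c
}

// withLlm returns a new agent with the LLM config set.
func (a *agent) withLlm(cfg map[string]interface{}) *agent {
	c := a.clone()
	c.llm = cfg
	return c
}

// withTts returns a new agent with the TTS config set.
func (a *agent) withTts(cfg map[string]interface{}) *agent {
	c := a.clone()
	c.tts = cfg
	return c
}

func main() {
	base := &agent{}
	full := base.
		withLlm(map[string]interface{}{"model": "m"}).
		withTts(map[string]interface{}{"voice": "v"})
	// base stays empty; full carries both configs.
	fmt.Println(base.llm == nil, full.llm != nil, full.tts != nil)
}
```

Because each call returns a fresh value, a base agent can be shared and specialized per conversation without extra locking around the configuration.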
WithLlm(vendor)
Sets the LLM vendor. Pass the value returned by NewOpenAI, NewAzureOpenAI, NewAnthropic, or NewGemini.
WithTts(vendor)
Sets the TTS vendor. Captures the vendor's sample rate for avatar validation.
WithStt(vendor)
Sets the STT vendor. Pass the value returned by any STT vendor constructor.
WithMllm(vendor)
Sets the MLLM vendor for multimodal mode. Pass NewOpenAIRealtime, NewGeminiLive, or NewVertexAI. Requires AdvancedFeatures.EnableMllm = true.
WithAvatar(vendor)
Sets the avatar vendor. Panics if TTS is already configured with a sample rate that does not match the avatar's required rate.
WithTurnDetection(config)
Configures cascading-flow turn detection. Use Config.StartOfSpeech and Config.EndOfSpeech for SOS/EOS detection. Use interruption config for interruption behavior and MLLM vendor TurnDetection for MLLM turn detection.
Other builder methods
The following methods follow the same pattern — each returns a new *Agent with the updated configuration.
| Method | Parameter type | Description |
|---|---|---|
WithInstructions(instructions) | string | Override the LLM system prompt |
WithGreeting(greeting) | string | Override the greeting message |
WithName(name) | string | Override the agent name |
WithSal(sal) | *SalConfig | Set SAL configuration |
WithAdvancedFeatures(af) | *AdvancedFeatures | Set advanced features |
WithTools(enabled) | bool | Enable or disable MCP tool invocation |
WithParameters(params) | *SessionParams | Set session parameters |
WithFailureMessage(msg) | string | Set the failure message |
WithMaxHistory(n) | int | Set the maximum conversation history length |
WithGeofence(gf) | *GeofenceConfig | Set geofence configuration |
WithLabels(labels) | map[string]string | Set custom labels |
WithRtc(rtc) | *RtcConfig | Set RTC configuration |
WithFillerWords(fw) | *FillerWordsConfig | Set filler words configuration |
ToProperties()
Converts the agent configuration to a *Agora.StartAgentsRequestProperties for direct use with the low-level client. Called internally by AgentSession.Start(). Use this directly when building custom request bodies.
Returns an error if:
- Neither Token nor AppID + AppCertificate is provided
- In cascading mode: LLM or TTS is not configured
- Config marshaling fails
ToPropertiesOptions
| Field | Type | Required | Description |
|---|---|---|---|
Channel | string | Yes | Agora channel name |
AgentUID | string | Yes | Agent's UID in the channel |
RemoteUIDs | []string | Yes | Remote participant UIDs |
Token | string | Conditional | Pre-generated RTC+RTM token. Skips generation if set |
AppID | string | Conditional | Agora App ID. Required if Token is not set |
AppCertificate | string | Conditional | Agora App Certificate. Required if Token is not set |
ExpiresIn | int | No | Token lifetime in seconds. Default: 86400. Valid range: 1–86400 |
IdleTimeout | *int | No | Session idle timeout in seconds |
EnableStringUID | *bool | No | Enable string UID mode |
Getters
Read-only methods available on any *Agent instance.
| Method | Return type | Description |
|---|---|---|
Name() | string | Agent name |
Instructions() | string | LLM system prompt |
Greeting() | string | Greeting message |
FailureMessage() | string | Message spoken when LLM fails |
MaxHistory() | *int | Maximum conversation history length |
LlmConfig() | map[string]interface{} | LLM configuration |
TtsConfig() | map[string]interface{} | TTS configuration |
SttConfig() | map[string]interface{} | STT configuration |
MllmConfig() | map[string]interface{} | MLLM configuration |
TtsSampleRate() | *vendors.SampleRate | TTS sample rate |
AvatarRequiredSampleRate() | *vendors.SampleRate | Avatar required sample rate |
Avatar() | map[string]interface{} | Avatar configuration |
TurnDetection() | *TurnDetectionConfig | Turn detection configuration |
Sal() | *SalConfig | SAL configuration |
AdvancedFeatures() | *AdvancedFeatures | Advanced features |
Parameters() | *SessionParams | Session parameters |
Geofence() | *GeofenceConfig | Geofence configuration |
Labels() | map[string]string | Custom labels |
Rtc() | *RtcConfig | RTC configuration |
FillerWords() | *FillerWordsConfig | Filler words configuration |
agentkit.NewAgentSession
AgentSession manages the full lifecycle of a running agent. Create a session with NewAgentSession and call Start() to join the agent to the channel.
Constructor
If Name is empty, it defaults to agent-<unix_timestamp>. The session starts in StatusIdle.
AgentSessionOptions
| Field | Type | Required | Description |
|---|---|---|---|
Client | *agents.Client | Yes | Fern-generated agents sub-client. Pass c.Agents from client.NewClient |
Agent | *Agent | Yes | Agent configuration built with NewAgent |
AppID | string | Yes | Agora App ID |
AppCertificate | string | Conditional | Required if Token is not set |
Name | string | No | Session name. Default: agent-<unix_timestamp> |
Channel | string | Yes | Agora channel name |
Token | string | Conditional | Pre-generated RTC+RTM token. Skips auto-generation if set |
AgentUID | string | Yes | Agent's UID in the channel |
RemoteUIDs | []string | Yes | Remote participant UIDs |
IdleTimeout | *int | No | Idle timeout in seconds |
EnableStringUID | *bool | No | Enable string UID mode |
ExpiresIn | int | No | Auto-generated token lifetime in seconds |
UseAppCredentialsForREST | bool | No | Generate ConvoAI REST auth headers per request |
Preset | []string | No | Preset IDs to send on session start |
PipelineID | string | No | Published pipeline ID to send on session start |
Debug | bool | No | Enable debug logging of the start request |
Warn | func(string) | No | Custom warning sink; defaults to logger |
State machine
A session progresses through the following states:
| Transition | Trigger |
|---|---|
idle → starting | Start() called |
starting → running | API responds with agent ID |
starting → error | API request fails |
running → stopping | Stop() called |
stopping → stopped | API confirms agent stopped |
stopping → error | Stop request fails and agent was not already stopped |
running → error | Unrecoverable error during interaction |
Start() can also be called from stopped or error state to restart the session.
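The transitions above, including the restart paths, can be written down as a small lookup table. This is a sketch with local types; the constant names and string values are assumptions and may differ from the SDK's SessionStatus constants.

```go
package main

import "fmt"

// status is a local stand-in for the SDK's SessionStatus type.
type status string

const (
	idle     status = "idle"
	starting status = "starting"
	running  status = "running"
	stopping status = "stopping"
	stopped  status = "stopped"
	failed   status = "error"
)

// allowed lists the transitions from the table above, plus the restart
// paths from stopped and error.
var allowed = map[status][]status{
	idle:     {starting},
	starting: {running, failed},
	running:  {stopping, failed},
	stopping: {stopped, failed},
	stopped:  {starting},
	failed:   {starting},
}

// canTransition reports whether moving from one state to another is listed.
func canTransition(from, to status) bool {
	for _, next := range allowed[from] {
		if next == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(canTransition(idle, starting))   // Start() from idle
	fmt.Println(canTransition(idle, running))    // not a direct transition
	fmt.Println(canTransition(failed, starting)) // restart after an error
}
```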
Methods
All methods that call the API take context.Context as the first argument; On() and Off() take only the event name and handler. Register event handlers before calling Start() to avoid missing the started event.
Start(ctx)
Starts the agent session. Validates avatar/TTS configuration, generates a token if not provided, and calls the Agora API. Returns the agent ID.
- Valid from: idle, stopped, error
- Transitions to: starting → running on success, error on failure
- Emits: "started" on success, "error" on failure
- Validates avatar config and avatar/TTS sample rate match before making the API call
Stop(ctx)
Stops the running agent and removes it from the channel.
- Valid from: running
- Transitions to: stopping → stopped on success, error on failure
- Emits: "stopped" on success, "error" on failure
Say(ctx, text, priority, interruptable)
Instructs the agent to speak the given text.
- Valid from: running
- Pass nil for priority or interruptable to use defaults
| Parameter | Type | Description |
|---|---|---|
text | string | The text for the agent to speak |
priority | *Agora.SpeakAgentsRequestPriority | Message priority. See SpeakPriority. Pass nil for default |
interruptable | *bool | Whether this message can be interrupted. Pass nil for default |
Interrupt(ctx)
Interrupts the agent's current speech.
- Valid from: running
Update(ctx, properties)
Updates the agent's properties mid-session without restarting. Accepts a typed properties struct in REST API format.
- Valid from: running
GetHistory(ctx)
Retrieves the conversation history. Requires a valid agent ID — Start() must have been called successfully.
GetTurns(ctx)
Retrieves turn-by-turn analytics for the session. Requires a valid agent ID — Start() must have been called successfully.
GetInfo(ctx)
Gets the current agent status from the API. Requires a valid agent ID.
On(event, handler)
Registers an event handler. Multiple handlers can be registered for the same event. Handlers run synchronously; panics in handlers are recovered and reported through the session warning sink.
Off(event, handler)
Unregisters a previously registered event handler.
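The handler behavior described for On() (several handlers per event, run one after another, with panics recovered and sent to a warning sink) can be sketched as follows. The `emitter` type is a local stand-in, and the choice to keep running later handlers after one panics is an assumption of this sketch.

```go
package main

import "fmt"

// handler is a callback that receives an untyped event payload.
type handler func(data interface{})

// emitter is a local stand-in for the session's event registry.
type emitter struct {
	handlers map[string][]handler
	warn     func(string)
}

func newEmitter(warn func(string)) *emitter {
	return &emitter{handlers: map[string][]handler{}, warn: warn}
}

// on appends a handler; several handlers may share one event name.
func (e *emitter) on(event string, h handler) {
	e.handlers[event] = append(e.handlers[event], h)
}

// emit runs handlers synchronously in registration order. A panic in one
// handler is recovered and reported through warn.
func (e *emitter) emit(event string, data interface{}) {
	for _, h := range e.handlers[event] {
		func() {
			defer func() {
				if r := recover(); r != nil {
					e.warn(fmt.Sprintf("handler for %q panicked: %v", event, r))
				}
			}()
			h(data)
		}()
	}
}

func main() {
	e := newEmitter(func(msg string) { fmt.Println("warn:", msg) })
	e.on("started", func(interface{}) { panic("boom") })
	e.on("started", func(data interface{}) {
		// The comma-ok form avoids a panic if the payload type is unexpected.
		if info, ok := data.(map[string]string); ok {
			fmt.Println("agent:", info["agent_id"])
		}
	})
	e.emit("started", map[string]string{"agent_id": "a1"})
}
```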
Events
| Event | Data type | Description |
|---|---|---|
"started" | map[string]string{"agent_id": "..."} | Agent successfully joined the channel |
"stopped" | map[string]string{"agent_id": "..."} | Agent left the channel |
"error" | error | An unrecoverable error occurred |
Getters
Read-only methods available on any *AgentSession instance.
| Method | Return type | Description |
|---|---|---|
ID() | string | Agent ID. Empty string before Start() succeeds |
Status() | SessionStatus | Current session state |
Agent() | *Agent | The agent configuration |
AppID() | string | The Agora App ID |
Raw() | *agents.Client | Direct access to the Fern-generated agents client for advanced operations |
Using session.Raw()
Use session.Raw() to call REST API endpoints not yet exposed by the agentkit layer.
Thread safety
AgentSession is safe for concurrent use across goroutines. All state mutations are protected by sync.RWMutex. Multiple goroutines can safely call Status(), ID(), and other getters while another goroutine calls Start() or Stop().
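As a rough picture of the locking pattern named above, the sketch below guards a status field with a sync.RWMutex so that many readers can run alongside a single writer. The `session` type and its methods are local stand-ins, not the SDK's implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// session is a local stand-in showing one way to guard state with a RWMutex.
type session struct {
	mu     sync.RWMutex
	status string
}

// Status takes a read lock, so concurrent readers do not block each other.
func (s *session) Status() string {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.status
}

// setStatus takes the write lock, excluding readers while the value changes.
func (s *session) setStatus(v string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.status = v
}

func main() {
	s := &session{status: "idle"}
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = s.Status() // concurrent readers
		}()
	}
	s.setStatus("running") // single writer
	wg.Wait()
	fmt.Println(s.Status())
}
```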
Vendors
All vendor constructors are in the agentkit/vendors package. Constructors panic if required fields are empty — this is Go-idiomatic behavior for programmer configuration errors.
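When vendor options come from user input or a config file rather than from code, a panic may be inconvenient. The sketch below shows one possible way to turn a constructor panic into an error with recover. `newTTS`, `safeNewTTS`, and `ttsOptions` are hypothetical stand-ins that only mimic the documented panic behavior.

```go
package main

import "fmt"

// ttsOptions is a hypothetical options struct with two required fields.
type ttsOptions struct{ Key, VoiceID string }

// newTTS is a hypothetical constructor that panics on missing fields,
// mimicking the behavior described above.
func newTTS(o ttsOptions) ttsOptions {
	if o.Key == "" || o.VoiceID == "" {
		panic("tts: Key and VoiceID are required")
	}
	return o
}

// safeNewTTS converts a constructor panic into an error.
func safeNewTTS(o ttsOptions) (v ttsOptions, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("invalid TTS options: %v", r)
		}
	}()
	return newTTS(o), nil
}

func main() {
	_, err := safeNewTTS(ttsOptions{Key: "k"})
	fmt.Println(err)
	v, err := safeNewTTS(ttsOptions{Key: "k", VoiceID: "v"})
	fmt.Println(v.VoiceID, err)
}
```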
Interfaces
LLM vendors
Use with WithLlm().
NewOpenAI
Panics if APIKey is empty unless Model is one of the supported preset-backed OpenAI models (gpt-4o-mini, gpt-4.1-mini, gpt-5-nano, gpt-5-mini) and BaseURL / Vendor are not set.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
APIKey | string | No | — | OpenAI API key. Optional for supported preset-backed OpenAI models |
Model | string | No | "gpt-4o-mini" | Model identifier |
BaseURL | string | No | "https://api.openai.com/v1/chat/completions" | API endpoint |
Temperature | *float64 | No | — | Sampling temperature |
TopP | *float64 | No | — | Nucleus sampling |
MaxTokens | *int | No | — | Maximum tokens in response |
SystemMessages | []map[string]interface{} | No | — | System messages |
GreetingMessage | string | No | — | Agent greeting message |
FailureMessage | string | No | — | Message spoken when LLM fails |
InputModalities | []string | No | ["text"] | Input modalities |
OutputModalities | []string | No | — | Output modalities |
Params | map[string]interface{} | No | — | Additional model parameters |
Headers | map[string]string | No | — | Custom HTTP headers forwarded to the LLM provider |
GreetingConfigs | map[string]interface{} | No | — | Greeting playback configuration |
TemplateVariables | map[string]string | No | — | Template variables for messages |
NewAzureOpenAI
Panics if APIKey, Endpoint, or DeploymentName is empty.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
APIKey | string | Yes | — | Azure OpenAI API key |
Endpoint | string | Yes | — | Azure endpoint URL |
DeploymentName | string | Yes | — | Azure deployment name |
APIVersion | string | No | "2024-08-01-preview" | API version |
Temperature | *float64 | No | — | Sampling temperature |
TopP | *float64 | No | — | Nucleus sampling |
MaxTokens | *int | No | — | Maximum tokens |
SystemMessages | []map[string]interface{} | No | — | System messages |
GreetingMessage | string | No | — | Agent greeting message |
FailureMessage | string | No | — | Message spoken when LLM fails |
InputModalities | []string | No | ["text"] | Input modalities |
OutputModalities | []string | No | — | Output modalities |
Params | map[string]interface{} | No | — | Additional model parameters |
Headers | map[string]string | No | — | Custom HTTP headers forwarded to the LLM provider |
GreetingConfigs | map[string]interface{} | No | — | Greeting playback configuration |
TemplateVariables | map[string]string | No | — | Template variables for messages |
NewAnthropic
Panics if APIKey is empty.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
APIKey | string | Yes | — | Anthropic API key |
Model | string | No | "claude-3-5-sonnet-20241022" | Model identifier |
MaxTokens | *int | No | — | Maximum tokens |
Temperature | *float64 | No | — | Sampling temperature |
TopP | *float64 | No | — | Nucleus sampling |
SystemMessages | []map[string]interface{} | No | — | System messages |
GreetingMessage | string | No | — | Agent greeting message |
FailureMessage | string | No | — | Message spoken when LLM fails |
InputModalities | []string | No | ["text"] | Input modalities |
OutputModalities | []string | No | — | Output modalities |
Params | map[string]interface{} | No | — | Additional model parameters |
Headers | map[string]string | No | — | Custom HTTP headers forwarded to the LLM provider |
GreetingConfigs | map[string]interface{} | No | — | Greeting playback configuration |
TemplateVariables | map[string]string | No | — | Template variables for messages |
NewGemini
Panics if APIKey is empty.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
APIKey | string | Yes | — | Google AI API key |
Model | string | No | "gemini-2.0-flash-exp" | Model identifier |
Temperature | *float64 | No | — | Sampling temperature |
TopP | *float64 | No | — | Nucleus sampling |
TopK | *int | No | — | Top-K sampling |
MaxOutputTokens | *int | No | — | Maximum output tokens |
SystemMessages | []map[string]interface{} | No | — | System messages |
GreetingMessage | string | No | — | Agent greeting message |
FailureMessage | string | No | — | Message spoken when LLM fails |
InputModalities | []string | No | ["text"] | Input modalities |
OutputModalities | []string | No | — | Output modalities |
Params | map[string]interface{} | No | — | Additional model parameters |
Headers | map[string]string | No | — | Custom HTTP headers forwarded to the LLM provider |
GreetingConfigs | map[string]interface{} | No | — | Greeting playback configuration |
TemplateVariables | map[string]string | No | — | Template variables for messages |
TTS vendors
Use with WithTts(). The SampleRate field determines avatar compatibility — see WithAvatar(). Use SampleRate constants for the SampleRate field.
NewElevenLabsTTS
Panics if Key, ModelID, or VoiceID is empty.
| Field | Type | Required | Description |
|---|---|---|---|
Key | string | Yes | ElevenLabs API key |
ModelID | string | Yes | Model identifier, for example "eleven_flash_v2_5" |
VoiceID | string | Yes | Voice identifier |
BaseURL | string | No | Custom API endpoint |
SampleRate | *SampleRate | No | Output sample rate |
SkipPatterns | []int | No | Patterns to skip in TTS output |
NewMicrosoftTTS
Panics if Key, Region, or VoiceName is empty.
| Field | Type | Required | Description |
|---|---|---|---|
Key | string | Yes | Azure Speech Services key |
Region | string | Yes | Azure region, for example "eastus" |
VoiceName | string | Yes | Voice name, for example "en-US-JennyNeural" |
SampleRate | *SampleRate | No | Output sample rate |
SkipPatterns | []int | No | Patterns to skip |
NewOpenAITTS
Fixed sample rate: SampleRate24kHz.
Panics if Voice is empty. APIKey is optional for the preset-backed tts-1 path.
| Field | Type | Required | Description |
|---|---|---|---|
APIKey | string | No | OpenAI API key. Optional for the preset-backed tts-1 path |
Voice | string | Yes | Voice name: "alloy", "echo", "fable", "onyx", "nova", or "shimmer" |
Model | string | No | Model identifier, for example "tts-1" or "tts-1-hd" |
ResponseFormat | string | No | Audio format, for example "pcm" |
Speed | *float64 | No | Speech speed multiplier |
SkipPatterns | []int | No | Patterns to skip |
NewCartesiaTTS
Panics if APIKey or VoiceID is empty.
| Field | Type | Required | Description |
|---|---|---|---|
APIKey | string | Yes | Cartesia API key |
VoiceID | string | Yes | Voice identifier (serialized as {"mode":"id","id":"..."}) |
ModelID | string | No | Model identifier |
SampleRate | *SampleRate | No | Output sample rate |
SkipPatterns | []int | No | Patterns to skip |
NewGoogleTTS
Panics if Key or VoiceName is empty.
| Field | Type | Required | Description |
|---|---|---|---|
Key | string | Yes | Google Cloud API key |
VoiceName | string | Yes | Voice name |
LanguageCode | string | No | Language code |
SkipPatterns | []int | No | Patterns to skip |
NewAmazonTTS
Panics if AccessKey, SecretKey, Region, or VoiceID is empty.
| Field | Type | Required | Description |
|---|---|---|---|
AccessKey | string | Yes | AWS access key |
SecretKey | string | Yes | AWS secret key |
Region | string | Yes | AWS region |
VoiceID | string | Yes | Amazon Polly voice ID |
SkipPatterns | []int | No | Patterns to skip |
NewDeepgramTTS
Panics if APIKey or Model is empty.
| Field | Type | Required | Description |
|---|---|---|---|
APIKey | string | Yes | Deepgram API key |
Model | string | Yes | Deepgram TTS model, for example "aura-2-thalia-en" |
BaseURL | string | No | WebSocket endpoint. Defaults server-side to wss://api.deepgram.com/v1/speak |
SampleRate | *SampleRate | No | Output sample rate |
Params | map[string]interface{} | No | Additional Deepgram TTS parameters |
SkipPatterns | []int | No | Patterns to skip |
NewHumeAITTS
Panics if Key is empty.
| Field | Type | Required | Description |
|---|---|---|---|
Key | string | Yes | Hume AI API key |
ConfigID | string | No | Configuration ID |
SkipPatterns | []int | No | Patterns to skip |
NewRimeTTS
Panics if Key or Speaker is empty.
| Field | Type | Required | Description |
|---|---|---|---|
Key | string | Yes | Rime API key |
Speaker | string | Yes | Speaker identifier |
ModelID | string | No | Model identifier |
Lang | string | No | Language code |
SamplingRate | *int | No | Sampling rate in Hz (serialized as samplingRate) |
SpeedAlpha | *float64 | No | Speed multiplier (serialized as speedAlpha) |
SkipPatterns | []int | No | Patterns to skip |
NewFishAudioTTS
Panics if Key or ReferenceID is empty.
| Field | Type | Required | Description |
|---|---|---|---|
Key | string | Yes | Fish Audio API key |
ReferenceID | string | Yes | Reference audio ID |
SkipPatterns | []int | No | Patterns to skip |
NewMiniMaxTTS
Panics if Model is empty. Key is optional for supported preset-backed MiniMax models (speech-2.6-turbo, speech_2_6_turbo, speech-2.8-turbo, speech_2_8_turbo). BYOK still requires Key and GroupID, and preset-backed mode must not set GroupID, VoiceID, or URL.
| Field | Type | Required | Description |
|---|---|---|---|
Key | string | No | MiniMax API key. Optional for supported preset-backed MiniMax models |
GroupID | string | No | MiniMax group ID. Required for BYOK |
Model | string | Yes | Model name, for example "speech-02-turbo" |
VoiceID | string | No | Voice style identifier. BYOK only |
URL | string | No | WebSocket endpoint. BYOK only |
SkipPatterns | []int | No | Patterns to skip |
NewMurfTTS
Panics if Key or VoiceID is empty.
| Field | Type | Required | Description |
|---|---|---|---|
Key | string | Yes | Murf API key |
VoiceID | string | Yes | Voice ID, for example "Ariana" or "Natalie" |
Style | string | No | Voice style, for example "Conversational" |
SkipPatterns | []int | No | Patterns to skip |
NewSarvamTTS
Panics if Key, Speaker, or TargetLanguageCode is empty.
| Field | Type | Required | Description |
|---|---|---|---|
Key | string | Yes | Sarvam API key |
Speaker | string | Yes | Speaker name |
TargetLanguageCode | string | Yes | Target language code |
SkipPatterns | []int | No | Patterns to skip |
STT vendors
Use with WithStt().
NewDeepgramSTT
Panics if APIKey is empty.
| Field | Type | Required | Description |
|---|---|---|---|
APIKey | string | Yes | Deepgram API key |
Model | string | No | Model, for example "nova-2" |
Language | string | No | Language code, for example "en-US" |
SmartFormat | *bool | No | Enable smart formatting |
Punctuation | *bool | No | Enable punctuation |
AdditionalParams | map[string]interface{} | No | Additional vendor parameters |
NewSpeechmaticsSTT
Panics if APIKey is empty.
| Field | Type | Required | Description |
|---|---|---|---|
APIKey | string | Yes | Speechmatics API key |
Language | string | No | Language code |
Model | string | No | Model identifier |
NewMicrosoftSTT
Panics if Key or Region is empty.
| Field | Type | Required | Description |
|---|---|---|---|
Key | string | Yes | Azure Speech Services key |
Region | string | Yes | Azure region |
Language | string | No | Language code |
NewOpenAISTT
Panics if APIKey is empty.
| Field | Type | Required | Description |
|---|---|---|---|
APIKey | string | Yes | OpenAI API key |
Model | string | No | Model identifier |
Language | string | No | Language code |
NewGoogleSTT
Panics if Key is empty.
| Field | Type | Required | Description |
|---|---|---|---|
Key | string | Yes | Google Cloud API key |
Language | string | No | Language code |
Model | string | No | Model identifier |
NewAmazonSTT
Panics if AccessKey, SecretKey, or Region is empty.
| Field | Type | Required | Description |
|---|---|---|---|
AccessKey | string | Yes | AWS access key |
SecretKey | string | Yes | AWS secret key |
Region | string | Yes | AWS region |
Language | string | No | Language code |
NewAssemblyAISTT
Panics if APIKey is empty.
| Field | Type | Required | Description |
|---|---|---|---|
APIKey | string | Yes | AssemblyAI API key |
NewAresSTT
Ares is an Agora built-in STT service — no external API key required.
| Field | Type | Required | Description |
|---|---|---|---|
Language | string | No | Language code |
AdditionalParams | map[string]interface{} | No | Additional vendor parameters |
NewSarvamSTT
Panics if APIKey is empty.
| Field | Type | Required | Description |
|---|---|---|---|
APIKey | string | Yes | Sarvam API key |
Language | string | No | Language code |
Model | string | No | Model identifier |
MLLM vendors
Use with WithMllm() for multimodal end-to-end audio processing without separate STT or TTS steps. Requires AdvancedFeatures.EnableMllm = true via WithAdvancedFeatures() in NewAgent.
NewOpenAIRealtime
Panics if APIKey is empty.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
APIKey | string | Yes | — | OpenAI API key |
Model | string | No | "gpt-4o-realtime-preview" | Model identifier |
URL | string | No | — | Custom WebSocket URL |
GreetingMessage | string | No | — | Agent greeting message |
FailureMessage | string | No | — | Message played when the model call fails |
MaxHistory | *int | No | — | Maximum conversation history length |
PredefinedTools | []string | No | — | Predefined tools, for example ["_publish_message"] |
InputModalities | []string | No | — | Input modalities |
OutputModalities | []string | No | — | Output modalities |
Messages | []map[string]interface{} | No | — | Conversation messages for short-term memory |
Params | map[string]interface{} | No | — | Additional parameters |
TurnDetection | *Agora.StartAgentsRequestPropertiesMllmTurnDetection | No | — | MLLM turn detection configuration; overrides top-level turn detection |
NewGeminiLive
Panics if APIKey or Model is empty.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
APIKey | string | Yes | — | Google AI API key |
Model | string | Yes | — | Gemini Live model identifier |
URL | string | No | — | Custom WebSocket URL |
Instructions | string | No | — | System instruction |
Voice | string | No | — | Voice name |
GreetingMessage | string | No | — | Agent greeting message |
FailureMessage | string | No | — | Message played when the model call fails |
MaxHistory | *int | No | — | Maximum conversation history length |
PredefinedTools | []string | No | — | Predefined tools, for example ["_publish_message"] |
InputModalities | []string | No | — | Input modalities |
OutputModalities | []string | No | — | Output modalities |
Messages | []map[string]interface{} | No | — | Conversation messages for short-term memory |
AdditionalParams | map[string]interface{} | No | — | Additional parameters |
TurnDetection | *Agora.StartAgentsRequestPropertiesMllmTurnDetection | No | — | MLLM turn detection configuration; overrides top-level turn detection |
NewVertexAI
Panics if ProjectID or ADCredentialsString is empty.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
ProjectID | string | Yes | — | Google Cloud project ID |
ADCredentialsString | string | Yes | — | Application Default Credentials JSON string |
Location | string | No | "us-central1" | Google Cloud region |
Model | string | No | "gemini-2.0-flash-exp" | Model identifier |
URL | string | No | — | Custom WebSocket URL |
Voice | string | No | — | Voice name |
Instructions | string | No | — | System instruction |
GreetingMessage | string | No | — | Agent greeting message |
FailureMessage | string | No | — | Message played when the model call fails |
MaxHistory | *int | No | — | Maximum conversation history length |
PredefinedTools | []string | No | — | Predefined tools, for example ["_publish_message"] |
InputModalities | []string | No | — | Input modalities |
OutputModalities | []string | No | — | Output modalities |
Messages | []map[string]interface{} | No | — | Conversation messages for short-term memory |
AdditionalParams | map[string]interface{} | No | — | Additional parameters |
TurnDetection | *Agora.StartAgentsRequestPropertiesMllmTurnDetection | No | — | MLLM turn detection configuration; overrides top-level turn detection |
Avatar vendors
Use with WithAvatar(). Each avatar vendor requires a specific TTS sample rate. WithAvatar() panics if TTS is already configured with a sample rate that does not match.
NewHeyGenAvatar
Requires TTS at 24,000 Hz (SampleRate24kHz).
Panics if APIKey or AgoraUID is empty, or if Quality is not "low", "medium", or "high".
| Field | Type | Required | Description |
|---|---|---|---|
APIKey | string | Yes | HeyGen API key |
Quality | string | Yes | Video quality: "low", "medium", or "high" |
AgoraUID | string | Yes | UID for the avatar's video stream |
AgoraToken | string | No | RTC token for avatar authentication |
AvatarID | string | No | HeyGen avatar ID |
Enable | *bool | No | Enable or disable the avatar. Default: true |
DisableIdleTimeout | *bool | No | Disable the idle timeout |
ActivityIdleTimeout | *int | No | Idle timeout in seconds. Default: 120 |
NewAkoolAvatar
Requires TTS at 16,000 Hz (SampleRate16kHz).
Panics if APIKey is empty.
| Field | Type | Required | Description |
|---|---|---|---|
APIKey | string | Yes | Akool API key |
AvatarID | string | No | Avatar ID |
Enable | *bool | No | Enable or disable the avatar |
AdditionalParams | map[string]interface{} | No | Additional vendor parameters |
NewLiveAvatarAvatar
Requires TTS at 24,000 Hz (SampleRate24kHz).
Panics if APIKey or AgoraUID is empty, or if Quality is not "low", "medium", or "high".
NewAnamAvatar
Panics if APIKey is empty.
Token utilities
Helper functions for generating and managing tokens.
GenerateConvoAIToken()
Generates a combined RTC+RTM Conversational AI token. This is the same token the SDK generates automatically when AppID and AppCertificate are provided on AgentSessionOptions.
| Field | Type | Required | Description |
|---|---|---|---|
AppID | string | Yes | Agora App ID |
AppCertificate | string | Yes | Agora App Certificate |
ChannelName | string | Yes | The channel the token grants access to |
Account | string | Yes | The UID this token is issued for, as a string |
TokenExpire | int | No | Token lifetime in seconds. Default: 86400. Valid range: 1–86400 |
GenerateRtcToken()
Generates an RTC-only token. Use GenerateConvoAIToken() instead for most Conversational AI use cases.
| Field | Type | Required | Description |
|---|---|---|---|
AppID | string | Yes | Agora App ID |
AppCertificate | string | Yes | Agora App Certificate |
Channel | string | Yes | Channel name |
UID | uint32 | Yes | User ID. Use 0 for any user |
Role | int | No | RTC role: RolePublisher (1) or RoleSubscriber (2). Default: RolePublisher |
ExpirySeconds | int | No | Token lifetime in seconds. Default: DefaultExpirySeconds (3600) |
ExpiresInHours() / ExpiresInMinutes()
Helper functions for specifying token lifetimes. Use with AgentSessionOptions.ExpiresIn or token generation functions. Returns an error if the value is ≤ 0; warns and caps at 86400 if the result exceeds 24 hours.
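The documented rules (reject values ≤ 0, cap at 86400 seconds) can be expressed in a few lines. The function below is a hypothetical re-implementation for illustration; its name, return signature, and warning text are assumptions and may not match the SDK.

```go
package main

import (
	"errors"
	"fmt"
)

const maxExpirySeconds = 86400 // 24 hours, the documented upper bound

// expiresInHours converts hours to seconds, rejecting values <= 0 and
// capping the result at 24 hours. It is a sketch, not the SDK function.
func expiresInHours(hours int) (int, error) {
	if hours <= 0 {
		return 0, errors.New("hours must be greater than 0")
	}
	if hours > 24 {
		fmt.Println("warning: capping token lifetime at 86400 seconds")
		return maxExpirySeconds, nil
	}
	return hours * 3600, nil
}

func main() {
	fmt.Println(expiresInHours(2))
	fmt.Println(expiresInHours(48))
	fmt.Println(expiresInHours(0))
}
```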
Types and constants
Shared types, constants, and enums used across the SDK.
SessionStatus
Typed string constants representing the session lifecycle states. Read via session.Status().
SampleRate
Typed integer constants for audio sample rates. Use with TTS vendor SampleRate fields and avatar sample rate validation.
Convenience constants are also defined for avatar sample rate requirements.
EventHandler
The function signature for session event handlers. Pass implementations to session.On().
| Event | data type | Cast example |
|---|---|---|
"started" | map[string]string | data.(map[string]string)["agent_id"] |
"stopped" | map[string]string | data.(map[string]string)["agent_id"] |
"error" | error | data.(error) |
Area constants
Used with option.WithArea() to select the regional API endpoint.
Type aliases
The agentkit package defines type aliases for common Fern-generated types. Use these in place of the full Agora.StartAgentsRequestProperties* names when building configuration objects.
| Alias | Underlying type |
|---|---|
TurnDetectionConfig | Agora.StartAgentsRequestPropertiesTurnDetection |
SalConfig | Agora.StartAgentsRequestPropertiesSal |
AdvancedFeatures | Agora.StartAgentsRequestPropertiesAdvancedFeatures |
SessionParams | Agora.StartAgentsRequestPropertiesParameters |
GeofenceConfig | Agora.StartAgentsRequestPropertiesGeofence |
RtcConfig | Agora.StartAgentsRequestPropertiesRtc |
FillerWordsConfig | Agora.StartAgentsRequestPropertiesFillerWords |
LlmConfig | Agora.StartAgentsRequestPropertiesLlm |
MllmConfig | Agora.StartAgentsRequestPropertiesMllm |
AsrConfig | Agora.StartAgentsRequestPropertiesAsr |
TtsConfig | Agora.Tts |
AvatarConfig | Agora.StartAgentsRequestPropertiesAvatar |
AgoraError
The Fern-generated error type returned when the API responds with a 4xx or 5xx status code. Use errors.As to inspect the error.
| Field | Type | Description |
|---|---|---|
StatusCode | int | HTTP status code returned by the API |
Body | string | Raw response body from the API |
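The errors.As pattern looks like the sketch below. Because this reference does not show the error type's package path, the example uses a local `apiError` stand-in with the two documented fields; replace it with the SDK's type in real code.

```go
package main

import (
	"errors"
	"fmt"
)

// apiError is a local stand-in with the two documented fields.
type apiError struct {
	StatusCode int
	Body       string
}

func (e *apiError) Error() string {
	return fmt.Sprintf("api error %d: %s", e.StatusCode, e.Body)
}

// statusOf extracts the HTTP status from a possibly wrapped error.
func statusOf(err error) (int, bool) {
	var apiErr *apiError
	if errors.As(err, &apiErr) {
		return apiErr.StatusCode, true
	}
	return 0, false
}

func main() {
	// errors.As also finds the typed error when it has been wrapped with %w.
	wrapped := fmt.Errorf("start session: %w", &apiError{StatusCode: 429, Body: "rate limited"})
	code, ok := statusOf(wrapped)
	fmt.Println(code, ok)
}
```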