Google Gemini Live
Google Gemini Live provides multimodal large language model capabilities with real-time audio processing, enabling natural voice conversations without separate ASR/TTS components.
Enable MLLM
To enable MLLM functionality, set enable_mllm to true under advanced_features.
Sample configuration
The following example shows a starting configuration for the mllm parameter, which you can use when you start a conversational AI agent.
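As a minimal sketch of such a configuration, the Python snippet below builds the MLLM-related fields and prints them as JSON. Several details are assumptions rather than confirmed API behavior: the helper name `build_mllm_properties` is hypothetical, the angle-bracket values are placeholders you must replace, `advanced_features` and `mllm` are assumed to sit side by side in the request, and the fields from `model` through `transcribe_user` are assumed to nest under `params`. The `messages` array is omitted because its item format is not described here.

```python
import json


def build_mllm_properties(model, adc_credentials_string, project_id, location):
    """Hypothetical helper: assemble the MLLM-related request fields.

    The nesting shown here (model, credentials, voice, and transcription
    flags under "params") is an assumption based on the parameter list.
    """
    return {
        # Enables MLLM functionality, as described under "Enable MLLM".
        "advanced_features": {"enable_mllm": True},
        "mllm": {
            "params": {
                "model": model,
                "adc_credentials_string": adc_credentials_string,
                "project_id": project_id,
                "location": location,
                "instructions": "You are a friendly voice assistant.",
                "voice": "Aoede",
                "transcribe_agent": True,
                "transcribe_user": True,
            },
            "input_modalities": ["audio"],
            "output_modalities": ["audio"],
            "greeting_message": "Hi, how can I assist you today?",
            "vendor": "vertexai",
            "style": "openai",
        },
    }


if __name__ == "__main__":
    # Placeholder values only; substitute your own model ID and credentials.
    properties = build_mllm_properties(
        model="<GEMINI_LIVE_MODEL_ID>",
        adc_credentials_string="<BASE64_ENCODED_ADC>",
        project_id="<GOOGLE_CLOUD_PROJECT_ID>",
        location="<GOOGLE_CLOUD_LOCATION>",
    )
    print(json.dumps(properties, indent=2))
```

Building the structure in code rather than hand-editing JSON keeps the required fields as explicit function arguments, so a missing value fails early instead of at request time.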
Key parameters
mllm required
- params object required
Main configuration object for the Gemini Live model.
- model string required
The Gemini Live model identifier.
- adc_credentials_string string required
Base64-encoded Google Cloud Application Default Credentials (ADC).
- project_id string required
Your Google Cloud project ID for Vertex AI access.
- location string required
The Google Cloud region hosting the Gemini Live model. Check the Google Cloud documentation for the full list of available regions.
- instructions string nullable
System instructions that define the agent’s behavior or tone.
- messages array[object] nullable
Optional array of conversation history items used for short-term memory.
- voice string nullable
The voice identifier for audio output. For example, "Aoede", "Puck", "Charon", "Kore", "Fenrir", "Leda", "Orus", or "Zephyr".
- transcribe_agent boolean nullable
Whether to transcribe the agent’s speech in real time.
- transcribe_user boolean nullable
Whether to transcribe the user’s speech in real time.
- input_modalities array[string] nullable
Default: ["audio"]
Input modalities for the MLLM.
["audio"]: Audio-only input
["audio", "text"]: Accept both audio and text input
- output_modalities array[string] nullable
Default: ["audio"]
Output modalities for the MLLM.
["audio"]: Audio-only response
["text", "audio"]: Combined text and audio output
- greeting_message string nullable
Initial message the agent speaks when a user joins the channel. Example: "Hi, how can I assist you today?".
- vendor string required
MLLM provider identifier. Set to "vertexai" for Google Gemini Live.
- style string required
API request style. Set to "openai" for OpenAI-compatible request formatting.
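The parameters above can be prepared and checked before a request is sent. The following Python sketch shows one way to produce a Base64 string for adc_credentials_string from credentials JSON and to run basic local checks on an mllm object. The helper names `encode_adc` and `validate_mllm` are hypothetical, the checks assume the listed required fields nest under `params`, and the modality check assumes that only the two documented combinations are accepted and that their order does not matter. The service itself may apply different or additional rules.

```python
import base64
import json

# Fields marked "required" in the parameter list (assumed to live under params).
REQUIRED_PARAMS = ("model", "adc_credentials_string", "project_id", "location")

# Documented modality combinations; treating order as irrelevant is an assumption.
ALLOWED_MODALITIES = ({"audio"}, {"audio", "text"})


def encode_adc(credentials: dict) -> str:
    """Serialize credentials to JSON and Base64-encode the result."""
    raw = json.dumps(credentials).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def validate_mllm(mllm: dict) -> list:
    """Return a list of problems found; an empty list means these checks passed."""
    problems = []
    params = mllm.get("params")
    if not isinstance(params, dict):
        problems.append("params is required")
        params = {}
    for key in REQUIRED_PARAMS:
        if not params.get(key):
            problems.append(f"params.{key} is required")
    if mllm.get("vendor") != "vertexai":
        problems.append('vendor must be "vertexai"')
    if mllm.get("style") != "openai":
        problems.append('style must be "openai"')
    for field in ("input_modalities", "output_modalities"):
        # Nullable fields fall back to the documented default of ["audio"].
        value = mllm.get(field) or ["audio"]
        if set(value) not in ALLOWED_MODALITIES:
            problems.append(f"{field} has an undocumented combination: {value}")
    return problems
```

For example, calling `validate_mllm({"params": {}, "vendor": "vertexai", "style": "openai"})` returns one message for each of the four missing required fields, while a complete configuration returns an empty list.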
For a comprehensive API reference, real-time capabilities, and detailed parameter descriptions, see the Google Gemini Live API documentation.