Release notes

Releases

v1.3

This version was released on April 16, 2025.

New features

  • Agent conversation history: This version adds two methods to retrieve an agent’s history. The history includes messages exchanged between the user and the agent and timestamps of agent creation and exit.

Improvements

  • Customize the priority of broadcast messages: This version updates the Broadcast a message using TTS interface with two new configuration parameters that control broadcast interruption logic:

    • priority: Sets the priority of the message broadcast. Supports setting the following priorities:

      • INTERRUPT: High priority
      • APPEND: Medium priority
      • IGNORE: Low priority
    • interruptable: Sets whether a human voice can interrupt the agent's broadcast.
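
The two parameters above can be sketched as a small request-body builder. This is a minimal sketch under assumptions: only priority, interruptable, and the three priority values come from these notes. The text field name, the flat payload shape, the boolean type of interruptable, and the helper name build_speak_payload are hypothetical, so check the API reference for the actual request format.

```python
from enum import Enum


class Priority(str, Enum):
    """Priority values listed in the v1.3 notes."""

    INTERRUPT = "INTERRUPT"  # high priority
    APPEND = "APPEND"        # medium priority
    IGNORE = "IGNORE"        # low priority


def build_speak_payload(text, *, priority, interruptable):
    """Build a broadcast request body (hypothetical shape, see lead-in)."""
    if not isinstance(interruptable, bool):
        raise TypeError("interruptable must be a bool")
    # Priority(...) raises ValueError for any value outside the listed set.
    level = Priority(priority)
    return {
        "text": text,  # assumed field name, not taken from the notes
        "priority": level.value,
        "interruptable": interruptable,
    }
```

Validating the priority on the client side means a misspelled value fails before any request is sent.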

API Changes

v1.2

This version was released on April 10, 2025.

New features

  • Broadcast a message using TTS: A new message broadcast interface enables a specified agent to deliver a custom message. When interacting with an agent, calling this interface interrupts the agent’s speech and thinking process, allowing the TTS module to immediately broadcast the custom message.

  • Interrupt the agent: The interrupt agent endpoint allows you to stop the specified agent’s speech and thinking process.

API Changes

This version adds the following APIs:

v1.1

This version was released on March 27, 2025.

New features

The Start a conversational agent API adds the enable_rtm and agent_rtm_uid parameters to enable Signaling integration with the Conversational AI agent. When this feature is enabled, the agent can use the Signaling SDK to obtain a user's custom context information, such as speaking status, selected text, signature, and score, and use it to generate more relevant content. For details, see Transmit custom information.

Improvements

To help you quickly integrate a custom large language model (LLM), this version adds documentation for Custom LLMs. Refer to the sample code in the documentation to integrate your custom model into the Conversational AI Engine and enable advanced capabilities such as Retrieval-Augmented Generation (RAG), multi-modal processing, and tool invocation.

API Changes

The POST request body for Start a Conversational AI agent now includes the enable_rtm and agent_rtm_uid fields.
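
As an illustration of the two new fields, the following sketch merges them into an existing start-request body. It assumes that enable_rtm is a boolean, that agent_rtm_uid is a non-empty string, and that both sit at the top level of the body. None of that is stated in these notes, and the real request may nest them differently. The helper names rtm_fields and with_rtm are hypothetical.

```python
def rtm_fields(agent_rtm_uid):
    """Return the two v1.1 fields that switch on Signaling integration."""
    if not isinstance(agent_rtm_uid, str) or not agent_rtm_uid:
        raise ValueError("agent_rtm_uid must be a non-empty string")
    # Assumed types: a boolean flag plus a string user ID.
    return {"enable_rtm": True, "agent_rtm_uid": agent_rtm_uid}


def with_rtm(start_body, agent_rtm_uid):
    """Return a copy of a start-agent body with the RTM fields merged in."""
    merged = dict(start_body)  # shallow copy; the caller's dict is untouched
    merged.update(rtm_fields(agent_rtm_uid))
    return merged
```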

v1.0 (Public Beta)

This version, released on March 4, 2025, adds pricing information for the Agora Conversational AI Engine. For more information, see Pricing.

Integration guide

To achieve the best conversation experience, use Agora Conversational AI Engine with the following Agora SDKs:

  • Agora RTC Native SDK, version 4.5.1 or later.
  • Agora RTC Web SDK, version 4.23.2 or later.

New features

  • Live subtitles: Supports real-time text output of conversations between users and the AI agent for subtitle display in your app's UI. Agora provides an open-source subtitle processing module. Simply integrate the module and call its API to implement live subtitles. For details, see Display live subtitles.

  • Message Notification Service: Introduces a new Conversational AI Engine message notification service. Configure it in the Agora console and subscribe to agent creation, stop, and error events. When a subscribed event occurs, Agora sends the details to your specified callback address. See Receive event notifications.

  • Keywords: Improves the Conversational AI Engine's recognition accuracy for proprietary terms by letting you add keywords. This feature is currently in beta. For details, contact technical support.
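
On the receiving side, the message notification service described above can be handled with a small dispatcher at your callback address. This is a sketch only: the event names (agent_created, agent_stopped, agent_error), the event and details keys, and the agent_id field are hypothetical placeholders. These notes state only that agent creation, stop, and error events can be subscribed to, so see Receive event notifications for the actual payload.

```python
import json

# Maps a (hypothetical) event name to the function that handles it.
HANDLERS = {}


def on(event_name):
    """Decorator that registers a handler for one event name."""
    def register(func):
        HANDLERS[event_name] = func
        return func
    return register


@on("agent_created")
def handle_created(details):
    return f"agent {details.get('agent_id')} created"


@on("agent_stopped")
def handle_stopped(details):
    return f"agent {details.get('agent_id')} stopped"


@on("agent_error")
def handle_error(details):
    return f"agent {details.get('agent_id')} failed: {details.get('message')}"


def dispatch(raw_body):
    """Parse a callback body and route it; unknown events are ignored."""
    message = json.loads(raw_body)
    handler = HANDLERS.get(message.get("event"))
    if handler is None:
        return "ignored"
    return handler(message.get("details", {}))
```

Ignoring unknown event names keeps the callback endpoint tolerant of event types added in later versions.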

v1.0 (Private Beta)

This version was released on February 18, 2025. The first beta release of the Conversational AI Engine brings natural, smooth, low-latency, and highly reliable real-time voice conversations with AI agents to Agora channels. It enables you to efficiently build intelligent and immersive interactive experiences. See Product overview for details.

Core Features

  • Real-time voice conversation

    Supports natural and smooth real-time voice conversations with AI. It delivers a low-latency, ultra-responsive interactive experience, as if the user were talking with a real person.

  • Intelligent noise suppression

    Intelligently identifies and suppresses background noise, ensuring clear sound transmission even in noisy environments to provide users with a high-quality audio experience.

  • Background human voice suppression

    Suppresses background voices and noise while accurately preserving the primary speaker's voice. This ensures a clear and focused interactive experience in multi-speaker environments.

  • Intelligent interruption handling

    Allows users to interrupt AI at any time to ensure quick and natural responses. This feature enables smooth transitions and avoids mechanical interactions.

  • Intelligent transmission

    An AI-optimized transmission algorithm ensures stable voice data delivery even in weak network conditions where packet loss reaches 80%. This guarantees conversation continuity and reliability across diverse network environments.

  • Flexible arrangement

    Supports multiple Large Language Model (LLM) and Text-to-Speech (TTS) providers, enabling flexible orchestration to meet diverse business needs and deliver highly customizable AI dialogue solutions.

  • Multi-platform support

    Compatible with iOS, Android, Web, and various embedded hardware platforms, providing a seamless and consistent cross-platform experience.

Integration guide

  • For the best conversational experience, Agora recommends using Conversational AI Engine with specific Agora Video/Voice SDK versions. For details, contact technical support.

  • The number of Peak Concurrent Users (PCU) allowed to call the server API under a single App ID is limited to 20. If you need to increase this limit, please contact technical support.
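
One way to stay under the limit above is a client-side guard that tracks active sessions before your server calls the API. This sketch assumes the limit can be treated as at most 20 simultaneously active agent sessions per App ID, which is an interpretation rather than something these notes spell out. The AgentSlots class and its method names are hypothetical.

```python
import threading

MAX_CONCURRENT = 20  # limit per App ID stated in these notes


class AgentSlots:
    """Thread-safe bookkeeping of active agent sessions for one App ID."""

    def __init__(self, limit=MAX_CONCURRENT):
        self._limit = limit
        self._active = set()
        self._lock = threading.Lock()

    def try_acquire(self, agent_id):
        """Reserve a slot; return False when the limit is already reached."""
        with self._lock:
            if agent_id in self._active:
                return True  # this session already holds a slot
            if len(self._active) >= self._limit:
                return False
            self._active.add(agent_id)
            return True

    def release(self, agent_id):
        """Free the slot when the agent stops; unknown IDs are a no-op."""
        with self._lock:
            self._active.discard(agent_id)
```

Call try_acquire before starting an agent and release once it stops, so that a request that would exceed the limit is rejected locally instead of by the server.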
