Migrate from Real-Time STT 5.x
Version 5.x of Real-Time STT focused on flexibility, reserving many parameters and capabilities, such as cross-channel subtitle push, for future expansion. However, this flexibility made integration more complex. In contrast, v6.x prioritizes simplicity: it removes redundant fields, such as destination fields, and introduces new features, such as support for UID language functions.
Development of Real-Time STT v5.x is frozen. Agora strongly recommends migrating to v6.x.
This page explains how to migrate from v5.x to v6.x.
What has changed
The Real-Time STT v6.x introduces a more streamlined approach by removing certain functions that were available in v5.x. Here’s what has changed:
- Cross-channel audio subscription and subtitle push removed
  - v5.x: You could subscribe to audio from Channel A and push the transcribed or translated subtitles to Channel B.
  - v6.x: This is no longer supported. Real-Time STT v6.x only allows subscribing to audio within a channel and pushing subtitles to that same channel.
- Multi-OSS/S3 storage for subtitle files removed
  - v5.x: You could store subtitle files in multiple OSS/S3 storage locations.
  - v6.x: You can only upload subtitles to a single OSS/S3 storage location.
- Redundant fields removed
  - Some unnecessary fields, such as config.recognizeConfig.model and audio.subscribeSource, have been removed to simplify the Real-Time STT REST API.
These changes aim to reduce complexity and make Real-Time STT more efficient.
Migrate to v6.x manually
The following figure highlights the v5.x configuration changes.
- Channel and user authentication changes
  - v5.x: channelName, uid, and token are specified under agoraRtcConfig.
  - v6.x: Replaced by channelName, subBotUid, and subBotToken inside the rtcConfig structure.
- Idle time setting
  - v5.x: maxIdleTime is specified under agoraRtcConfig.
  - v6.x: Replaced by the maxIdleTime field at the top level of the structure.
- Language configuration
  - v5.x: language is a comma-separated string for multiple languages under recognizeConfig.
  - v6.x: Replaced by the languages array at the top level. Each language is now an individual item in the array.
- Publisher bot authentication
  - v5.x: uid and token are specified under agoraRTCDataStream.
  - v6.x: Replaced by pubBotUid and pubBotToken in the rtcConfig structure.
- Storage configuration changes
  - v5.x: storageConfig is under config.recognizeConfig.output.cloudStorage.
  - v6.x: Replaced by captionConfig.storage, with all fields mapped one-to-one.
- Translation configuration
  - v5.x: The translateConfig structure is under config.
  - v6.x: Directly replaced by the translateConfig structure at the top level.
The equivalent configuration for v6.x is as follows:
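As a rough illustration of the resulting shape, the following Go sketch assembles a v6.x body using only the field names listed above. The struct names, field types, and all values are assumptions made for this example; the contents of captionConfig.storage and translateConfig are treated as opaque JSON because their inner fields are not described on this page. Check the v6.x REST API reference before relying on it.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// RTCConfig mirrors the rtcConfig fields named in the migration list.
// The Go types (string UIDs and tokens) are assumptions.
type RTCConfig struct {
	ChannelName string `json:"channelName"`
	SubBotUID   string `json:"subBotUid"`
	SubBotToken string `json:"subBotToken,omitempty"`
	PubBotUID   string `json:"pubBotUid"`
	PubBotToken string `json:"pubBotToken,omitempty"`
}

// CaptionConfig carries the storage settings moved from
// config.recognizeConfig.output.cloudStorage. Kept opaque here.
type CaptionConfig struct {
	Storage json.RawMessage `json:"storage,omitempty"`
}

// V6Body is a hypothetical container for the top-level v6.x fields
// mentioned on this page; it is not an official type.
type V6Body struct {
	Languages       []string        `json:"languages"`
	MaxIdleTime     int             `json:"maxIdleTime"`
	RTCConfig       RTCConfig       `json:"rtcConfig"`
	CaptionConfig   *CaptionConfig  `json:"captionConfig,omitempty"`
	TranslateConfig json.RawMessage `json:"translateConfig,omitempty"`
}

// exampleV6Body returns a body filled with placeholder values.
func exampleV6Body() V6Body {
	return V6Body{
		Languages:   []string{"en-US", "zh-CN"}, // one array item per language
		MaxIdleTime: 60,                         // placeholder value
		RTCConfig: RTCConfig{
			ChannelName: "demo-channel",
			SubBotUID:   "1001",
			SubBotToken: "<subscriber-bot-token>",
			PubBotUID:   "1002",
			PubBotToken: "<publisher-bot-token>",
		},
	}
}

func main() {
	out, err := json.MarshalIndent(exampleV6Body(), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Optional sections (captionConfig and translateConfig) are omitted from the output when unset, so the printed JSON only contains the fields that were actually filled in.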
Migrate using a Go script
To convert a v5.x REST API request body to v6.x, Agora provides a Go script. To use this script:
- Replace the highlighted area in the rectangular box with your v5.x REST API request body.
- Click Run to generate the v6.x request body.
- Use the v6.x REST API to start an STT task and compare it with the v5.x version to check for differences.
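For readers who want to see the mappings in code, here is an independent, minimal sketch of the conversion logic; it is not Agora's script. It assumes the v5.x values have already been extracted into a hypothetical V5Fields struct, so it does not guess at where agoraRtcConfig or agoraRTCDataStream sit inside the v5.x body. The helper names splitLanguages and toV6, and the Go types, are assumptions for illustration. Storage and translation settings are left out because they carry over unchanged to captionConfig.storage and the top-level translateConfig.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// V5Fields is a hypothetical holder for the v5.x values named in the
// migration list, already pulled out of a v5.x request body.
type V5Fields struct {
	ChannelName string // agoraRtcConfig.channelName
	SubUID      string // agoraRtcConfig.uid
	SubToken    string // agoraRtcConfig.token
	MaxIdleTime int    // agoraRtcConfig.maxIdleTime
	Language    string // recognizeConfig.language, comma-separated
	PubUID      string // agoraRTCDataStream.uid
	PubToken    string // agoraRTCDataStream.token
}

// splitLanguages turns the v5.x comma-separated language string into
// the array form used by v6.x, trimming spaces and skipping empty items.
func splitLanguages(s string) []string {
	out := make([]string, 0)
	for _, part := range strings.Split(s, ",") {
		if part = strings.TrimSpace(part); part != "" {
			out = append(out, part)
		}
	}
	return out
}

// toV6 applies the field moves described in the migration list.
func toV6(f V5Fields) map[string]any {
	return map[string]any{
		"languages":   splitLanguages(f.Language),
		"maxIdleTime": f.MaxIdleTime, // now at the top level
		"rtcConfig": map[string]any{
			"channelName": f.ChannelName,
			"subBotUid":   f.SubUID,
			"subBotToken": f.SubToken,
			"pubBotUid":   f.PubUID,
			"pubBotToken": f.PubToken,
		},
	}
}

func main() {
	v6 := toV6(V5Fields{
		ChannelName: "demo-channel",
		SubUID:      "1001",
		PubUID:      "1002",
		MaxIdleTime: 60,
		Language:    "en-US, zh-CN",
	})
	out, err := json.MarshalIndent(v6, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

A generic map keeps the sketch short; a typed struct would be the better choice in production code, since it catches misspelled field names at compile time.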