Migrate from Real-Time STT 6.x to 7.x
This document guides you through migrating from version 6.x of Real-Time STT to the latest version 7.x. The updated architecture delivers improved stability, a simplified API workflow, and enhanced functionality to support a broader range of scenarios.
Test your implementation thoroughly during the migration and prepare a fallback plan. Complete verification in a test environment before deploying to production.
Version differences and upgrade advantages
Version history
Version 7.x is the latest recommended release. It offers several key improvements over version 6.x:
- Simplified API workflow: Removes the need to obtain a builderToken through the acquire interface.
- Consistent API naming: Aligns AI product APIs with the Convo AI RESTful API naming conventions.
- Extended functionality: Adds improved language recognition configuration at the UID level.
- Standardized URL paths: Keeps the domain api.agora.io and standardizes the path structure.
Major changes
API endpoint and method name changes
| 6.x | 7.x | Summary |
|---|---|---|
| acquire | This method has been removed. | The process of obtaining a builderToken through the acquire interface has been removed in version 7.x. |
| start | join | The method for starting an STT task has been renamed to join. |
| query | get | The method for querying task status has been renamed to get. |
| stop | leave | The method for stopping a task has been renamed to leave. |
| update | update | The method name remains unchanged. |
URL path changes
| API | 6.x URL | 7.x URL |
|---|---|---|
| Start task | https://api.agora.io/v1/projects/{{appId}}/rtsc/speech-to-text/tasks?builderToken={{tokenName}} | https://api.agora.io/api/speech-to-text/v1/projects/{appid}/join |
| Query status | https://api.agora.io/v1/projects/{{appId}}/rtsc/speech-to-text/tasks/{{taskId}}?builderToken={{tokenName}} | https://api.agora.io/api/speech-to-text/v1/projects/{appid}/agents/{agent_id} |
| Stop task | https://api.agora.io/v1/projects/{{appId}}/rtsc/speech-to-text/tasks/{{taskId}}?builderToken={{tokenName}} | https://api.agora.io/api/speech-to-text/v1/projects/{appid}/agents/{agent_id}/leave |
| Update configuration | https://api.agora.io/v1/projects/{{appId}}/rtsc/speech-to-text/tasks/{{taskId}}?builderToken={{tokenName}} | https://api.agora.io/api/speech-to-text/v1/projects/{appid}/agents/{agent_id}/update |
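The 7.x paths in the table follow one pattern, so they can be generated from a single helper. The following is a minimal sketch, not part of any official SDK: the function name build_url_7x and the operation keys are hypothetical, and only the URL shapes come from the table above.

```python
from typing import Optional

# Base path taken from the 7.x column of the table above.
BASE_URL = "https://api.agora.io/api/speech-to-text/v1/projects"

# Suffix appended after /agents/{agent_id} for each per-task operation.
_AGENT_SUFFIX = {"get": "", "leave": "/leave", "update": "/update"}


def build_url_7x(operation: str, app_id: str, agent_id: Optional[str] = None) -> str:
    """Return the 7.x URL for 'join', 'get', 'leave', or 'update' (hypothetical helper)."""
    project = f"{BASE_URL}/{app_id}"
    if operation == "join":
        # join is the only call that does not need an agent_id.
        return f"{project}/join"
    if operation not in _AGENT_SUFFIX:
        raise ValueError(f"unknown operation: {operation!r}")
    if not agent_id:
        raise ValueError(f"{operation!r} requires an agent_id")
    return f"{project}/agents/{agent_id}{_AGENT_SUFFIX[operation]}"
```

Keeping the paths in one place means the old /rtsc/speech-to-text/tasks prefix only has to be replaced once during migration.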
Parameter changes
- Task identifier changes
  - 6.x: Uses taskId to identify tasks.
  - 7.x: Uses agent_id to identify tasks.
- Authentication method changes
  - 6.x: You must first call acquire to obtain a builderToken, which is then passed as a URL parameter in subsequent requests.
  - 7.x: Removes the process of obtaining a builderToken through the acquire interface, simplifying the API call process.
Request parameter changes
- Parameters added to the join method (formerly start):
  - name: Required. A unique task name of up to 64 characters. This parameter is used for task deduplication to ensure that only one speech-to-text task runs in the same channel. To run multiple speech-to-text tasks in a channel, set different name values.
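The name constraint can be checked on the client before the request is sent. This is a minimal sketch under stated assumptions: with_task_name is a hypothetical helper, and the rest of the join body (shown here with a languages field) is a placeholder for whatever your 6.x start body already contained.

```python
def with_task_name(join_body: dict, name: str) -> dict:
    """Return a copy of join_body with the required 7.x name field added."""
    if not name:
        raise ValueError("name is required for join")
    if len(name) > 64:
        # The guide limits name to 64 characters.
        raise ValueError("name must be at most 64 characters")
    return {**join_body, "name": name}


# Two tasks in the same channel need different names, otherwise
# deduplication treats them as the same task.
captions = with_task_name({"languages": ["en-US"]}, "room42-captions")
archive = with_task_name({"languages": ["en-US"]}, "room42-archive")
```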
Other changes
- URL path standardization: Version 7.x uses the same domain name api.agora.io but introduces a more standardized path structure, such as /api/speech-to-text/v1/projects/{appid}/.
- HTTP method standardization: The stop method now uses POST instead of DELETE to better comply with RESTful conventions.
- Return value format: Version 7.x return values include more detailed task status information.
Migrate to 7.x
This section guides you through the migration process.
Migration steps
The migration steps are as follows:
Update the API call process
In version 6.x, the basic process was as follows:
- Call acquire to get a builderToken.
- Use the builderToken to call start and begin the task.
- Call query to check the task status.
- Call stop to end the task.
- Call update to modify the task configuration (optional).
For version 7.x, modify the API call sequence as follows:
- Call join to start the task and get the agent_id.
- Call get to check the task status.
- Call leave to stop the task.
- Call update to modify the task configuration (optional).
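The 7.x sequence above can be sketched as follows. This is an illustration, not an official client: run_task_7x and the send callable are hypothetical, and send stands in for whatever HTTP layer and authentication headers your project already uses. POST for leave comes from this guide; POST for join and GET for get are assumptions to verify against the API reference.

```python
from typing import Callable, Optional

# send(method, url, json_body) -> parsed JSON response (hypothetical transport).
Send = Callable[[str, str, Optional[dict]], dict]

BASE = "https://api.agora.io/api/speech-to-text/v1/projects"


def run_task_7x(send: Send, app_id: str, join_body: dict) -> dict:
    """Start a task, read its status once, then stop it."""
    # 1. join starts the task; no acquire call or builderToken is needed.
    joined = send("POST", f"{BASE}/{app_id}/join", join_body)
    agent_id = joined["agent_id"]
    agent_url = f"{BASE}/{app_id}/agents/{agent_id}"

    # 2. get returns the current task status.
    status = send("GET", agent_url, None)

    # 3. leave stops the task (POST in 7.x, where 6.x stop used DELETE).
    send("POST", f"{agent_url}/leave", None)
    return status
```

Passing the transport in as a parameter keeps the sequence testable without network access and leaves the authentication details to the caller.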
Update the URL and authentication method
- Remove all calls to acquire and the builderToken processing logic.
- Update the API path from /v1/projects/{{appId}}/rtsc/speech-to-text/ to /api/speech-to-text/v1/projects/{appid}/.
- Update the request path and parameters according to the new API specification.
- Add the name parameter to the join request (required).
Update the task identifier
Update all uses of taskId in the code to agent_id and ensure that subsequent operations use the correct task identifier.
Update response processing
Update the response processing logic to adapt to the new return format:
- Successful calls to join, get, and update return an object containing status, create_ts, and agent_id.
- Failed calls to join, get, and update return reason and detail fields.
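One way to adapt the response handling is to normalize both shapes in a single function. The sketch below is hypothetical: SttApiError and parse_task_response are illustrative names, and it assumes a failure can be recognized by the presence of reason or detail in the body. In practice you would likely also check the HTTP status code.

```python
class SttApiError(Exception):
    """Raised when a 7.x response carries reason/detail instead of task data."""

    def __init__(self, reason: str, detail: str):
        super().__init__(f"{reason}: {detail}")
        self.reason = reason
        self.detail = detail


def parse_task_response(payload: dict) -> dict:
    """Return the task fields from a join, get, or update response."""
    if "reason" in payload or "detail" in payload:
        raise SttApiError(payload.get("reason", ""), payload.get("detail", ""))
    # Keep only the fields the guide lists for successful calls.
    return {key: payload[key] for key in ("agent_id", "status", "create_ts")}
```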
Code comparison
Comparison between 6.x and 7.x code:
6.x
7.x code
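As a minimal illustration of the difference, the sketch below builds the request that stops a task in each version. The function names are hypothetical. The URLs and HTTP methods follow the tables and the HTTP method change described earlier in this guide.

```python
from urllib.parse import quote

OLD_BASE = "https://api.agora.io/v1/projects/{app_id}/rtsc/speech-to-text/tasks"
NEW_BASE = "https://api.agora.io/api/speech-to-text/v1/projects/{app_id}"


def stop_request_6x(app_id: str, task_id: str, builder_token: str) -> tuple:
    """6.x: DELETE on the task URL, with the builderToken as a query parameter."""
    token = quote(builder_token, safe="")
    url = f"{OLD_BASE.format(app_id=app_id)}/{task_id}?builderToken={token}"
    return ("DELETE", url)


def leave_request_7x(app_id: str, agent_id: str) -> tuple:
    """7.x: POST on the agent's leave URL, with no builderToken."""
    return ("POST", f"{NEW_BASE.format(app_id=app_id)}/agents/{agent_id}/leave")
```

The 7.x call needs only the agent_id returned by join, so the token bookkeeping that surrounded every 6.x request can be deleted.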
Migration checklist
- Remove all code related to the acquire method.
- Update all API calls to use the latest URLs.
- Replace start with join.
- Replace query with get.
- Replace stop with leave and change the HTTP method from DELETE to POST.
- Update the task identifier from taskId to agent_id.
- Update the request parameters and add the name field required by join.
- Update the response-handling logic.
- Add error-handling mechanisms.
- Test all updated API calls.
- Prepare a fallback plan.
Troubleshooting
If you encounter problems during the migration process, refer to the following table:
| Error description | Possible cause | Recommended action |
|---|---|---|
| Authentication failed | Missing or incorrect authentication header or incorrect parameters. | Check that the auth configuration is correct and that the app ID is enabled for the new service. |
| Task cannot be started | Incorrect request parameter format. | Verify the parameter format against the documentation; ensure languages is an array. |
| Task status cannot be queried | Incorrect agent_id or the task has already ended. | Use the correct agent_id returned by the join interface. |
| Audio subscription issues | Subscription config doesn't match UID in the channel. | Check that subscribeAudioUids matches actual UIDs in the channel. |
| Recognition result problem | Incorrect language configuration. | Ensure that the languages array includes a valid language code, such as ["en-US"]. |
FAQs
Is version 7.x compatible with version 6.x?
No, version 7.x is not compatible with version 6.x. Version 7.x refactored the API to provide more stable services, simplified API processes, and added new features such as language identification at the UID level. Code migration is required following this guide.
Do I need to keep the logic to obtain a builderToken from version 6.x?
No. Version 7.x removes the process of obtaining a builderToken through the acquire interface, and this parameter is no longer required for API calls.
Does version 7.x support string-type UIDs?
String-type UIDs are not supported. The current version still uses integer-type UIDs, but they are passed as strings in the API for future compatibility.
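If your code stores UIDs as integers, a small conversion step keeps the values integer-typed while sending them as strings. This is a sketch only: uids_as_strings is a hypothetical helper, and which request fields carry UIDs depends on the API reference.

```python
def uids_as_strings(uids):
    """Convert integer UIDs to the string form passed in 7.x requests."""
    converted = []
    for uid in uids:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(uid, bool) or not isinstance(uid, int):
            raise TypeError(f"UID must be an integer, got {uid!r}")
        converted.append(str(uid))
    return converted
```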
Are there any changes to the task status field?
The status field values remain the same, but the API method and response format for obtaining task status have changed.
Do I need to update call frequency or limits?
Basic call rates and limits remain the same. Refer to the latest documentation for detailed limit specifications.
Are there changes to error codes returned by API requests?
Yes, version 7.x has standardized error codes. See Update response processing for details.
How can I verify that the new API works correctly after migration?
Follow these steps to validate functionality after migration:
- Start a task and obtain the agent_id
- Query the task status to confirm it is running properly
- Conduct speech recognition tests and compare recognition results
- Test task stopping and configuration update functions
How can I ensure smooth migration?
Use the following approach to ensure smooth migration:
- Complete migration and testing in a staging environment
- Prepare a fallback plan and temporarily maintain the old version code
- Use a phased rollout, starting with non-critical services
- Monitor the new API's performance and error rate before full deployment