Pricing
Agora calculates the billing for all projects under your Agora account on a monthly basis. Billing begins once you enable Real-Time STT.
This page explains Agora's billing policy for the Real-Time STT add-on.
Your billing details may differ if you have signed a contract with Agora.
Transcription fee
When Real-Time STT is enabled for a channel, it transcribes the audio of the active hosts. When Real-Time STT is enabled for specific hosts, it transcribes only the audio of those hosts and ignores the others. The Real-Time STT service uses algorithms that remove periods of silence and improve the Word Error Rate (WER) of the transcription. The processed audio is transcribed by the Real-Time STT engine, and its duration is referred to as the transcription duration. Agora charges for the transcription duration of all or specified hosts in the channel. The unit price is as follows:
Billing item | Usage, minutes per month | Pricing, US$/1,000 minutes |
---|---|---|
Transcription duration | Above 0 | 16.99 |
Example
After you enable Real-Time STT:
- Host A speaks for 2 minutes and remains silent for 8 minutes.
- Host B speaks for 3 minutes and remains silent for 7 minutes.
- Host C speaks for 3 minutes and remains silent for 7 minutes.
- All hosts are silent for the first 2 minutes of the call.
In this case, the total transcription minutes are calculated as 2 (Host A) + 3 (Host B) + 3 (Host C) = 8 minutes. The silent periods of each host, including the time spent listening to others, are not counted towards the transcription duration.
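The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration of the calculation described above, not an Agora API: the function names and the `speaking` dictionary are hypothetical, and the unit price is the list price from the table. Standby time, contract pricing, and discounts are not modeled.

```python
# Unit price from the table above, in US$ per 1,000 minutes.
TRANSCRIPTION_PRICE_PER_1000_MIN = 16.99


def transcription_minutes(speaking_minutes_by_host):
    """Sum the per-host speaking time; silent periods are not counted."""
    return sum(speaking_minutes_by_host.values())


def transcription_fee(minutes, price_per_1000=TRANSCRIPTION_PRICE_PER_1000_MIN):
    """Return the fee in US$ for a given transcription duration."""
    return minutes / 1000 * price_per_1000


# Hypothetical input: minutes each host actually spoke in the example.
speaking = {"A": 2, "B": 3, "C": 3}

total_minutes = transcription_minutes(speaking)
fee = transcription_fee(total_minutes)
print(f"{total_minutes} minutes -> ${fee:.3f}")
```

Keeping the per-host durations separate makes it easy to recompute the fee when Real-Time STT is enabled for specific hosts only: pass just those hosts in the dictionary.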
Notes:
- WER is a measure of the accuracy of an STT engine: the lower, the better.
- Real-Time STT does not incur an additional RTC audio fee.
- Enabling Real-Time STT for channels or hosts that are silent for long periods is not recommended. In the example, during the first 2 minutes when all hosts are silent, the Real-Time STT worker still processes every host's audio to remove the silent portions. Agora charges for these 2 minutes as STT engine standby time, billed at $0.99/1,000 minutes, with the same discount applied as for RTC audio.
Language identification fee
Real-Time STT supports dynamic language detection when two or more languages are enabled for a channel or specific hosts. The Language Identification (LID) duration is the same as the transcription duration.
Billing item | Usage, minutes per month | Pricing, US$/1,000 minutes |
---|---|---|
Language identification duration | Above 0 | 5.00 |
Example
- Suppose a channel exists for 10 minutes. There are three active, unmuted hosts, A, B, and C.
- If Spanish and Chinese LID is enabled for this channel at the start, the algorithm removes 8 minutes of silent audio for host A, 7 minutes for host B and 7 minutes for host C. Therefore, the transcription duration is 2 + 3 + 3 = 8 minutes. The LID duration is also 8 minutes, being the sum of 2 minutes for host A, 3 minutes for host B, and 3 minutes for host C.
- If Spanish and Chinese LID is enabled for host A, then the transcription duration and LID duration are both 2 minutes.
Notes:
- The Real-Time STT transcription duration does not change if you enable more than one language.
- If only one language is set for a channel or a specified host, language detection does not start.
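The LID rules above (LID duration equals transcription duration, and LID does not start with fewer than two languages) can be expressed as a small sketch. The function names and the language labels `"es"` and `"zh"` are illustrative assumptions, not Agora API values; the unit price is the list price from the table.

```python
# Unit price from the table above, in US$ per 1,000 minutes.
LID_PRICE_PER_1000_MIN = 5.00


def lid_minutes(transcription_minutes, enabled_languages):
    """LID duration equals the transcription duration, but only when
    two or more distinct languages are enabled; otherwise LID does not start."""
    if len(set(enabled_languages)) < 2:
        return 0
    return transcription_minutes


def lid_fee(minutes, price_per_1000=LID_PRICE_PER_1000_MIN):
    """Return the LID fee in US$ for a given LID duration."""
    return minutes / 1000 * price_per_1000


# Channel-wide LID: transcription duration is 2 + 3 + 3 = 8 minutes.
channel_lid = lid_minutes(8, ["es", "zh"])
# LID enabled for host A only: transcription duration is 2 minutes.
host_a_lid = lid_minutes(2, ["es", "zh"])
# Only one language set: language detection does not start.
single_language_lid = lid_minutes(8, ["es"])

print(channel_lid, host_a_lid, single_language_lid)
print(f"${lid_fee(channel_lid):.3f}")
```

The LID fee is charged in addition to the transcription fee, which is unchanged by the number of languages enabled.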
Translation fee
When you enable Real-Time Translation (Beta) for a channel or a user, transcription is activated first. The transcribed text is then translated into the target languages. The translation usage minutes are the same as the transcription usage minutes. The real-time transcription and translation usage and costs are shown in your monthly invoice. The unit price is as follows:
Billing item | Pricing, US$/1,000 minutes/Language |
---|---|
Translation | 8.99 |
Example
After you enable Real-Time STT with Real-Time Translation:
- Host A speaks Russian for 2 minutes and remains silent for 8 minutes.
- Host B speaks French for 3 minutes and remains silent for 7 minutes.
- Host C speaks Russian for 3 minutes and remains silent for 7 minutes.
- All hosts are silent for the first 2 minutes of the call.
- Russian and French are translated to English.
In this case, the total transcription minutes are calculated as 2 (Host A) + 3 (Host B) + 3 (Host C) = 8 minutes. The translation minutes are the same as the transcription minutes, so Agora charges for 8 minutes of transcription and 8 minutes of translation.
Total cost = 8/1,000 × $16.99 + 8/1,000 × $8.99 = $0.136 + $0.072 = $0.208.
If you translate Russian and French to both English and German, the translation cost is multiplied by 2, because translation is priced per target language:
Total cost = 8/1,000 × $16.99 + 8/1,000 × $8.99 × 2 = $0.136 + $0.144 = $0.280.
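The two totals above follow the same formula, with the translation term scaled by the number of target languages. A minimal sketch, assuming list prices and no discounts; `total_cost` is a hypothetical helper, not part of any Agora SDK:

```python
# Unit prices from the tables above, in US$ per 1,000 minutes.
TRANSCRIPTION_PRICE_PER_1000_MIN = 16.99
TRANSLATION_PRICE_PER_1000_MIN = 8.99  # per target language


def total_cost(transcription_minutes, target_language_count):
    """Transcription fee plus translation fee.

    Translation minutes equal transcription minutes and are charged
    once per target language.
    """
    transcription = transcription_minutes / 1000 * TRANSCRIPTION_PRICE_PER_1000_MIN
    translation = (
        transcription_minutes / 1000
        * TRANSLATION_PRICE_PER_1000_MIN
        * target_language_count
    )
    return transcription + translation


to_english = total_cost(8, 1)             # target: English
to_english_and_german = total_cost(8, 2)  # targets: English and German

print(f"${to_english:.3f}")
print(f"${to_english_and_german:.3f}")
```

The unrounded values are $0.20784 and $0.27976. The figures in the text round each term to three decimals before adding, which gives the same results at this precision.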
Free-of-charge duration
Real-Time STT and Real-Time Translation share 300 free-of-charge minutes for integration and testing purposes.
- 1 minute of Real-Time STT equals 1 free-of-charge minute.
- 1 minute of Real-Time Translation equals 8.99/16.99 (approximately 0.53) of a free-of-charge minute.
To request a discount, contact sales@agora.io or your account executive (AE).
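To estimate how much of the shared free-of-charge allowance a test run consumes, the conversion rules above can be applied as follows. This is a sketch under stated assumptions: the helper names and the usage figures are hypothetical, and the source does not say how multiple target languages draw on the allowance, so the example assumes a single target language.

```python
FREE_MINUTES = 300                 # shared allowance from the text above
STT_WEIGHT = 1.0                   # 1 STT minute = 1 free-of-charge minute
TRANSLATION_WEIGHT = 8.99 / 16.99  # 1 translation minute, about 0.53


def free_minutes_used(stt_minutes, translation_minutes):
    """Convert usage into free-of-charge minutes consumed."""
    return stt_minutes * STT_WEIGHT + translation_minutes * TRANSLATION_WEIGHT


def free_minutes_remaining(stt_minutes, translation_minutes):
    """Allowance left after the given usage, never below zero."""
    used = free_minutes_used(stt_minutes, translation_minutes)
    return max(0.0, FREE_MINUTES - used)


# Hypothetical test usage: 100 minutes transcribed and translated
# into one target language.
used = free_minutes_used(100, 100)
remaining = free_minutes_remaining(100, 100)
print(f"used {used:.1f}, remaining {remaining:.1f}")
```

Because translation always runs on top of transcription, each translated minute draws roughly 1.53 free-of-charge minutes in total under these assumptions.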