Real-Time Speech to Text
Agora's Real-Time Speech to Text (STT) transcribes live voice streams to deliver closed captions and transcription for enhanced accessibility. With advanced features like silent audio removal, it optimizes performance and reduces costs.
Transcribed text can be translated into multiple languages in real time or used as input for AI models such as GPT, connecting real-time engagement with AI-powered applications.
Product Features
Cloud-based STT
The cloud-based service converts voice to text for active hosts or for specific hosts, then distributes the text to all participants in the channel for further processing. Because transcription runs in the cloud, it does not depend on the client's device performance or network conditions.
Speaker labeling
Each transcribed text segment is labeled with the speaker's UID. Each host is transcribed separately, which preserves accuracy even when multiple hosts are talking simultaneously.
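As a rough illustration of how an application might use UID-labeled text, the sketch below groups segments by speaker and renders a single caption feed. The `Segment` class, its fields (`uid`, `start_ms`, `text`), and both helper functions are hypothetical and chosen for this example; they do not represent Agora's actual message format or SDK.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical shape of one transcribed segment (not Agora's real format)."""
    uid: int        # UID of the host who spoke
    start_ms: int   # start time of the segment, in milliseconds
    text: str       # transcribed text


def captions_by_speaker(segments):
    """Join each speaker's segments, in time order, into one string per UID."""
    grouped = defaultdict(list)
    for seg in sorted(segments, key=lambda s: s.start_ms):
        grouped[seg.uid].append(seg.text)
    return {uid: " ".join(texts) for uid, texts in grouped.items()}


def interleaved_captions(segments):
    """Build one caption feed, one line per segment, prefixed with the UID."""
    ordered = sorted(segments, key=lambda s: s.start_ms)
    return [f"[{seg.uid}] {seg.text}" for seg in ordered]


# Example input: two hosts whose speech alternates.
segments = [
    Segment(uid=1001, start_ms=0, text="Welcome everyone."),
    Segment(uid=2002, start_ms=300, text="Thanks for having me."),
    Segment(uid=1001, start_ms=900, text="Let's get started."),
]

for line in interleaved_captions(segments):
    print(line)
```

Because every segment carries its own UID, the same data can drive either a per-speaker transcript or a merged caption stream without any speaker-separation step on the client.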