
Record captions

Real-Time STT can record caption text as a .vtt file and store it in your configured cloud storage. The recorded captions are designed to play back alongside audio and video recordings. You can also use the .vtt file to generate meeting minutes and topic summaries, or to perform sentiment analysis and content moderation. Caption recording does not incur additional fees.

This page explains how to record captions and synchronize them with the audio or video files generated by Cloud Recording.

Note

Because Cloud Recording and transcription tasks run on different servers, you must enable the NTP timestamp for Cloud Recording so that its timestamps can be synchronized with the transcription.

Prerequisites

To follow this procedure, you must:

  • Make sure Real-Time STT is enabled for your app.

  • Set the enableNTPtimestamp parameter to true in the body of the Cloud Recording start request.

    Sample Cloud Recording start request body

    "recordingConfig": {
        "maxIdleTime": 30,
        "streamTypes": 2,
        "channelType": 1,
        "extensionParams": {
            "enableNTPtimestamp": true
        },
        "transcodingConfig": {
            "width": 1280,
            "height": 720,
            "fps": 15,
            "bitrate": 2400,
            "mixedVideoLayout": 3,
            "layoutConfig": [
                {
                    "alpha": 1,
                    "height": 1,
                    "render_mode": 1,
                    "uid": "22228",
                    "width": 1,
                    "x_axis": 0,
                    "y_axis": 0
                }
            ]
        }
    },
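As a rough illustration of this prerequisite, the following Python sketch adds the flag to an existing recordingConfig dictionary before you send the start request. The helper name with_ntp_timestamp is hypothetical and not part of any Agora SDK; only the extensionParams.enableNTPtimestamp field comes from the sample request body.

```python
import copy
import json


def with_ntp_timestamp(recording_config: dict) -> dict:
    """Return a copy of recording_config with NTP timestamps enabled.

    Hypothetical helper for illustration; the input dict is not modified.
    """
    config = copy.deepcopy(recording_config)
    # Keep any extension parameters that are already present.
    config.setdefault("extensionParams", {})["enableNTPtimestamp"] = True
    return config


# Values taken from the sample request body.
base = {"maxIdleTime": 30, "streamTypes": 2, "channelType": 1}
print(json.dumps({"recordingConfig": with_ntp_timestamp(base)}, indent=2))
```

Working on a copy keeps the original configuration reusable for recordings that do not need caption synchronization.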

Implementation

Follow these steps to record captions and synchronize them with the corresponding recordings for seamless playback.

Record the captions

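To record captions, follow the API call sequence in the REST Quickstart and add the caption recording parameters to the start request, as in the following example. The // comments are annotations only; remove them before sending the request, because JSON does not allow comments.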


curl --location -g 'https://api.agora.io/v1/projects/{{appId}}/rtsc/speech-to-text/tasks?builderToken={{tokenName}}' \
--header 'Content-Type: application/json' \
--data '{
    "languages": [
        "<YourTranscribeLanguages>"
    ],
    "maxIdleTime": 50,
    "rtcConfig": {
        // No change on rtcConfig.
    },
    "captionConfig": { // The caption recording configurations.
        "storage": {
            "accessKey": "<YourOssAccessKey>", // The OSS access key.
            "secretKey": "<YourOssSecretKey>", // The OSS secret key.
            "bucket": "<YourOssBucketName>", // The storage bucket name.
            "vendor": <YourOssVendor>, // The OSS vendor.
            "region": <YourOssRegion>, // The OSS region.
            "fileNamePrefix": ["<YourOssPrefix>"] // Optional, an array of strings. Sets the path of the recorded files in the third-party cloud storage.
        }
    }
}'
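If you assemble the request body in code rather than by hand, a sketch along the following lines may help. It builds only the JSON body and sends nothing. The function name build_caption_start_body is hypothetical, the empty rtcConfig is a placeholder for the configuration from the REST Quickstart, and treating every storage field except fileNamePrefix as required is an assumption based on the comments in the sample.

```python
import json


def build_caption_start_body(languages, rtc_config, storage):
    """Assemble a start request body with caption recording enabled.

    Hypothetical helper; field names follow the curl sample.
    """
    # Assumption: everything except fileNamePrefix is required.
    required = ("accessKey", "secretKey", "bucket", "vendor", "region")
    missing = [key for key in required if key not in storage]
    if missing:
        raise ValueError(f"storage is missing: {', '.join(missing)}")
    return {
        "languages": list(languages),
        "maxIdleTime": 50,
        "rtcConfig": rtc_config,
        "captionConfig": {"storage": dict(storage)},
    }


storage = {
    "accessKey": "<YourOssAccessKey>",
    "secretKey": "<YourOssSecretKey>",
    "bucket": "<YourOssBucketName>",
    "vendor": 1,  # Amazon S3
    "region": 0,  # US_EAST_1
    "fileNamePrefix": ["captions"],  # optional; placeholder value
}
# The empty dict stands in for your existing rtcConfig.
body = build_caption_start_body(["<YourTranscribeLanguages>"], {}, storage)
print(json.dumps(body, indent=2))
```

Serializing with json.dumps guarantees valid JSON, which avoids the comment and trailing-comma problems that hand-edited payloads tend to pick up.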

Sync the files

The m3u8+vtt files generated by Real-Time STT and the m3u8+ts files generated by Cloud Recording are independent of each other and use different timestamps. Cloud Recording timestamps start at 0, while Real-Time STT uses the system timestamp. If either process starts abnormally, the media files generated by the two services may be out of sync during playback.
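To make the mismatch concrete, the sketch below maps a caption cue stamped with system time onto a timeline that starts at 0, then formats the result as a WebVTT timestamp. It only illustrates the idea and is not a description of how Agora's post-processing script works; the function names and the sample epoch values are assumptions.

```python
def to_recording_offset_ms(cue_epoch_ms: int, recording_start_epoch_ms: int) -> int:
    """Map a system-time cue onto a timeline that starts at 0."""
    offset = cue_epoch_ms - recording_start_epoch_ms
    if offset < 0:
        raise ValueError("cue precedes the start of the recording")
    return offset


def format_vtt_timestamp(offset_ms: int) -> str:
    """Format milliseconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    seconds, millis = divmod(offset_ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


recording_start = 1_700_000_000_000  # made-up epoch milliseconds
cue_time = recording_start + 83_450  # a cue 83.45 s into the recording
print(format_vtt_timestamp(to_recording_offset_ms(cue_time, recording_start)))
# prints 00:01:23.450
```

The subtraction is only meaningful when both services stamp against the same clock, which is why the NTP timestamp must be enabled for Cloud Recording.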

Agora provides a post-processing script that lets you sync the m3u8+ts and m3u8+vtt files. To run this script, take the following steps:

  1. Unzip the post-processing script to a local folder.

  2. Run the script on your transcription files:


    python3 insert_subtitle.py --av audio_dir/audio_ts.m3u8 --subtitle subtitle_dir/subtitle.m3u8 --output output_dir/ --overwrite

    If ffmpeg and ffprobe are not in your PATH, use -ffmpeg_path to specify the path.

  3. Play the synchronized files:

    1. Run the following command to start the HTTP server:


      python3 -m http.server --bind 127.0.0.1 -d output_dir

    2. In your browser, enter the following URL:


      http://127.0.0.1:8000/player_demo.html
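If you run the sync step from your own tooling, you could wrap it roughly as follows. The sketch builds the same command line as in step 2 and checks whether ffmpeg and ffprobe can be found on your PATH. The function names are hypothetical, and only the flags shown in step 2 are used.

```python
import shutil


def build_sync_command(av_playlist, subtitle_playlist, output_dir, overwrite=True):
    """Build the argument list for the post-processing script."""
    command = [
        "python3", "insert_subtitle.py",
        "--av", av_playlist,
        "--subtitle", subtitle_playlist,
        "--output", output_dir,
    ]
    if overwrite:
        command.append("--overwrite")
    return command


def missing_tools(tools=("ffmpeg", "ffprobe")):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


command = build_sync_command(
    "audio_dir/audio_ts.m3u8", "subtitle_dir/subtitle.m3u8", "output_dir/"
)
print(" ".join(command))
print("Not on PATH:", missing_tools() or "none")
# To execute the script: subprocess.run(command, check=True)
```

Passing the command as a list rather than a single string avoids shell quoting problems when a path contains spaces.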

Supported OSS vendors

vendor: Number. The third-party cloud storage vendor. The supported vendors and their numeric values are shown in the list below.

region: Number. The region of the third-party cloud storage. Only the regions listed below for each vendor are supported:

  • Amazon S3 (vendor = 1):

    • 0: US_EAST_1
    • 1: US_EAST_2
    • 2: US_WEST_1
    • 3: US_WEST_2
    • 4: EU_WEST_1
    • 5: EU_WEST_2
    • 6: EU_WEST_3
    • 7: EU_CENTRAL_1
    • 8: AP_SOUTHEAST_1
    • 9: AP_SOUTHEAST_2
    • 10: AP_NORTHEAST_1
    • 11: AP_NORTHEAST_2
    • 12: SA_EAST_1
    • 13: CA_CENTRAL_1
    • 14: AP_SOUTH_1
    • 15: CN_NORTH_1
    • 16: CN_NORTHWEST_1
    • 17: US_GOV_WEST_1
    • 20: AP_NORTHEAST_3
    • 21: EU_NORTH_1
    • 22: ME_SOUTH_1
    • 23: US_GOV_EAST_1
    • 24: AP_SOUTHEAST_3
    • 25: EU_SOUTH_1
    • 28: IL_CENTRAL_1

  • Alibaba Cloud (vendor = 2):

    • 0: CN_Hangzhou
    • 1: CN_Shanghai
    • 2: CN_Qingdao
    • 3: CN_Beijing
    • 4: CN_Zhangjiakou
    • 5: CN_Huhehaote
    • 6: CN_Shenzhen
    • 7: CN_Hongkong
    • 8: US_West_1
    • 9: US_East_1
    • 10: AP_Southeast_1
    • 11: AP_Southeast_2
    • 12: AP_Southeast_3
    • 13: AP_Southeast_5
    • 14: AP_Northeast_1
    • 15: AP_South_1
    • 16: EU_Central_1
    • 17: EU_West_1
    • 18: EU_East_1
    • 19: AP_Southeast_6
    • 20: CN_Heyuan
    • 21: CN_Guangzhou
    • 22: CN_Chengdu
    • 23: CN_Nanjing
    • 24: CN_Fuzhou
    • 25: CN_Wulanchabu
    • 26: CN_Northeast_2
    • 27: CN_Southeast_7

    For details, see the Alibaba Cloud documentation.

  • Tencent Cloud (vendor = 3):

    • 0: AP_Beijing_1
    • 1: AP_Beijing
    • 2: AP_Shanghai
    • 3: AP_Guangzhou
    • 4: AP_Chengdu
    • 5: AP_Chongqing
    • 6: AP_Shenzhen_FSI
    • 7: AP_Shanghai_FSI
    • 8: AP_Beijing_FSI
    • 9: AP_Hongkong
    • 10: AP_Singapore
    • 11: AP_Mumbai
    • 12: AP_Seoul
    • 13: AP_Bangkok
    • 14: AP_Tokyo
    • 15: NA_Siliconvalley
    • 16: NA_Ashburn
    • 17: NA_Toronto
    • 18: EU_Frankfurt
    • 19: EU_Moscow

  • Kingsoft Cloud (vendor = 4):

    • 0: CN_Hangzhou
    • 1: CN_Shanghai
    • 2: CN_Qingdao
    • 3: CN_Beijing
    • 4: CN_Guangzhou
    • 5: CN_Hongkong
    • 6: JR_Beijing
    • 7: JR_Shanghai
    • 8: NA_Russia_1
    • 9: NA_Singapore_1
  • Microsoft Azure (vendor = 5): The region parameter has no effect, whether set or not.

  • Google Cloud (vendor = 6): The region parameter has no effect, whether set or not.

  • Huawei Cloud (vendor = 7):

    • 0: CN_North_1
    • 1: CN_North_4
    • 2: CN_East_2
    • 3: CN_East_3
    • 4: CN_South_1
    • 5: CN_Southwest_2
    • 6: AP_Southeast_1
    • 7: AP_Southeast_2
    • 8: AP_Southeast_3
    • 9: AF_South_1
    • 10: SA_Argentina_1
    • 11: SA_Peru_1
    • 12: NA_Mexico_1
    • 13: SA_Brazil_1
    • 14: LA_South_2
    • 15: SA_Chile_1
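Because vendor and region are plain numbers, a small lookup table can make a configuration easier to review. The following sketch covers only an excerpt of the codes listed above, so a region missing from it may still be supported by the service. The names VENDORS, REGIONS, and describe_storage are hypothetical.

```python
# Excerpt of the vendor and region codes listed above; extend as needed.
VENDORS = {
    1: "Amazon S3",
    2: "Alibaba Cloud",
    3: "Tencent Cloud",
    4: "Kingsoft Cloud",
    5: "Microsoft Azure",
    6: "Google Cloud",
    7: "Huawei Cloud",
}
REGIONS = {
    1: {0: "US_EAST_1", 1: "US_EAST_2", 2: "US_WEST_1", 3: "US_WEST_2"},
    2: {0: "CN_Hangzhou", 1: "CN_Shanghai"},
    3: {0: "AP_Beijing_1", 1: "AP_Beijing"},
    7: {0: "CN_North_1", 1: "CN_North_4"},
}
REGION_IGNORED = {5, 6}  # the region parameter has no effect for these vendors


def describe_storage(vendor: int, region: int) -> str:
    """Return a readable label for a vendor and region pair."""
    if vendor not in VENDORS:
        raise ValueError(f"unsupported vendor: {vendor}")
    if vendor in REGION_IGNORED:
        return f"{VENDORS[vendor]} (region ignored)"
    regions = REGIONS.get(vendor, {})
    if region not in regions:
        raise ValueError(
            f"region {region} is not in this excerpt for {VENDORS[vendor]}"
        )
    return f"{VENDORS[vendor]} / {regions[region]}"


print(describe_storage(1, 0))  # prints Amazon S3 / US_EAST_1
print(describe_storage(5, 0))  # prints Microsoft Azure (region ignored)
```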