Core concepts
Agora Cloud Recording enables you to record video and voice calls or streams in the cloud for storage or on-demand viewing. Cloud Recording works with Voice Calling, Video Calling, Broadcast Streaming and Interactive Live Streaming.
This page introduces the key processes and concepts you need to know to use Cloud Recording.
General concepts
Agora relies on the following fundamental concepts to enable seamless real-time communication:
Agora SD-RTN™
Agora's core engagement services are powered by its Software-Defined Real-time Network (SD-RTN™), a global infrastructure accessible anytime, anywhere. Unlike traditional networks, Agora SD-RTN™ is not restricted by devices, phone numbers, or telecom coverage areas. With data centers in over 200 countries and regions, it ensures sub-second latency and high availability for real-time media.
Agora SD-RTN™ enables live user engagement through real-time communication (RTC), offering:
- Unmatched quality of service
- High availability and accessibility
- True scalability
- Low cost
App ID
The App ID is a unique key generated by Agora to identify each project and provide billing and other statistical data services. The App ID is critical for connecting users within your app. It is used to initialize the Agora Engine in your app, and as one of the required keys to create authentication tokens for secure communication. Retrieve the App ID for your project using the Agora Console.
App IDs are stored on the front-end client and do not provide access control. Projects using only an App ID allow any user with the App ID to join. For access control, especially in production environments, choose the App ID + Token mechanism for user authentication when creating a new project. Without authentication tokens, your environment is open to anyone with access to your App ID.
App Certificate
An App Certificate is a unique key generated by the Agora Console to secure projects through token authentication. It is required, along with the App ID, to generate a token that proves authorization between your systems and Agora's network. App Certificates are used to generate Cloud Recording authentication tokens.
Store the App Certificate securely in your backend systems. If your App Certificate is compromised, or if security compliance requires it, you can invalidate certificates and create new ones through the Agora Console.
Tokens
A token is a dynamic key generated using the App ID, App Certificate, user ID, and expiration timestamp. Tokens authenticate and secure access to Agora's services, ensuring only authorized users can join a channel and participate in real-time communication.
Tokens are generated on your server and passed to the client for use in Cloud Recording. The token generation process involves digitally signing the App ID, App Certificate, user ID, and expiration timestamp using a specific algorithm, preventing tampering or forgery.
During development and testing, use the Agora Console to generate temporary tokens. For production environments, implement a token server as part of your security infrastructure to control access to your channels.
For information on setting up a token server for generating and managing tokens, refer to the guide on Secure authentication with tokens.
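To illustrate why signing prevents tampering, the following minimal sketch signs the token fields with HMAC-SHA256, using the App Certificate as the secret key. It is not Agora's token format or algorithm: the function names `sign_demo_token` and `verify_demo_token`, the JSON payload layout, and the choice of HMAC-SHA256 are all assumptions made for illustration. In production, generate tokens with Agora's official token builder libraries.

```python
import base64
import hashlib
import hmac
import json


def sign_demo_token(app_id: str, app_certificate: str, uid: int, expires_at: int) -> str:
    """Return a toy signed token. NOT Agora's real token format."""
    payload = json.dumps(
        {"app_id": app_id, "uid": uid, "expires_at": expires_at},
        sort_keys=True,
        separators=(",", ":"),
    ).encode()
    # The certificate stays on the server; it is only used as the HMAC key.
    signature = hmac.new(app_certificate.encode(), payload, hashlib.sha256).digest()
    return (
        base64.urlsafe_b64encode(payload).decode()
        + "."
        + base64.urlsafe_b64encode(signature).decode()
    )


def verify_demo_token(token: str, app_certificate: str, now: int) -> bool:
    """Check the signature and the expiration timestamp of a toy token."""
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:
        return False  # malformed token
    expected = hmac.new(app_certificate.encode(), payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return False  # payload was altered or signed with another key
    return json.loads(payload)["expires_at"] > now
```

Because the signature depends on every field and on the certificate, changing the user ID or expiration timestamp, or signing with a different certificate, makes verification fail.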
Channel
In Agora's platform, a channel is a way of grouping users together and is identified by a unique channel name. Users who connect to the same channel can communicate with each other. A channel is created when the first user joins and ceases to exist when the last user leaves.
Channels are created by calling the methods for transmitting real-time data. Agora uses different channels to transmit different types of data:
- A Video SDK channel is used for transmitting audio or video data.
- A Signaling channel is used for transmitting messaging or signaling data.
These channels are independent of each other.
Additional services provided by Agora, such as Cloud Recording and Speech to Text, join the Video SDK channel to provide real-time recording, transmission acceleration, media playback, and content moderation.
User ID
In Cloud Recording, the UID is an integer value that uniquely identifies a user within the context of a channel. When joining a channel, you have the option to either assign a specific UID to the user or pass 0 or null and allow Agora to automatically generate and assign a UID to the user. If two users attempt to join the same channel with the same UID, it can lead to unexpected behavior.
The UID is used by Agora's services and components to identify and manage users within a channel. Ensure that UIDs are properly assigned to prevent conflicts.
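The channel and UID rules described above can be modeled with a small in-memory sketch. The `ChannelRegistry` class is hypothetical and not part of any Agora SDK. The positive 32-bit UID range is an assumption, and rejecting duplicate UIDs is a design choice of this sketch, not a description of how Agora handles them.

```python
import random


class ChannelRegistry:
    """Toy in-memory model of channels and UIDs; not an Agora API."""

    def __init__(self) -> None:
        self._channels: dict[str, set[int]] = {}

    def join(self, channel_name: str, uid: int = 0) -> int:
        """Add a user to a channel and return the UID actually used."""
        # The first join creates the channel.
        members = self._channels.setdefault(channel_name, set())
        if uid == 0:
            # 0 means "assign one for me": pick an unused positive value
            # (the 32-bit range is an assumption of this sketch).
            uid = random.randint(1, 2**32 - 1)
            while uid in members:
                uid = random.randint(1, 2**32 - 1)
        elif uid in members:
            raise ValueError(f"UID {uid} is already in channel {channel_name!r}")
        members.add(uid)
        return uid

    def leave(self, channel_name: str, uid: int) -> None:
        """Remove a user; drop the channel when the last user leaves."""
        members = self._channels.get(channel_name, set())
        members.discard(uid)
        if not members:
            self._channels.pop(channel_name, None)

    def exists(self, channel_name: str) -> bool:
        return channel_name in self._channels
```

Checking for conflicts before joining, as `join` does here, is one way to make sure UIDs are assigned properly on your side.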
Agora Console
To use Agora Cloud Recording, create a project in the Agora Console first.
Agora Console provides an intuitive interface for developers to query and manage their Agora account. After registering an Agora account, you use the Agora Console to perform the following tasks:
- Manage your account
- Create and configure Agora projects and services
- Get an App ID and the App Certificate
- Generate temporary tokens for development and testing
- Manage members and roles
- Check call quality and usage
- Check bills and make payments
- Access product resources
See Agora account management for details on how to manage all aspects of your Agora account.
Agora also provides RESTful APIs that you use to implement features such as creating a project and fetching usage numbers programmatically.
Cloud recording concepts
Recording modes
Agora Cloud Recording supports three recording modes:
- Individual recording
- Composite recording
- Web page recording
After the recording is complete, the recorded content is uploaded as a TS file to the third-party cloud storage you specified. An M3U8 file is also generated to serve as an index for the corresponding TS file.
The working principles of different recording modes and the types of files generated by Cloud Recording are as follows:
Individual recording
In individual recording, the recording service records the audio and video streams of each UID in the channel separately. After the recording is complete, the recording service generates the corresponding audio and video files for each UID.
For example, if there are 3 UIDs in the channel and each UID sends audio and video, then in the individual recording mode, 3 audio files and 3 video files are generated.
Composite recording
In composite recording, the recording service combines the audio and video of multiple UIDs in the channel into a single audio and video file.
For example, if there are 3 UIDs in the channel and each sends audio and video, the composite recording mode generates one recording file that includes the audio and video of all UIDs.
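The two examples above can be restated as a short sketch. The function `expected_outputs` and the mode strings are labels invented for this illustration, not Cloud Recording API values. The function counts logical recordings, not individual slice files, and assumes every UID sends both audio and video.

```python
def expected_outputs(mode: str, uids: list[int]) -> dict[str, int]:
    """Count logical recordings per mode, assuming each UID sends audio and video."""
    if mode == "individual":
        # One audio recording and one video recording per UID.
        return {"audio": len(uids), "video": len(uids)}
    if mode == "composite":
        # All UIDs are combined into a single audio and video recording.
        return {"combined": 1 if uids else 0}
    raise ValueError(f"unsupported mode: {mode!r}")
```

For a channel with UIDs 1, 2, and 3, `expected_outputs("individual", [1, 2, 3])` gives 3 audio and 3 video recordings, while `expected_outputs("composite", [1, 2, 3])` gives 1 combined recording.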
Web page recording
In web page recording, the recording service combines the page content and audio of a specified web page into an audio and video file.
Web page recording is commonly used in the following use-cases:
- In online classrooms, to record the teacher and student audio and video along with courseware, whiteboard, and other visuals.
- In video conferences, to capture participants' audio and video, as well as whiteboard, PPT, and other visuals.
Transcoding and non-transcoding modes
In individual recording, audio transcoding and non-transcoding modes have different use cases and characteristics.
Individual recording with transcoding: Use this mode when you need unified audio encoding parameters, so that recording files share a consistent format and are easier to post-process and play back. It suits cases that require high compatibility and standardized output, such as wide player support and standardized storage.
Individual recording without transcoding: Use this mode when the original audio encoding parameters must be preserved to maintain sound quality and performance. It suits cases with high demands for real-time performance and original sound quality, such as high-fidelity audio recording.
Feature | Individual recording with transcoding | Individual recording without transcoding |
---|---|---|
Transcoding during audio encoding | Yes | No |
Raw audio data | The sampling rate, number of channels, and bitrate are fixed at 48 kHz, mono, and 48 Kbps respectively. | The bitrate, sampling rate, and number of channels are determined by the audio encoding parameters of the streaming end AudioProfile. |
Audio encoding format | LC-AAC | Determined by the configuration of the source end AudioProfile |
Generated recording files | Each UID generates an audio file in M3U8 format and multiple audio files in TS format. | Same as individual recording with transcoding, except that if the user stops streaming by calling muteLocalAudioStream or leaveChannel, audio recording stops immediately and the file does not include 15 seconds of silent data. |
Player compatibility | The recorded file can be played by any mainstream player that supports the HLS protocol. | The audio encoding format is determined by the configuration of the streaming end AudioProfile. Different audio encoding formats have different compatibility. |
Delayed transcoding
Delayed transcoding is designed for audio-only recording use-cases. When you enable this mode, the recording service merges and transcodes the audio files of all users in the specified channel into an MP3, M4A, or AAC file within 24 hours after the recording ends (or up to 48 hours in special cases) and uploads it to the specified third-party cloud storage.
Delayed audio mixing
Delayed audio mixing is used for individual audio recording use-cases. To obtain a mixed recording file of all users in the channel after recording, you enable the delayed audio mixing feature when starting individual audio recording without transcoding. Once enabled, the recording service merges and transcodes the audio files of all users in the specified channel into an MP3, M4A, or AAC file within 24 hours after the recording is complete (or up to 48 hours in special cases) and uploads it to the specified third-party cloud storage.
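Delayed transcoding and delayed audio mixing share the same delivery window, so a backend that waits for the merged file might compute the deadlines as in the sketch below. The function name `delayed_file_window` is hypothetical, and measuring the window from the moment the recording ends is this sketch's reading of the text above.

```python
from datetime import datetime, timedelta, timezone


def delayed_file_window(recording_end: datetime) -> tuple[datetime, datetime]:
    """Return (usual deadline, special-case deadline) for the merged audio file."""
    usual = recording_end + timedelta(hours=24)
    special_case = recording_end + timedelta(hours=48)
    return usual, special_case


# Example: a recording that ended at noon UTC on 1 January 2024.
end = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
usual_deadline, latest_deadline = delayed_file_window(end)
```

A polling job could check the third-party cloud storage until `usual_deadline`, then keep checking at a lower rate until `latest_deadline` before reporting a failure.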
Slicing
Slicing involves cutting audio and video data according to specific rules during the recording process to generate multiple recording files. After slicing, several slice files (such as TS or WebM files) are created, along with M3U8 files that store the indexes of these slice files.
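To show how an M3U8 file indexes its slice files, the following sketch reads a playlist and lists each slice with its duration. It relies only on the standard HLS `#EXTINF` tag. The function name `list_slices`, the sample playlist, and its file names are made up for illustration and do not reflect the names Cloud Recording actually generates.

```python
def list_slices(m3u8_text: str) -> list[tuple[float, str]]:
    """Return (duration_seconds, uri) for each slice in an HLS media playlist."""
    slices = []
    duration = None
    for raw in m3u8_text.splitlines():
        line = raw.strip()
        if line.startswith("#EXTINF:"):
            # "#EXTINF:<duration>,[<title>]" precedes each slice URI.
            duration = float(line[len("#EXTINF:"):].split(",", 1)[0])
        elif line and not line.startswith("#"):
            # A non-comment line is a slice URI; pair it with the last duration.
            if duration is not None:
                slices.append((duration, line))
                duration = None
    return slices


# Hypothetical playlist with two TS slices.
SAMPLE = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:15
#EXTINF:14.933,
example_0.ts
#EXTINF:15.0,
example_1.ts
#EXT-X-ENDLIST
"""

slices = list_slices(SAMPLE)
```

Here `slices` contains one entry per slice file in playback order, which is the information a player uses to fetch and play the recording.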