
Facial data capture

The Facial Capture extension collects quantitative data such as facial feature points, head rotation, and head translation. You can use this data to drive 3D facial stickers, headgear, and pendant effects, or to animate digital humans, giving virtual avatars more vivid expressions.

This guide is intended for use cases where the Facial Capture extension is used independently to capture facial data, while a third-party rendering engine animates a virtual human.

Information

To avoid collecting facial data through callbacks and building your own collection, encoding, and transmission framework, use the Agora MetaKit extension for facial capture.

Prerequisites

Ensure that you have:

  • Integrated Video SDK version 4.3.0, including the face capture extension dynamic library AgoraFaceCaptureExtension.xcframework.
  • Obtained the face capture authentication parameters authentication_information and company_id by contacting Agora technical support.

Implement the logic

This section shows you how to integrate the Facial Capture extension in your app to capture facial data.

Enable the extension

To enable the Facial Capture extension, call enableExtension:

agoraKit.enableExtension(
    withVendor: "agora_video_filters_face_capture",
    extension: "face_capture",
    enabled: true,
    sourceType: AgoraMediaSourceType.primaryCamera)
Info

When you enable the Facial Capture extension for the first time, a delay may occur. To ensure smooth operation during a session, call enableExtension before joining a channel.
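A minimal sketch of this call order follows. The app ID, token, and channel name are placeholders; the point is that enableExtension runs before joining, so the one-time extension loading delay happens outside the session.

import AgoraRtcKit

// Create the engine; "YourAppID" is a placeholder for your Agora App ID.
let agoraKit = AgoraRtcEngineKit.sharedEngine(withAppId: "YourAppID", delegate: nil)

// Enable the Facial Capture extension first, so its loading delay
// does not occur mid-session.
agoraKit.enableExtension(
    withVendor: "agora_video_filters_face_capture",
    extension: "face_capture",
    enabled: true,
    sourceType: AgoraMediaSourceType.primaryCamera)

// Join the channel only after the extension is enabled.
// "YourToken" and "demo" are placeholders.
agoraKit.joinChannel(
    byToken: "YourToken",
    channelId: "demo",
    info: nil,
    uid: 0,
    joinSuccess: nil)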

Set authentication parameters

To ensure that the extension functions properly, call setExtensionProperty to pass the necessary authentication parameters.

agoraKit.setExtensionProperty(
    withVendor: "agora_video_filters_face_capture",
    extension: "face_capture",
    key: "authentication_information",
    value: "{\"company_id\":\"xxxxx\", \"license\":\"xxxxx\"}",
    sourceType: AgoraMediaSourceType.primaryCamera)
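If you prefer to build the value string programmatically, the following sketch constructs it with JSONSerialization, which avoids hand-escaping the JSON. The credential values are placeholders for the parameters you obtained from Agora technical support.

import Foundation

// Placeholder credentials obtained from Agora technical support.
let credentials: [String: String] = [
    "company_id": "xxxxx",
    "license": "xxxxx"
]

// Serialize the dictionary to a JSON string and pass it to the extension.
if let data = try? JSONSerialization.data(withJSONObject: credentials),
   let authInfo = String(data: data, encoding: .utf8) {
    agoraKit.setExtensionProperty(
        withVendor: "agora_video_filters_face_capture",
        extension: "face_capture",
        key: "authentication_information",
        value: authInfo,
        sourceType: AgoraMediaSourceType.primaryCamera)
}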

Retrieve facial data

To obtain raw video data containing facial information, implement the onCapture callback of AgoraVideoFrameDelegate:

func onCapture(_ videoFrame: AgoraOutputVideoFrame, sourceType: AgoraVideoSourceType) -> Bool {
    // Facial data, when present, is attached to the frame's metaInfo
    // dictionary under the "KEY_FACE_CAPTURE" key.
    if let faceInfo = videoFrame.metaInfo["KEY_FACE_CAPTURE"] as? String {
        print("Face Info: \(faceInfo)")
    }
    return true
}
Important

Currently, the facial capture function outputs data for only one face at a time. When the callback is triggered, copy the facial data into separately allocated memory and process it on a separate thread; otherwise, blocking the raw data callback may cause frame loss.
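The following sketch shows one way to follow this guidance: the facial data string is copied out of the callback and handed to a dedicated queue, so the callback returns immediately. processFaceInfo is a hypothetical function standing in for your own processing pipeline.

import AgoraRtcKit
import Foundation

// A dedicated serial queue for facial data processing.
let faceCaptureQueue = DispatchQueue(label: "com.example.face-capture")

func onCapture(_ videoFrame: AgoraOutputVideoFrame, sourceType: AgoraVideoSourceType) -> Bool {
    if let faceInfo = videoFrame.metaInfo["KEY_FACE_CAPTURE"] as? String {
        // String is a value type; capturing it in the closure copies the data
        // out of the callback, which then returns without blocking.
        faceCaptureQueue.async {
            processFaceInfo(faceInfo)
        }
    }
    return true
}

// Hypothetical downstream processing, for example parsing the JSON and
// forwarding blendshape coefficients to your rendering engine.
func processFaceInfo(_ faceInfo: String) {
    // ...
}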

Use facial information to drive virtual humans

The output facial data is in JSON format and includes quantitative information such as facial feature points, head rotation, and head translation. This data follows the Blend Shape (BS) format in compliance with the ARKit standard. You can use a third-party 3D rendering engine to further process the BS data. The key elements are:

  • faces: An array of objects, each representing the recognized facial information for one face.
    • detected: A float representing the confidence level of face recognition, ranging from 0.0 to 1.0.
    • blendshapes: An object containing the face capture coefficients. The keys follow the ARKit standard, with each key-value pair representing a blendshape coefficient whose value is a float between 0.0 and 1.0.
    • rotation: An object representing head rotation, containing three key-value pairs. All values are floats between -180.0 and 180.0.
      • pitch: The pitch angle of the head. Positive values represent lowering the head; negative values represent raising it.
      • yaw: The rotation angle of the head. Positive values represent rotation to the left; negative values represent rotation to the right.
      • roll: The tilt angle of the head. Positive values represent tilting to the right; negative values represent tilting to the left.
    • translation: An object representing head translation, with three key-value pairs: x, y, and z. The values are floats between 0.0 and 1.0.
    • faceState: An integer indicating the current face capture control state:
      • 0: The algorithm is in face capture control.
      • 1: The algorithm control is returning to the center.
      • 2: The algorithm is reset and not in control.
  • timestamp: A string representing the timestamp of the output result, in milliseconds.

This data can be used to animate virtual humans by applying the blendshape coefficients and head movement data to a 3D model. The following is an example of the facial data output:

{
  "faces": [{
    "detected": 0.98,
    "blendshapes": {
      "eyeBlinkLeft": 0.9, "eyeLookDownLeft": 0.0, "eyeLookInLeft": 0.0, "eyeLookOutLeft": 0.0, "eyeLookUpLeft": 0.0,
      "eyeSquintLeft": 0.0, "eyeWideLeft": 0.0, "eyeBlinkRight": 0.0, "eyeLookDownRight": 0.0, "eyeLookInRight": 0.0,
      "eyeLookOutRight": 0.0, "eyeLookUpRight": 0.0, "eyeSquintRight": 0.0, "eyeWideRight": 0.0, "jawForward": 0.0,
      "jawLeft": 0.0, "jawRight": 0.0, "jawOpen": 0.0, "mouthClose": 0.0, "mouthFunnel": 0.0, "mouthPucker": 0.0,
      "mouthLeft": 0.0, "mouthRight": 0.0, "mouthSmileLeft": 0.0, "mouthSmileRight": 0.0, "mouthFrownLeft": 0.0,
      "mouthFrownRight": 0.0, "mouthDimpleLeft": 0.0, "mouthDimpleRight": 0.0, "mouthStretchLeft": 0.0, "mouthStretchRight": 0.0,
      "mouthRollLower": 0.0, "mouthRollUpper": 0.0, "mouthShrugLower": 0.0, "mouthShrugUpper": 0.0, "mouthPressLeft": 0.0,
      "mouthPressRight": 0.0, "mouthLowerDownLeft": 0.0, "mouthLowerDownRight": 0.0, "mouthUpperUpLeft": 0.0, "mouthUpperUpRight": 0.0,
      "browDownLeft": 0.0, "browDownRight": 0.0, "browInnerUp": 0.0, "browOuterUpLeft": 0.0, "browOuterUpRight": 0.0,
      "cheekPuff": 0.0, "cheekSquintLeft": 0.0, "cheekSquintRight": 0.0, "noseSneerLeft": 0.0, "noseSneerRight": 0.0,
      "tongueOut": 0.0
    },
    "rotation": {"pitch": 30.0, "yaw": 25.5, "roll": -15.5},
    "rotationMatrix": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0],
    "translation": {"x": 0.5, "y": 0.3, "z": 0.5},
    "faceState": 1
  }],
  "timestamp": "654879876546"
}
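As a sketch of how a receiving app might consume this payload, the following Codable types decode the JSON shown above. The type and function names are illustrative, not part of the Agora SDK; extra keys such as rotationMatrix are ignored by the decoder unless you model them.

import Foundation

// Illustrative types matching the JSON layout shown above.
struct FaceCaptureResult: Codable {
    let faces: [Face]
    let timestamp: String
}

struct Face: Codable {
    let detected: Float
    let blendshapes: [String: Float]
    let rotation: Rotation
    let translation: Translation
    let faceState: Int
}

struct Rotation: Codable {
    let pitch: Float
    let yaw: Float
    let roll: Float
}

struct Translation: Codable {
    let x: Float
    let y: Float
    let z: Float
}

// Decode the string received in the onCapture callback.
func parseFaceInfo(_ json: String) -> FaceCaptureResult? {
    guard let data = json.data(using: .utf8) else { return nil }
    return try? JSONDecoder().decode(FaceCaptureResult.self, from: data)
}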

Disable the extension

To disable the Facial Capture extension, call enableExtension:

agoraKit.enableExtension(
    withVendor: "agora_video_filters_face_capture",
    extension: "face_capture",
    enabled: false,
    sourceType: AgoraMediaSourceType.primaryCamera)

Reference

This section contains content that completes the information on this page, or points you to documentation that explains other aspects of this product.

Sample project

Agora provides open source sample projects on GitHub for your reference. Download or view the Face Capture sample project for a more detailed example.

API reference

  • enableExtension
  • setExtensionProperty
  • onCapture