Clone Video Avatar
It is critical that you closely follow these instructions when capturing the footage for your digital avatar. Even a small error can lead to a poor result. The entire recording process takes approximately 15 minutes.
Video Requirements
- Only the subject being recorded may speak in the footage. No other voices may be captured.
- Ensure that footage is exported in 4K or 1080p quality.
- Capture the footage in landscape orientation and position yourself in the center of the frame.
- Make sure the footage is shot at eye level and that the camera is stable throughout the shooting process. Your face must be evenly lit.
- We recommend shooting at least two 4-minute videos as your avatar footage.
- The file must be in .mp4 or .webm format.
- File size must be under 750MB.
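If you want to check a file against the measurable requirements above before uploading, the following is a minimal Python sketch. It is not part of Delphi's tooling: the function name `check_footage` is hypothetical, "750MB" is assumed to mean decimal megabytes, and 1080p and 4K are assumed to mean 1920x1080 and 3840x2160. You supply the width, height, and file size yourself, for example from your video player's file info panel.

```python
from pathlib import Path

# Assumption: "750MB" is read as 750 decimal megabytes.
MAX_BYTES = 750 * 1000 * 1000
ALLOWED_EXTENSIONS = {".mp4", ".webm"}
# Assumption: 1080p means 1920x1080 and 4K means 3840x2160 (UHD).
ALLOWED_RESOLUTIONS = {(1920, 1080), (3840, 2160)}


def check_footage(filename: str, size_bytes: int, width: int, height: int) -> list[str]:
    """Return a list of problems; an empty list means the basic checks pass."""
    problems = []
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append("file must be .mp4 or .webm")
    if size_bytes >= MAX_BYTES:
        problems.append("file must be under 750MB")
    if width <= height:
        problems.append("footage must be landscape")
    if (width, height) not in ALLOWED_RESOLUTIONS:
        problems.append("footage must be 1080p or 4K")
    return problems


if __name__ == "__main__":
    # Example: a 500 MB landscape 1080p .mp4 file.
    print(check_footage("take1.mp4", 500_000_000, 1920, 1080))
```

This only covers format, size, orientation, and resolution. Lighting, framing, camera stability, and audio still need to be reviewed by eye and ear.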
Speaking Details
We suggest filming the first 10-15 seconds with you naturally looking into the camera in silence. This will train the Clone avatar on how to act when not speaking.
Then choose a topic for a 4-minute, continuous impromptu speech, avoiding repetition of phrases and numbers. Maintain a normal speaking pace, neither too fast nor too slow.
Important – Ensure that you pause and close your mouth naturally for 1-2 seconds after each speech segment. You can do this after every couple of sentences.
It’s okay if you make mistakes or mispronounce words. The purpose of speaking is to capture the data related to lip movements.
Body Movements
Keep your head stable. Avoid walking or making exaggerated head movements. Don’t raise your hands above chest level. Slight, natural hand movement is fine as long as your hands do not cover your face at any time.
We suggest filming the avatar while seated. Feel free to stand if you prefer. However, avatars generated from a seated position currently feel the most natural.
Do not turn away from the camera. We will not be able to train your avatar if we lose sight of your mouth at any point.
Facial Expressions
Maintain a slightly pleasant smile during the shooting process. If you record the source material with a smiling expression, you will receive an avatar that appears friendly and approachable. Conversely, if you record with a serious expression, you will receive an avatar that appears more serious in nature. Avoid yawning, touching the face, or bursts of laughter.
Background
We suggest you film in a natural setting. It can be an office, studio, or part of your home. Don’t film in front of a background that is busy or moving (e.g., car traffic behind you). Avoid wearing clothes that blend into the background.
We don’t recommend shooting on a green screen. We do not have the capabilities to replace backgrounds at this time.
Required Consent Statements
The first line you speak at the beginning of the video must be: “I give Delphi consent to use this video to create my avatar.”
If the video does not show the person appearing on camera giving this authorization, we will not be able to proceed with the studio avatar production process.
Equipment Options
Camera – Footage must be shot in 4K or 1080p quality. You can use a professional full-frame camera (Sony FX6, RED KOMODO, etc.) or your smartphone or laptop.
Audio – Do not use Bluetooth headphones (e.g., AirPods) to record audio for your avatar footage.
Lighting – Ensure that the space is well lit. Utilize a three-point lighting technique if you have access to lighting equipment.
Camera Tripod – Make sure the device is mounted on a tripod at eye level to maintain stillness.