AI Toolkit: Generating Avatars – Artlist

The Artlist AI Toolkit enables you to create AI-powered avatar videos. You can generate realistic talking avatars from images, synchronize speech with facial movement, and localize videos into multiple languages while preserving natural timing and expression.

This guide explains how to use the three main workflows:

Avatar video generation
Lip-sync animation
Video dubbing and translation

How to create an avatar video

An avatar is a digitally generated on-screen presenter or character that can speak, move, and interact like a real person in a video. Our Avatar models let you generate a realistic talking avatar from a single image and an audio file.

Avatars are ideal for:

Presenter videos
Marketing content
Explainer videos
Training materials
Social media content
Product introductions

Select an Avatar AI model

From the top panel on the Homepage, click AI Toolkit.
Click Video from the dropdown menu at the bottom of the prompt field.

Select the desired Avatar model.

Note: Avatar models are currently not available with the AI Agent.

Upload an image

Upload a clear image of the person or character you want to animate.

For best results:

Use a front-facing image
Make sure the face is clearly visible
Use high-resolution images when possible
Avoid blurry or heavily cropped photos
Keep lighting natural and even

Supported image formats include JPG and PNG.
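
If you prepare many images before uploading, a quick local check can catch files that are not really JPG or PNG despite their extension. The following is a minimal sketch, not part of the Artlist product: the function names are hypothetical, and it only tests the standard file signatures for the two formats named above. Other formats may also be accepted by the uploader.

```python
from typing import Optional

# Standard file signatures (the leading bytes of every valid file of that type).
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SIGNATURE = b"\xff\xd8\xff"


def detect_image_format(header: bytes) -> Optional[str]:
    """Return "PNG", "JPG", or None based on the leading bytes of a file."""
    if header.startswith(PNG_SIGNATURE):
        return "PNG"
    if header.startswith(JPEG_SIGNATURE):
        return "JPG"
    return None


def is_supported_image(path: str) -> bool:
    """Hypothetical helper: read the first 8 bytes of a local file and
    report whether they match the JPG or PNG signature."""
    with open(path, "rb") as f:
        return detect_image_format(f.read(8)) is not None
```

For example, calling is_supported_image on a local file path returns False for a GIF that was renamed to end in .png, because the check looks at the file contents rather than the file name.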

Add audio

Upload your audio file. You can choose one of your existing voiceovers or upload a file from your computer.

Note: An audio file is required. Scripts are not supported in the prompt field.

💡 Pro Tip: Finalize your voiceover before using it for an avatar so that the voice matches your character.

Customize video settings

Available settings include:

Aspect ratio
Resolution
Video duration

Add a prompt

Describe your character’s expressions and gestures.

Note: The prompt is optional, and the prompt field is available only for some models.

(Image: prompt box without a prompt field)

(Image: prompt box with the optional prompt field)

Generate the avatar video

Review your settings.
Click Generate. To see how many credits the video requires, hover over the button first.

Once processing is complete, your avatar video will appear in your session.

How to create a lip-sync video

The lip-sync AI models let you animate a face or existing visual so the mouth movements match a provided audio track or voiceover.

Lip-sync is useful for:

Character animation
Social media edits
AI presenters
Meme content

Select a lip-sync AI model

From the top panel on the Homepage, click AI Toolkit.

Click Video from the dropdown menu at the bottom of the prompt field.

Select the desired lip-sync model, for example Lipsync v2 Pro.

Note: Lip-sync models are currently not available with the AI Agent.

Upload your media

Upload:

A video containing a visible face
An audio file or generated voiceover

Note: An audio file or generated voiceover is required. Scripts are not supported in the prompt field.

For best results:

Keep the face clearly visible
Use high-quality audio
Avoid overlapping voices and loud background noise

Customize video settings

Available settings include:

Aspect ratio
Resolution
Video duration

Generate the lip-sync video

Review your uploaded media and settings.
Click Generate. To see how many credits the video requires, hover over the button first.
Wait for processing to complete.

The generated lip-sync video will appear in your session.

How to create a dubbed video

The dubbing AI models let you translate existing videos into other languages while preserving natural lip sync and voice timing.

Dubbed videos are ideal for:

International marketing
Educational content
Creator localization
Business presentations
Training materials

Select a dubbing AI model

From the top panel on the Homepage, click AI Toolkit.
Click Video from the dropdown menu at the bottom of the prompt field.

Select the desired dubbing model.

Note: Dubbing models are currently not available with the AI Agent.

Upload your source video

Upload the original video you want to dub or translate.

For best results:

Use videos with clear speech
Minimize background noise
Keep the speaker visible on screen
Use high-quality audio recordings

Note: The source video can be in any language. It is not limited to English.

Choose the target language

Select the target language for translation. There are more than 50 languages to choose from.

Customize video settings

Available settings include:

Aspect ratio
Resolution
Video duration

Generate the dubbed video

Review your settings.
Click Generate. To see how many credits the video requires, hover over the button first.

When finished, the dubbed video will appear in your session workspace.

Tips for better avatar, lip-sync, and dubbing videos

To improve output quality:

Use clear, high-resolution media
Keep scripts concise and natural
Avoid fast speech and complicated phrasing
Use good lighting in uploaded videos
Reduce background noise for translation workflows
Match aspect ratio to your publishing platform
