AvatarFX is a cutting-edge AI platform for interactive storytelling, empowering users to bring characters to life by simply uploading an image and selecting a voice. Instantly, characters speak, move, and emote with remarkable realism and fluidity.
At the heart of AvatarFX is our state-of-the-art DiT-based diffusion video generation model, trained on a curated dataset and optimized with novel audio conditioning, distillation, and inference strategies. This enables high-fidelity, temporally consistent video generation at impressive speeds and across longer sequences, even with multiple speakers and multiple turns!
Coupled with our in-house audio capabilities, which offer diverse options, AvatarFX delivers a truly immersive and interactive storytelling experience.
Read more about how we built AvatarFX -- and about how we've incorporated robust safety controls into the tool -- at our blog here.
Try it on c.ai
Every video can be generated from a single starting image and an audio clip.
AvatarFX will support multiple characters and multiple turns.