It converts portrait photos into avatars capable of speaking, singing, and performing
Alibaba has unveiled Wan2.2-S2V (Speech-to-Video), which converts portrait photos into avatars capable of speaking, singing, and performing.
Wan2.2-S2V is part of Alibaba’s Wan2.2 video generation series and can generate high-quality animated videos from a single image and an audio clip.
It offers character animation from portrait, bust, and full-body perspectives, and can dynamically generate character actions and environmental elements based on prompt instructions.
Alibaba says Wan2.2-S2V is powered by audio-driven animation technology that delivers lifelike character performances, ranging from natural dialogue to singing. It can also handle multiple characters within a scene.
The avatars include cartoon characters, animals, and stylised figures.
Wan2.2-S2V combines text-guided global motion control with audio-driven fine-grained local movements to enable natural and expressive character performances across complex and challenging scenarios, says Alibaba.
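To make that workflow concrete, the sketch below shows the three inputs the model works from: a reference image, a driving audio clip, and a text prompt. The `S2VRequest` structure and `generate_avatar_video` function are hypothetical placeholders for illustration only, not part of any published Wan2.2-S2V interface.

```python
# Hypothetical sketch of a speech-to-video request; the names below are
# illustrative placeholders, not a published Wan2.2-S2V API.
from dataclasses import dataclass

@dataclass
class S2VRequest:
    image_path: str  # single reference photo (portrait, bust, or full body)
    audio_path: str  # speech or singing clip driving lip sync and fine local motion
    prompt: str      # text instructions steering global motion and the environment

def generate_avatar_video(req: S2VRequest, output_path: str = "avatar.mp4") -> str:
    """Placeholder: wire this to the released Wan2.2-S2V inference code."""
    raise NotImplementedError

request = S2VRequest(
    image_path="singer.jpg",
    audio_path="vocals.wav",
    prompt="The character performs on a dimly lit stage, swaying with the music",
)
```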
Alibaba’s Wan2.2 is a family of open-source large video generation models built on a Mixture-of-Experts (MoE) architecture, which the company says significantly improves single-click production of cinematic-style videos.
The series includes a text-to-video model, an image-to-video model, and a hybrid model that supports both text-to-video and image-to-video generation within a single framework.
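For readers who want to experiment, the sketch below shows how a Wan text-to-video checkpoint can be driven from Python via the Hugging Face Diffusers `WanPipeline`. The model ID `Wan-AI/Wan2.2-T2V-A14B-Diffusers`, the resolution, and the sampling settings are assumptions and may need adjusting to match the released checkpoints and their model cards.

```python
# Minimal text-to-video sketch using Hugging Face Diffusers' WanPipeline.
# The checkpoint ID and generation settings below are assumptions; check the
# model card of the released Wan2.2 weights for the recommended values.
import torch
from diffusers import WanPipeline
from diffusers.utils import export_to_video

pipe = WanPipeline.from_pretrained(
    "Wan-AI/Wan2.2-T2V-A14B-Diffusers",  # assumed Diffusers-format repo ID
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

frames = pipe(
    prompt="A cinematic shot of a street musician singing at dusk",
    negative_prompt="blurry, distorted, low quality",
    height=480,
    width=832,
    num_frames=81,  # roughly five seconds at 16 fps
    guidance_scale=5.0,
).frames[0]

export_to_video(frames, "wan_t2v_demo.mp4", fps=16)
```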