Original

Wan 2.1 FantasyTalking - Audio-Driven - KJ

Last updated: 2025-11-19

wan_fantasytalking: an audio‑driven video generation model for lip‑synced digital humans. Given a single portrait image plus an audio clip, it produces a high‑fidelity talking video with strict lip synchronization and natural head motion and facial expressions, emphasizing identity consistency and temporal coherence.


Input/Output: single portrait + audio → talking video; focuses on three aspects: lip‑sync accuracy, identity preservation, and natural motion/expressions.


Lip‑sync and temporal modeling: uses features derived from the audio (e.g., speech and phoneme cues, which correspond to visemes, the visible mouth shapes) to drive the mouth and facial regions, and couples head motion and expressions with them to avoid the “lips‑only” uncanny effect.
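
As a rough illustration of the timing side of lip sync, the sketch below maps each video frame to the span of audio samples it covers, which is the basic alignment any audio-driven pipeline needs before extracting per-frame audio features. It is not the model's implementation: the function names, the 25 fps frame rate, and the 16 kHz sample rate are assumptions chosen for the example.

```python
# Hypothetical helpers: names and default values are illustrative assumptions,
# not part of wan_fantasytalking or any workflow node.

def audio_window_for_frame(frame_idx: int, fps: int = 25, sample_rate: int = 16000) -> tuple[int, int]:
    """Return the [start, end) audio sample range that video frame `frame_idx` covers."""
    start = frame_idx * sample_rate // fps
    end = (frame_idx + 1) * sample_rate // fps
    return start, end


def frame_count(num_samples: int, fps: int = 25, sample_rate: int = 16000) -> int:
    """Number of video frames needed to cover `num_samples` of audio (rounded up)."""
    # Integer ceiling division avoids floating-point rounding issues.
    return -(-num_samples * fps // sample_rate)


if __name__ == "__main__":
    one_second = 16000  # samples in one second at the assumed 16 kHz rate
    print(frame_count(one_second))
    print(audio_window_for_frame(0), audio_window_for_frame(1))
```

With these assumed rates, each frame covers 640 audio samples, so a clip whose length is not a multiple of 640 samples gets one extra, partially covered final frame.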

Workflow details
Type: Workflow
Rating: 5
Published: 2025-10-14
Status: Runnable
Nodes: 23