SkyReels V2 is an open-source video diffusion model for long-form generation whose output can, in principle, be extended indefinitely. It supports text-to-video (T2V) and image-to-video (I2V) and emphasizes cross-shot consistency and narrative coherence.
- Key technical mechanisms
  - Diffusion Forcing Transformer: generates long sequences with a sliding window, conditioning on the last few frames plus the text prompt to produce the subsequent frames. Slight noise is injected into the past frames to suppress temporal error accumulation and stabilize long-horizon rollout.
  - Serialized long-video generation: window rolling and conditional continuation make the length unlimited in theory; in practice, outputs already exceed 30 seconds.
- Modes and capabilities
  - T2V (text-to-video) and I2V (image-to-video); E2V (element-to-video) is also supported, composing multiple reference elements.
  - Suited for cinematic storytelling, multi-shot camera movement, and scenarios that require long-term temporal consistency.
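
The sliding-window rollout described under the Diffusion Forcing Transformer can be illustrated with a minimal sketch. This is not the SkyReels V2 code or API: `toy_generator`, `rollout`, and every parameter name (`context_len`, `chunk_len`, `sigma`) are hypothetical, frames are plain lists of floats standing in for latents, and the first chunk is assumed to be given (for example, from an initial T2V or I2V pass). It only shows the control flow: condition on lightly noised copies of the last few frames, generate the next chunk, append, and repeat.

```python
import random
from typing import Callable, List

Frame = List[float]  # toy stand-in for one latent video frame


def add_noise(frame: Frame, sigma: float, rng: random.Random) -> Frame:
    """Return a copy of `frame` perturbed by Gaussian noise with std `sigma`."""
    return [x + rng.gauss(0.0, sigma) for x in frame]


def toy_generator(context: List[Frame], prompt: str, n_new: int) -> List[Frame]:
    """Hypothetical stand-in for the model: continues from the last context frame."""
    last = context[-1]
    return [[x + 0.1 * (i + 1) for x in last] for i in range(n_new)]


def rollout(
    first_chunk: List[Frame],
    prompt: str,
    total_frames: int,
    context_len: int = 2,
    chunk_len: int = 4,
    sigma: float = 0.05,
    generate: Callable[[List[Frame], str, int], List[Frame]] = toy_generator,
    seed: int = 0,
) -> List[Frame]:
    """Extend a video chunk by chunk, conditioning on noised recent frames."""
    if not first_chunk:
        raise ValueError("first_chunk must contain at least one frame")
    rng = random.Random(seed)
    video = [list(f) for f in first_chunk]
    while len(video) < total_frames:
        # Only the conditioning copies are noised; stored frames stay clean.
        context = [add_noise(f, sigma, rng) for f in video[-context_len:]]
        n_new = min(chunk_len, total_frames - len(video))
        video.extend(generate(context, prompt, n_new))
    return video


if __name__ == "__main__":
    frames = rollout([[0.0]], "a slow pan over a harbor", total_frames=10)
    print(len(frames))
```

Because each step depends only on the last `context_len` frames, the loop can run for any `total_frames`, which is the sense in which the length is unlimited in theory. Setting `sigma=0.0` disables the noise injection and makes the toy rollout deterministic.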