Generation can be driven by text or images. The model supports compositions of up to three reference images, automatic transitions between start and end frames, stronger prompt adherence, and a native sound-effects experience.
High-quality generation driven by multiple assets
Supports combinations of up to three images and Frames-to-Video (start-to-end frame transitions), making it well suited for product highlight reels and brand short films.
Understands shot, style, and narrative instructions more accurately, reducing off-topic drift and making it easier to iterate repeatedly toward a usable version.
Supports both text-to-video and image-to-video with reference images, as well as multi-image composition generation, making it suitable for integrating brand elements and creating series content.
Connect the node inputs according to the generation mode you need (text, image, or reference generation).
Prompt Input
In the node input field, describe the scene you want to generate.
Generate Results
Click the Generate button and wait briefly for the generated results.
FAQ
How should prompts be written to make it easier to get a usable result?
Use the structure "scene subject + action/event + camera language (shot size/movement) + style/image quality + atmosphere/lighting + duration/aspect ratio (if options are available)," and keep key constraints consistent across iterations.
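The structure above can be sketched as a small helper that joins the recommended fields in order. This is an illustrative snippet only; the function and field names are assumptions, not part of any real API.

```python
def build_prompt(subject, action, camera=None, style=None, mood=None, spec=None):
    """Assemble a video-generation prompt from the recommended structure:
    scene subject + action/event + camera language (shot size/movement)
    + style/image quality + atmosphere/lighting + duration/aspect ratio.
    Optional fields are skipped when not provided."""
    parts = [subject, action, camera, style, mood, spec]
    # Join only the fields that were supplied, in the recommended order.
    return ", ".join(p for p in parts if p)


prompt = build_prompt(
    subject="a ceramic coffee mug on a wooden table",
    action="steam rising slowly as morning light shifts",
    camera="slow push-in, close-up shot",
    style="cinematic, high detail",
    mood="warm, soft window lighting",
    spec="5s, 16:9",
)
print(prompt)
```

Keeping the same field order and constraints (e.g. duration and aspect ratio) across runs makes it easier to compare iterations and converge on a usable version.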
What kind of content is multi-image reference generation suitable for?
Suitable for feeding the model elements such as logos, product images, and scene mood images together, to quickly get a first draft of an ad short film, opening sequence, or product showcase video with a unified visual style.