`SANA-WM`, a 2.6B open-source world model for 1-minute 720p video
A single image plus a 6-DoF camera path can produce controlled long video on one GPU. Useful signal for product mockups and scene previews, but still closer to R&D than a plug-and-play SaaS feature.
- Input is one image + 6-DoF camera trajectory, so the value is controlled scene movement rather than text-to-video prompting.
- Hybrid Linear Diffusion Transformer mixes frame-level Gated DeltaNet with periodic softmax attention to keep long rollouts coherent.
- Single-GPU 720p 1-minute generation lowers experiment cost, but integration still depends on weights, license, and inference setup.
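
To make the two technical points above concrete, here is a minimal Python sketch. It is not SANA-WM's API or file format: `CameraPose`, `orbit_trajectory`, and `hybrid_layer_schedule` are hypothetical names, and the frame rate, orbit radius, layer count, and softmax interval are illustrative assumptions rather than published settings. The first helper shows what a per-frame 6-DoF camera path might look like as data; the second shows one plausible way to interleave softmax attention among Gated DeltaNet layers.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class CameraPose:
    """One 6-DoF pose: 3 translation + 3 rotation degrees of freedom."""
    x: float
    y: float
    z: float
    yaw: float    # radians about the vertical (y) axis; 0 looks down -z
    pitch: float  # radians
    roll: float   # radians


def orbit_trajectory(seconds=60.0, fps=16, radius=2.0, sweep_deg=90.0):
    """Hypothetical helper: orbit the scene origin while facing it.

    fps, radius and sweep_deg are illustrative, not SANA-WM settings.
    """
    n = int(round(seconds * fps))
    poses = []
    for i in range(n):
        t = i / (n - 1) if n > 1 else 0.0   # progress in [0, 1]
        angle = math.radians(sweep_deg) * t
        x = radius * math.sin(angle)
        z = radius * math.cos(angle)
        # With yaw 0 looking down -z, yaw == angle keeps the origin centered.
        poses.append(CameraPose(x, 0.0, z, angle, 0.0, 0.0))
    return poses


def hybrid_layer_schedule(num_layers=12, softmax_every=4):
    """Hypothetical layout: every softmax_every-th layer uses softmax attention,
    the rest use the linear-time Gated DeltaNet mixer."""
    return [
        "softmax" if (i + 1) % softmax_every == 0 else "gated_deltanet"
        for i in range(num_layers)
    ]
```

With the defaults, `orbit_trajectory()` returns 960 poses (60 seconds at the assumed 16 fps), one per generated frame. The practical point is that the control signal is a dense list of poses, not a text prompt, so any integration needs a way to author or export such a path.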