`SANA` expands into image, video, and controllable world-model generation

A research repo has become a full training/inference stack for high-res media. Useful for custom pipelines, but heavy for quick SaaS integration.

[ KEY POINTS ]

- SANA-WM adds 720p, 1-minute video with 6-DoF camera control, a good fit for simulation and controllable scene generation.
- SANA-Video supports text-to-video and image-to-video, with LTX-VAE and LTX2 Refiner paths up to 2K output.
- SGLang support exposes high-performance serving through an OpenAI-compatible API, which makes product integration less painful.
- ComfyUI, Hugging Face, diffusers, and training recipes are all mentioned, so the repo is closer to a platform than a single model drop.
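
To make the "OpenAI-compatible API" point concrete, here is a minimal sketch of what a client request could look like. Everything specific in it is an assumption, not something taken from the SANA repo: the server address, the `/images/generations` route (borrowed from the OpenAI Images API shape), the model id, and the payload fields are placeholders. The request is only built, not sent.

```python
import json
import urllib.request

# Assumptions (not from the SANA repo): the server address, the route,
# the model id, and the payload fields are placeholders modeled on the
# OpenAI Images API. Check the SGLang and SANA docs for the real values.
BASE_URL = "http://localhost:30000/v1"   # hypothetical local server
MODEL_ID = "sana-placeholder"            # hypothetical model name


def build_image_request(prompt: str, size: str = "1024x1024") -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style image generation request."""
    payload = {"model": MODEL_ID, "prompt": prompt, "n": 1, "size": size}
    return urllib.request.Request(
        url=f"{BASE_URL}/images/generations",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_image_request("a lighthouse at dusk, wide shot")
print(req.full_url)
print(json.loads(req.data))
# To actually send it: urllib.request.urlopen(req) against a running server.
```

If the server really does follow the OpenAI request shape, existing client code would mostly need a new base URL and model id, which is the integration benefit the key point describes.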

Original: github.com/NVlabs/Sana

// related

#0001 · Generative media · r/LocalLLaMA · 16 hours ago
PrismML ships `Bonsai Image 4B` as 1-bit and ternary image models
60 radar
Bonsai Image 4B: image generation model, 1-bit/ternary browser runtime
A text-to-image model small enough for browser-side WebGPU changes the demo and prototyping surface. Worth testing for local-first image features; quality still needs validation.
- Weights are about 3GB, versus roughly 16GB for FLUX.2 Klein 4B; distribution and local demos get much lighter.
- Binary and Ternary variants target extreme quantization, so the real test is prompt quality, latency, and failure modes.
- Apache-2.0 licensing lowers commercial friction. Browser-local generation can reduce API cost and privacy concerns for small tools.
- The Hugging Face WebGPU demo matters because no backend GPU is required for first-contact testing.
Source: www.reddit.com/r/LocalLLaMA/comments/1togflk/prismml_jus
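
A quick back-of-the-envelope check on the "much lighter" distribution claim, using the two sizes quoted above. The 100 Mbit/s link speed and the 1 GB = 8000 Mbit conversion are assumptions for illustration only, and protocol overhead is ignored.

```python
# Sizes quoted in the post (approximate).
BONSAI_GB = 3.0
FLUX_KLEIN_GB = 16.0

# Assumption for illustration only: a 100 Mbit/s connection, 1 GB = 8000 Mbit.
LINK_MBIT_PER_S = 100.0


def download_seconds(size_gb: float, link_mbit_per_s: float = LINK_MBIT_PER_S) -> float:
    """Time to fetch size_gb over the link, ignoring protocol overhead."""
    return size_gb * 8000.0 / link_mbit_per_s


ratio = FLUX_KLEIN_GB / BONSAI_GB
print(f"size ratio: {ratio:.1f}x smaller")
print(f"Bonsai Image 4B: {download_seconds(BONSAI_GB) / 60:.0f} min")
print(f"FLUX.2 Klein 4B: {download_seconds(FLUX_KLEIN_GB) / 60:.0f} min")
```

Under those assumptions the first-load wait drops from roughly 21 minutes to about 4, which is the difference between a browser demo people will sit through and one they will not.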
#0002 · Generative media · GeekNews · 6 days ago
`OpenShorts`, Free Open-Source Clip Generator for AI UGC Videos
70 radar
OpenShorts: open-source video tool, turns long videos into vertical shorts
Long videos can be turned into vertical shorts without paying another SaaS bill. The self-hosted setup fits repeatable TikTok, Reels, and YouTube Shorts pipelines.
- Self-hosted and open source means lower marginal cost than hosted clip tools once video volume grows.
- Targets TikTok, Reels, and YouTube Shorts, so the output format maps directly to the main short-form channels.
- Clip Generator converts long-form video into 9:16 shorts with moment selection and face tracking, reducing manual edit time.
- Three tools are bundled into one workflow. The useful test is whether it can replace separate clipping, reframing, and UGC-generation steps.
Source: news.hada.io/topic?id=29715
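
The 9:16 reframing step can be sketched as a crop-window calculation. This is a generic illustration, not OpenShorts' actual code: the function name and the idea of feeding in a single horizontal face center per frame are assumptions, and any face tracker could supply that value.

```python
def vertical_crop(frame_w: int, frame_h: int, face_cx: float) -> tuple[int, int, int, int]:
    """Return (x, y, w, h) of a full-height 9:16 crop centered on face_cx.

    face_cx is the horizontal face center in pixels (from any face tracker).
    The window is clamped so it never leaves the frame.
    """
    crop_w = min(frame_w, round(frame_h * 9 / 16))
    x = round(face_cx - crop_w / 2)
    x = max(0, min(x, frame_w - crop_w))
    return x, 0, crop_w, frame_h


print(vertical_crop(1920, 1080, face_cx=960))   # centered face
print(vertical_crop(1920, 1080, face_cx=100))   # face near the left edge
```

In practice the face center would be smoothed over time before cropping, otherwise the window jitters with every detection.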
#0003 · Generative media · r/ClaudeAI · 7 days ago
`Remotion` + `Claude Code` launch-video workflow, no editor required
50 radar
Remotion: React video framework, renders JSX to MP4
Launch videos can be built like React pages, then rendered to MP4. Cheap, repeatable, and useful when you lack design tools.
- Remotion turns JSX into MP4, so Claude Code can generate scenes and animation logic using familiar React patterns.
- The repeatable motion stack is simple: crossfades, one easing curve, grain, vignette, and restrained SFX.
- The practical bar is editing discipline. Kill any scene that does not earn attention within 3 seconds.
Source: www.reddit.com/r/ClaudeAI/comments/1tik0qe/coffee_claude
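
The "crossfades plus one easing curve" idea comes down to a small piece of per-frame math. The sketch below shows that math in Python for illustration. It is not Remotion's API (Remotion projects are written in React/TypeScript), and the choice of a cubic ease-in-out curve and the function names are assumptions.

```python
def ease_in_out_cubic(t: float) -> float:
    """Map linear progress t in [0, 1] to eased progress in [0, 1]."""
    t = max(0.0, min(1.0, t))
    if t < 0.5:
        return 4 * t ** 3
    return 1 - ((-2 * t + 2) ** 3) / 2


def crossfade_opacity(frame: int, start: int, duration: int) -> tuple[float, float]:
    """Return (outgoing, incoming) opacity for a crossfade starting at `start`."""
    progress = ease_in_out_cubic((frame - start) / duration)
    return 1.0 - progress, progress


# A 15-frame crossfade that begins at frame 30.
for f in (30, 37, 45):
    print(f, crossfade_opacity(f, start=30, duration=15))
```

Reusing the same curve for every transition is what keeps the motion consistent across scenes.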
