World Models & Interactive Video

Open-source real-time AI video just leveled up.
Today, we’re publicly launching SDXL for StreamDiffusion, bringing the most advanced Stable Diffusion model into a highly controllable, open-source, real-time workflow.
This release delivers HD real-time video generation at over 15 FPS on the Daydream platform, with optimized configurations reaching up to 25 FPS.
And because it is fully open source, anyone can extend, remix, and build on it through the Daydream API or our StreamDiffusion fork.
Creators like DotSimulate are already building SDXL-powered tools and workflows on Daydream, showing what is possible when next-generation models meet real-time performance.
Our open-source stack combines multiple research tracks into one cohesive, production-ready system that enables creators to fine-tune every aspect of image quality, temporal stability, and style.
IPAdapters (Image-Prompt Adapters) let you guide your video’s look and feel using any reference image. They work much like LoRAs, but their strength can be adjusted in real time.
Accelerated HED, Depth, Pose, Tile, and Canny ControlNets give you precise control over spatial and compositional details. You can combine multiple ControlNets in one workflow and adjust the strength of each.
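Combining ControlNets can be pictured as a weighted sum: each ControlNet (Depth, Pose, Canny, and so on) contributes a residual to the diffusion model's activations, scaled by its user-set strength. The sketch below is a toy illustration of that idea with plain Python lists — the function name and shapes are invented for clarity, not taken from any particular codebase.

```python
def combine_control_residuals(residuals, strengths):
    """Sum per-ControlNet residuals, each scaled by its strength.

    Conceptual sketch: in a real pipeline each residual is a tensor
    added into the UNet's activations; here each is just a flat list.
    Setting a strength to 0 removes that ControlNet's influence.
    """
    size = len(residuals[0])
    return [
        sum(s * r[k] for s, r in zip(strengths, residuals))
        for k in range(size)
    ]


# Two toy residuals: one from a Depth ControlNet, one from Canny.
depth_res = [0.5, 0.5]
canny_res = [1.0, -1.0]

combined = combine_control_residuals(
    [depth_res, canny_res],
    strengths=[1.0, 0.5],  # full-strength depth, half-strength canny
)
# combined == [1.0, 0.0]
```

Adjusting the per-net strengths at runtime is what lets you, say, keep composition locked to depth while dialing edge guidance up or down frame by frame.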
Prefer the classic SD1.5 model? We’ve paired it with accelerated IPAdapters for smooth, high-framerate style transfer.
Explore the Playground and our API docs, join the Daydream community on Discord, and start creating your own real-time AI video workflows today.