World Labs Announces RealTime Frame Model (RTFM)

Michael Rubloff

Oct 16, 2025

World Labs has introduced its latest research preview in spatial AI, RTFM (Real-Time Frame Model), a system that generates and renders video in real time as users explore virtual spaces, all on a single NVIDIA H100 GPU. RTFM hints at what the next era of world models could look like: persistent, interactive, and visually coherent environments that exist and evolve continuously.

To be clear, World Labs is not using Gaussian splatting or any radiance field representation. RTFM is an autoregressive diffusion transformer with spatial memory that generates new frames directly from prior frames, with no need for an explicit geometric representation. It achieves interactive frame rates on a single NVIDIA H100 GPU, a technical milestone given how resource-intensive real-time generative video typically is.
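
World Labs has not published RTFM's implementation, so the following is only a minimal sketch of the autoregressive pattern described above: each new frame is sampled by iteratively denoising from noise, conditioned on recently generated frames and the requested camera motion. All names here (the `generate_frames` helper, the `model` signature, the context length) are hypothetical.

```python
import torch

# Rough, hypothetical sketch of autoregressive frame generation: the next
# frame is denoised from noise, conditioned on a window of prior frames and
# a camera action, with no explicit 3D geometry in the loop.
def generate_frames(model, first_frame, camera_actions, denoise_steps=4, context_len=8):
    frames = [first_frame]
    for action in camera_actions:
        x = torch.randn_like(first_frame)                 # start from pure noise
        context = torch.stack(frames[-context_len:])      # condition on recent frames
        for _ in range(denoise_steps):                    # few-step sampling keeps it real time
            x = model(x, context=context, action=action)  # model predicts a cleaner frame
        frames.append(x)
    return frames
```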

World Labs describes RTFM as blurring the line between reconstruction and generation. When given multiple input views, the model acts more like a reconstruction engine, interpolating between what it’s seen; with fewer inputs, it ventures into imagination, generating unseen perspectives. It’s a framework that learns how to fill gaps in space and time through exposure to massive amounts of video data.

A crucial advancement is RTFM's approach to spatial memory. Each generated frame is positioned in 3D space, allowing the model to retrieve "nearby" frames when rendering new ones. World Labs calls this context juggling: a way for the system to recall the relevant portions of a world efficiently without overloading computation. The result is a form of unbounded persistence, a world that doesn't dissolve when the user looks away. This idea first seemed to catch the public's eye with Google's Genie previews just a couple of months ago.
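
The exact retrieval mechanism is not public; a minimal sketch of the general idea, assuming each generated frame is tagged with a camera position and context is assembled from the spatially nearest frames, might look like the following (all names are hypothetical):

```python
import numpy as np

class SpatialFrameMemory:
    """Hypothetical pose-indexed frame buffer: every generated frame is stored
    with its camera position, and only the spatially nearest frames are pulled
    back in as context for the next generation step."""

    def __init__(self):
        self.frames = []      # generated frames
        self.positions = []   # 3D camera positions, one per frame

    def add(self, frame, position):
        self.frames.append(frame)
        self.positions.append(np.asarray(position, dtype=np.float32))

    def nearby(self, position, k=8):
        # Pick the k frames whose cameras are closest to the query position,
        # so the context stays bounded even as the explored world grows.
        query = np.asarray(position, dtype=np.float32)
        dists = [np.linalg.norm(p - query) for p in self.positions]
        idx = np.argsort(dists)[:k]
        return [self.frames[i] for i in idx]
```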

The implications go beyond rendering pretty scenes. RTFM signals an evolution in how we might experience AI-generated content — not just as pre-rendered clips, but as interactive, persistent worlds that can be explored indefinitely. World Labs’ emphasis on efficiency and scalability means that, while current deployments run on a single GPU, the architecture is built to scale with compute improvements, aligning with The Bitter Lesson.

The World Labs team has made RTFM available to explore here.