
Michael Rubloff
Jun 26, 2025
AI startup Schemata has come out of stealth with a simple but audacious goal: connect reasoning models to radiance fields. Its platform layers 3D scene understanding on top of high-fidelity radiance field models, with the aim of powering virtual training and simulation applications for defense and enterprise customers.
When continuous radiance fields burst onto the graphics stage in 2020, we finally had photorealistic 3D representations of the world. What we did not have, as CEO James Brown now jokes, was anything smarter than “gorgeous static scenes.” After watching those early NeRF demos in graduate school at Stanford, Brown teamed up with AI researcher Huy Nguyen to build what they see as the next phase of radiance fields. “Photorealistic 3D reconstruction is step one,” they told me last week. “Step two is letting people work with 3D data as freely as they already work with text or video: querying it, editing it, and building entire applications on top of it.”

Schemata’s pipeline remains proprietary while the team races toward full-scale production, but the outline is clear: capture video, reconstruct a high-fidelity radiance field model from it, then run an internal 3D scene understanding model that builds a hierarchy of objects, parts, and affordances that downstream applications and analytics packages can tap into in real time.
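Schemata has not published its internals, so the following is only a minimal sketch of what such an object/part/affordance hierarchy might look like in practice; every class name, label, and method here is hypothetical, not Schemata’s API.

```python
from dataclasses import dataclass, field

@dataclass
class SceneNode:
    """One node in a hypothetical object/part/affordance hierarchy."""
    label: str                                            # e.g. "door", "handle"
    affordances: list[str] = field(default_factory=list)  # e.g. ["open", "lock"]
    children: list["SceneNode"] = field(default_factory=list)

    def find(self, affordance: str) -> list["SceneNode"]:
        """Collect every node in this subtree offering the given affordance."""
        hits = [self] if affordance in self.affordances else []
        for child in self.children:
            hits.extend(child.find(affordance))
        return hits

# A toy scene: a room containing a door whose handle can be grasped and turned.
scene = SceneNode("room", children=[
    SceneNode("door", affordances=["open"], children=[
        SceneNode("handle", affordances=["grasp", "turn"]),
    ]),
])

print([n.label for n in scene.find("turn")])  # -> ['handle']
```

The point of a structure like this is that a downstream simulation or analytics package never touches raw geometry; it asks the scene what can be opened, grasped, or turned and gets back semantically meaningful handles into the reconstruction.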
Something we have tracked here at radiancefields.com for a while is that the broader research arc supports the bet. In 2020, NeRF proved we could compress photoreal geometry into a single network, but it left the scene devoid of semantics. Over 2022 and 2023, CLIP-infused systems such as OpenScene and Language Embedded Radiance Fields (LERF) began wiring language into point clouds and radiance fields, letting researchers ask for “every red thing” in a living room with zero additional labels (a stripped-down sketch of that kind of query follows below). And just last year, Group Anything with Radiance Fields (GARField) used Segment Anything to lift multiscale object groupings out of 3D space, paving the way to editable, object-centric worlds. The direction is clear: tomorrow’s 3D engines will merge rendering, reasoning, and language into one differentiable world model.
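For a flavor of what “zero additional labels” means in practice, here is a stripped-down version of a LERF/OpenScene-style query: score per-point CLIP features against a free-form text prompt. The per-point features below are random placeholders (a real pipeline distills them from the radiance field during training), and LERF’s actual relevancy score also weighs canonical negative phrases, which this sketch omits.

```python
import torch
import open_clip

# Load a CLIP model and tokenizer via the open_clip library.
model, _, _ = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k")
tokenizer = open_clip.get_tokenizer("ViT-B-32")

# Stand-in for per-point language features distilled into the scene (N x 512).
point_features = torch.nn.functional.normalize(torch.randn(10_000, 512), dim=-1)

with torch.no_grad():
    text = tokenizer(["every red thing"])
    query = torch.nn.functional.normalize(model.encode_text(text), dim=-1)

# Cosine similarity gives a per-point relevancy; threshold to select matches.
relevancy = (point_features @ query.T).squeeze(-1)
mask = relevancy > 0.25  # threshold is illustrative, not tuned
print(f"{mask.sum().item()} of {len(mask)} points match the query")
```

Because the query is just a text embedding, swapping “every red thing” for any other phrase requires no retraining and no new labels, which is exactly what makes this line of work attractive as a foundation for scene understanding.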
Schemata believes it can be the first commercial bridge to that future. Early contracts within military training pipelines show that photorealistic trainers built in a few days can sharply cut the time and cost of the virtual training and simulation programs on which the DoD spends $14B every year. Early commercial pilots support the thesis that highly regulated industries like manufacturing and oil and gas want the same efficiency gains.
Nguyen is already looking past virtual training. “We see a world where spatially aware AI agents move through real space with the same contextual fluency that large language models now enjoy in the browser,” he says. “Virtual training is just the first stop.” If that vision lands, Schemata could supply the missing glue between photoreal rendering and high-level reasoning.