NVIDIA's GTC has been in full swing, and I've been surprised by how much of a presence radiance fields have had at the conference. There was even a NeRF from Jonathan Stephens in the background of one of Jensen's keynote slides!
Excitingly, we're now seeing the first results of another widely hyped pairing: the meeting of NeRFs and Gaussian Splatting.
Most recently, Google announced SMERF, which offered a middle ground: trained with Zip-NeRF supervision, yet still achieving real-time rendering rates of 100+ fps.
Now Google returns with RadSplat, intertwining NeRFs with the efficiency of Gaussian Splatting and propelling rendering speeds past 900 fps while maintaining photorealistic quality. RadSplat bridges these two worlds and additionally allows the use of fisheye lenses, which Gaussian Splatting does not support.
The method starts out using a NeRF as a prior. As with several of Google's papers, they use Zip-NeRF as the radiance field prior. The NeRF provides a detailed, three-dimensional model of the scene, capturing intricate details such as lighting variations and texture complexity. This model acts as a blueprint, guiding both initialization and supervision.
Beyond serving as a structural model, the NeRF also supervises the optimization of the point-based scene representation. This dual role ensures that the generated images retain a high degree of realism, closely matching the captured scenes.
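To make that dual role concrete, here is a minimal sketch of what NeRF-based supervision could look like in NumPy. The function name, the blending weight, and the use of a plain MSE blend are my own illustrative assumptions, not the paper's actual loss:

```python
import numpy as np

def nerf_supervised_loss(splat_render, nerf_render, gt_image, w_nerf=0.5):
    """Blend two supervision signals for the splat renderer:
    the captured photo and the NeRF's rendering of the same view.
    (Hypothetical weighting scheme, for illustration only.)"""
    photo_loss = np.mean((splat_render - gt_image) ** 2)
    prior_loss = np.mean((splat_render - nerf_render) ** 2)
    return (1 - w_nerf) * photo_loss + w_nerf * prior_loss
```

The point is simply that the NeRF acts as a second "teacher" alongside the ground-truth images, which is what lets the splats inherit the NeRF's handling of view-dependent effects.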
Once that's complete, they move on to integrating 3D Gaussian Splatting into the scene.
They also introduce something called an importance score, which essentially measures how much each gaussian contributes during optimization and removes points that do not contribute significantly to any training view. If a gaussian's score falls below a threshold, it is pruned, and this pruning step is actually run twice. As you might imagine, by identifying and eliminating less significant points within the scene representation, RadSplat significantly lowers the overall gaussian count, leading to a leaner model that is friendlier to both storage and computation.
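The pruning idea can be sketched in a few lines. Assume we have, for each training view, some measure of each gaussian's contribution to that view; the array shape, the max-over-views reduction, and the threshold value here are all my assumptions for illustration:

```python
import numpy as np

def prune_gaussians(positions, opacities, contributions, threshold=0.01):
    """contributions: (num_views, num_gaussians) array, where each entry
    is a gaussian's contribution to one training view (hypothetical).
    Importance = the max contribution across all views; gaussians whose
    importance falls below the threshold are dropped."""
    importance = contributions.max(axis=0)
    keep = importance > threshold
    return positions[keep], opacities[keep], keep
```

Running this twice during training, as the article describes, lets the model shed points that became irrelevant after the first round of optimization settled.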
This runs well for smaller scenes, but to tackle larger environments they add input camera clustering and visibility filtering, offering a scalable path to real-time rendering of large-scale scenes without degrading visual fidelity.
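Camera clustering can be as simple as grouping the training cameras by position. A naive k-means over camera centers is one plausible stand-in for whatever clustering the paper actually uses (the function, `k`, and iteration count are my assumptions):

```python
import numpy as np

def cluster_cameras(cam_positions, k=4, iters=20, seed=0):
    """Naive k-means over (N, 3) camera positions. Each cluster of
    cameras would later get its own visibility list of gaussians."""
    rng = np.random.default_rng(seed)
    centers = cam_positions[rng.choice(len(cam_positions), k, replace=False)]
    for _ in range(iters):
        # Assign each camera to its nearest center.
        dists = np.linalg.norm(cam_positions[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its assigned cameras.
        for j in range(k):
            if (labels == j).any():
                centers[j] = cam_positions[labels == j].mean(axis=0)
    return labels, centers
```

Each resulting cluster covers one region of the large scene, so the renderer only ever needs to consider the gaussians visible from that region.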
There's one additional trick they use, called visibility list-based rendering, which matches the camera cluster closest to the viewpoint a user is currently looking from. Normal rasterization is then run on only the relevant parts of the scene, leading to a huge speed boost.
To achieve its remarkable rendering speeds, RadSplat pairs this with a test-time filtering strategy: the filter adapts in real time to the scene being rendered, ensuring that only the most relevant points contribute to the final image.
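Put together, the test-time lookup is cheap: find the nearest cluster to the current viewpoint and rasterize only the gaussians on that cluster's precomputed visibility list. A minimal sketch, with hypothetical data structures of my own choosing:

```python
import numpy as np

def visible_gaussians(viewpoint, cluster_centers, visibility_lists):
    """Pick the camera cluster nearest the current viewpoint and return
    the precomputed list of gaussian indices visible from that cluster.
    visibility_lists: dict mapping cluster index -> list of gaussian ids
    (a hypothetical layout for illustration)."""
    dists = np.linalg.norm(cluster_centers - viewpoint, axis=1)
    nearest = int(dists.argmin())
    return visibility_lists[nearest]
```

Because the visibility lists are built once, ahead of time, the per-frame cost is just a nearest-center lookup, which is what makes the filtering compatible with 900+ fps rendering.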
Altogether, RadSplat is able to adeptly manage and render vast scenes with numerous points of interest and varying levels of detail, by ensuring that only the most pertinent points are processed for each camera cluster. With both benefits combined, how does that translate? Well, RadSplat renders almost 4 times faster than Gaussian Splatting and 1,000 times faster than Zip-NeRF.
There is currently no code listed, and I imagine this will be submitted for conference consideration, so it might be a while before we see a public release. That said, I do believe we will see independent startups pulling inspiration from RadSplat before then.
Despite its training time (roughly two hours), the strides RadSplat makes in rendering speed and efficiency, rendering nearly 4 times faster than Gaussian Splatting and 1,000 times faster than Zip-NeRF while drastically reducing the number of required gaussians, are monumental.
I often hear about the optimizations and benefits coming to the world of radiance fields, and it truly is amazing to watch the field march towards that vision. The idea that NeRFs and 3D Gaussian Splatting will interact closely in the future just received a massive vote of validation.