I often liken NeRFs to magic, in how they capture view-dependent effects. That becomes a bit more literal with today's featured paper: Magic NeRF Lens.
Magic NeRF Lens combines NeRFs with the power of Computer Aided Design (CAD). This alone unlocks so many possibilities; it's exciting! Another major component of Magic NeRF Lens operates in Virtual Reality (VR) and allows the user to photorealistically walk through a space to perform maintenance checks without setting foot on location. This begins to dive into a vertical that really excites me for the future of NeRFs: construction and maintenance. There are massive opportunities for companies to leverage NeRFs at scale while providing a better, cheaper solution than the alternatives.
It makes me think of the amount of trash currently stuck in the US sewer system, where this could provide some automated alleviation (which also doubles as a great band name). Yet again, I am drawn to the paper Language Embedded Radiance Fields (LERF) and see this as a potential match made in automated heaven. Perhaps this could also be applied to remote oil rigs staffed with skeleton crews. Other potential use cases are nuclear or otherwise radioactive power stations that are too dangerous for humans to enter, even in hazmat and protective suits. This latter example is the one the paper uses.
As the authors put it: “they are often complex systems built after decades of planning and implemented at enormous economic cost, scientists and engineers today are actively seeking methods to effectively maintain existing facilities in order to maximize their operational lifetimes and economically upgrade them to new operational standards.”
When you have facilities such as this, it becomes difficult and risky to have humans walking through and doing maintenance on site, especially if no maintenance turns out to be necessary. If only there were a way to view these facilities in photorealistic 3D! Well, that's why you're reading this, but it understates the depth of the problem. Getting accurate model information is critical, and getting it wrong presents a massive safety risk: if the generated models are low quality or misrepresent something, a lot is on the line. The places that would benefit most from Magic NeRF Lens are complicated structures full of small details and important visuals.
Magic NeRF Lens is built on top of Nvidia's Instant-NGP, which makes it all the more impressive. It integrates with CAD files to get the best of both NeRF and CAD through data fusion; CAD alone is not powerful enough to accurately capture all the necessary details.
The Magic Lens concept isn't new. Presented back in 1993 in the paper “Toolglass and magic lenses: the see-through interface,” a magic lens lets a user zero in on a tiny target area while also preserving computational resources.
NeRFing an entire facility in one NeRF, while possible, is extreme overkill when you only need to inspect a single spot. With that in mind: Magic NeRF Lens.
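To make the lens idea concrete, here is a minimal sketch, not the paper's implementation, of how restricting rendering to a crop box around the user's focus point saves compute: rays that miss the box are skipped entirely, and rays that hit it are only sampled along the short segment inside. The focus point, lens size, and ray below are all hypothetical values.

```python
import numpy as np

def ray_aabb_range(origin, direction, box_min, box_max):
    """Slab-method ray/AABB intersection: returns the [t_near, t_far]
    interval along the ray that lies inside the box, or None on a miss."""
    inv_d = 1.0 / direction  # assumes no zero components, for brevity
    t0 = (box_min - origin) * inv_d
    t1 = (box_max - origin) * inv_d
    t_near = np.max(np.minimum(t0, t1))
    t_far = np.min(np.maximum(t0, t1))
    return (t_near, t_far) if t_near < t_far and t_far > 0 else None

# A "lens" centered on the user's focus point: only march rays inside it.
focus = np.array([2.0, 0.5, 1.0])   # hypothetical focus point
half_extent = 0.5                   # hypothetical lens size, in scene units
direction = np.array([0.8, 0.2, 0.4])
direction /= np.linalg.norm(direction)

hit = ray_aabb_range(np.zeros(3), direction,
                     focus - half_extent, focus + half_extent)
if hit is not None:
    t_near, t_far = hit  # only query the NeRF along this short segment
else:
    pass                 # skip the ray entirely: no network queries at all
```

This is the same spirit as the crop box editing functions that show up in the fusion workflow below: keep the expensive volume rendering confined to the region the user actually cares about.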
The paper points out that NeRF is not the only method that could help with this use case. Both LiDAR and Augmented Reality have generated their own solutions, but neither achieves the photorealistic visualization that comes with NeRF.
Another common question: why not use photogrammetry for these tasks? It provides precise measurements, and its workflows have existed for decades. The drawback is that many of the places that would benefit most are filled with metal, reflective surfaces, and shiny objects, all things photogrammetry struggles with. Digital twins could also work, but NeRF has emerged with a faster output thanks to the groundbreaking paper "Instant Neural Graphics Primitives with a Multiresolution Hash Encoding."
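For a feel of why that paper made NeRF so much faster, here is a toy, single-level sketch of the multiresolution hash encoding idea: instead of a dense voxel grid, 3D positions are hashed into a small table of trainable feature vectors. The table size and resolution below are illustrative; the real encoding uses many levels, trilinear interpolation, and backprop through the table.

```python
import numpy as np

TABLE_SIZE = 2**14   # illustrative; the paper scales this much larger
FEATURE_DIM = 2      # trainable features stored per table entry
PRIMES = np.array([1, 2654435761, 805459861], dtype=np.uint64)  # from the paper

# In the real system these entries are learned; random values stand in here.
table = np.random.randn(TABLE_SIZE, FEATURE_DIM).astype(np.float32)

def hash_encode(xyz, resolution=64):
    """Map a 3D point in [0, 1)^3 to a feature vector via spatial hashing."""
    voxel = np.floor(np.asarray(xyz) * resolution).astype(np.uint64)
    index = np.bitwise_xor.reduce(voxel * PRIMES) % TABLE_SIZE
    return table[index]

features = hash_encode([0.25, 0.6, 0.9])  # fed to a tiny MLP in the real system
```

The trick is that the table is tiny compared to a dense grid at the same resolution, which is a big part of why training drops from hours to seconds.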
As the saying goes, "with great resolution comes great VRAM requirements," so Magic NeRF Lens uses two novel contextual volume data visualization and interaction techniques to balance the two. The data fusion process that combines a NeRF with its associated CAD file breaks down into five main steps; the first three might be familiar to you.
1. Preprocessing: In this initial stage, 2D images are prepared for use in the 3D VR model. Since most 2D images don't contain information about their camera poses, an algorithm called Structure from Motion (SfM) is used to estimate them (see the COLMAP sketch after this list). If the cameras can track their own poses, this step can be skipped.
2. Training: The processed data is used to train a model within the instant-ngp framework. This generates an estimate of the scene function and an initial occupancy grid, which is based on a preset axis-aligned bounding box (AABB).
3. Scene Cleaning: This step aims to enhance the quality of NeRF rendering by removing artifacts or errors introduced by the real-world 2D images, such as motion blur or lens distortion. These errors often produce inaccurate parts of the 3D scene, hurting the viewing experience. Users can manually remove areas with prediction errors or low quality. The cleaned-up NeRF model, including the edited density grid and a binary bitmask of that grid, is stored for future use.
4. Scene Alignment: The user aligns the cleaned NeRF model with the CAD model using the NeRF object manipulation and crop box editing functions. This stage focuses on a small part of the scene for accurate object manipulation to avoid overwhelming the VR system with a large rendering volume.
5. Scene Merging: In the final stage, the user validates the alignment by adjusting the field of view (FoV) and transparency of the shader that renders the NeRF images. This allows both the CAD drawing and the NeRF rendering to be visible simultaneously. The user can adjust the alignment and fusion until the two 3D models are spatially aligned, and the relative transformation matrix between the NeRF model and the CAD model is saved for future use (a sketch of such a transform follows the COLMAP example below).
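For step 1, a typical SfM pass looks something like the following COLMAP invocation (the instant-ngp repo wraps a similar pipeline in its scripts/colmap2nerf.py helper; the database and directory paths here are placeholders):

```python
import subprocess

def run(cmd):
    """Run a COLMAP CLI step, failing loudly if it errors."""
    subprocess.run(cmd, check=True)

# 1. Detect local features in every input image.
run(["colmap", "feature_extractor",
     "--database_path", "scene.db", "--image_path", "images"])

# 2. Match features across image pairs.
run(["colmap", "exhaustive_matcher", "--database_path", "scene.db"])

# 3. Incremental SfM: recovers the camera pose for each image,
#    which is exactly what the NeRF training step needs.
run(["colmap", "mapper",
     "--database_path", "scene.db", "--image_path", "images",
     "--output_path", "sparse"])
```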
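And for steps 4 and 5, the saved alignment boils down to a 4x4 homogeneous transform between the two coordinate frames. A minimal sketch, with a made-up rotation and offset standing in for the result of the manual VR alignment:

```python
import numpy as np

def make_transform(scale, rotation, translation):
    """Build a 4x4 homogeneous matrix from a uniform scale,
    a 3x3 rotation, and a translation (a similarity transform)."""
    T = np.eye(4)
    T[:3, :3] = scale * rotation
    T[:3, 3] = translation
    return T

# Hypothetical alignment result: a 90-degree yaw plus an offset
# that maps NeRF-space coordinates into CAD-space coordinates.
yaw_90 = np.array([[0, -1, 0],
                   [1,  0, 0],
                   [0,  0, 1]], dtype=float)
nerf_to_cad = make_transform(1.0, yaw_90, [2.0, 0.0, 0.5])

np.save("nerf_to_cad.npy", nerf_to_cad)  # persisted for future sessions

point_nerf = np.array([1.0, 1.0, 0.0, 1.0])  # homogeneous NeRF-space point
point_cad = nerf_to_cad @ point_nerf         # the same point in CAD space
```

Once that matrix is stored, every later session can fuse the two models without redoing the manual alignment.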
You might be thinking this all sounds great in theory, but there are serious claims being made here. The authors put their theory to the test at the German Electron Synchrotron (DESY), where some of the world’s most advanced particle accelerators are built. They gave five employees access to Magic NeRF Lens, and I've pulled some select feedback below:
“I think the system has a good advantage. It is quite nice to project the NeRF model on the CAD model, as it is a lot more effort to take laser scans of the facility”
“With this system, I see the possibility to test something in theory before you build it in practice. For example, when you have a machine, and you want to test if you have enough space for installing it, it is quite nice you could test everything in the virtual area before you do it in reality”
The code is already publicly available for the intrepid few that want to begin building and can be found here.
While this is a major step forward in bringing NeRFs to maintenance work and sensitive locations, there are still challenges around high-resolution, high-frame-rate output. One way the authors are looking to combat this is foveated rendering, but that too comes with its own sacrifices.
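Foveated rendering is easy to sketch, even though what follows is just a toy illustration of the general idea rather than the authors' implementation: spend full ray density where the user is looking and progressively less in the periphery. The resolution, falloff, and floor probability below are made up.

```python
import numpy as np

def foveated_sample_mask(height, width, gaze, inner_radius=0.15):
    """Probabilistically keep rays: always render near the gaze point,
    and render the periphery at a density that falls off with distance."""
    ys, xs = np.mgrid[0:height, 0:width]
    # Normalized distance of every pixel from the gaze point.
    d = np.hypot(xs / width - gaze[0], ys / height - gaze[1])
    # Full density inside the fovea, tapering to a 5% floor at the edges.
    keep_prob = np.clip(1.0 - (d - inner_radius) / (1.0 - inner_radius),
                        0.05, 1.0)
    return np.random.rand(height, width) < keep_prob

mask = foveated_sample_mask(720, 1280, gaze=(0.5, 0.5))
print(f"rays cast: {mask.mean():.0%} of full resolution")
```

The sacrifice, of course, is that the periphery has to be filled in somehow, whether by upsampling or reprojection, and that is where the quality trade-offs live.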
While Virtual Production has begun to dominate the conversation around NeRF use cases, it's papers like this that really get me excited about the wide range of applications NeRFs present. I believe that new products such as Luma's Flythroughs and Magic NeRF Lens will continue to emerge, disrupting and bringing with them better solutions to problems businesses face every day.