It feels like it was just yesterday that I was talking about large tech companies utilizing NeRFs. That's probably because it was. Well, another day, another multi-billion-dollar tech company enters the NeRF fray. This time it's the Santa Monica-based camera company, Snap.
The feature is based on Snap's paper, Real-Time Neural Light Field on Mobile Devices (MobileR2L), which was featured at CVPR 2023 but was actually first published in December of last year.
As almost all NeRF platforms recommend in one form or another, they have written out these suggestions:
You might be wondering how the Snap team is able to achieve this, as there's been great speculation about how NeRFs will be able to run in real time.
Their method uses teacher-student training, also known as knowledge distillation, with NVIDIA's Instant-NGP as the teacher. First, a larger, more powerful model (the "teacher") is trained. Then, a smaller model (the "student") is trained to mimic the teacher's outputs. The end goal is a smaller model that performs nearly as well as the larger one but is far cheaper to run. This can be broken out into a few steps.
Training the Teacher Model
In this step, Instant-NGP is trained on the NeRF dataset. Once the teacher model is trained, it holds the knowledge of the dataset, i.e., it has learned the patterns, features, and other nuances of the data.
Generate Pseudo Data with the Teacher Model:
Once the teacher is trained, it's used to generate "pseudo data". This step might sound redundant, but there's a method to the madness. Here's why this is valuable:
Data Augmentation: Pseudo data can be thought of as an advanced form of data augmentation. Instead of just rotating, cropping, or applying color adjustments (traditional data augmentation techniques), the teacher model is creating entirely new samples based on what it has learned. This can enrich the dataset, potentially making the subsequent training process more robust.
Knowledge Transfer: By generating pseudo data, the teacher model is indirectly transferring its learned knowledge. When the student model is trained on this data, it's like the student is getting hints or lessons directly from the teacher.
Tailored Data for Mobile Models: The generated pseudo data may contain features or patterns that are especially useful for training the smaller, mobile-friendly student model. It bridges the gap between the high-capacity teacher model and the lightweight student model, providing the latter with data that is tailored to its needs.
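To make the idea of pseudo data a little more concrete, here is a minimal Python sketch. It assumes the teacher can be queried as a function that maps a ray (an origin and a direction) to an RGB color. The names `toy_teacher`, `random_ray`, and `generate_pseudo_data` are hypothetical and exist only for illustration; the toy teacher is a stand-in for a trained Instant-NGP model, not its real API, and the sampling scheme is an assumption rather than the one used in the paper.

```python
import math
import random

def toy_teacher(origin, direction):
    """Hypothetical stand-in for a trained teacher: maps a ray to an RGB color in [0, 1].
    A real teacher would render the ray through the scene it has learned."""
    dx, dy, dz = direction
    return (0.5 * (dx + 1.0), 0.5 * (dy + 1.0), 0.5 * (dz + 1.0))

def random_ray(rng, radius=4.0):
    """Sample a camera position on a sphere and aim the ray at the scene center."""
    # A normalized Gaussian sample is uniformly distributed on the unit sphere.
    v = [rng.gauss(0.0, 1.0) for _ in range(3)]
    norm = math.sqrt(sum(c * c for c in v))
    origin = tuple(radius * c / norm for c in v)
    direction = tuple(-c / norm for c in v)  # unit vector pointing at (0, 0, 0)
    return origin, direction

def generate_pseudo_data(teacher, num_rays, seed=0):
    """Query the teacher along freshly sampled rays to build (ray, color) pairs."""
    rng = random.Random(seed)
    samples = []
    for _ in range(num_rays):
        origin, direction = random_ray(rng)
        samples.append((origin, direction, teacher(origin, direction)))
    return samples

pseudo_data = generate_pseudo_data(toy_teacher, num_rays=1000)
```

The point of the sketch is that the teacher can label as many new rays as you like, so the pseudo dataset can cover viewpoints that were never photographed.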
Training the MobileR2L (Student Model)
Using Pseudo Data and Original Data:
When training the student model (in this case, MobileR2L), it will likely benefit from both the original NeRF dataset and the pseudo data generated by the teacher model.
Training on this combined dataset allows the student model to leverage the raw information from the original dataset and the "wisdom" distilled into the pseudo data by the teacher model.
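As a rough illustration of training on the combined dataset, the Python sketch below fits a deliberately tiny student to both "original" samples and teacher-labeled pseudo samples. Everything here is an assumption made for the example: `scene_color`, `teacher`, and `LinearStudent` are hypothetical, the student is a plain linear model rather than the MobileR2L network, and the toy teacher is assumed to reproduce the scene perfectly.

```python
import random

def scene_color(direction):
    """Hypothetical ground truth: the color seen along a viewing direction."""
    return [0.5 * (d + 1.0) for d in direction]

def teacher(direction):
    """Hypothetical teacher, assumed here to have fit the scene perfectly."""
    return scene_color(direction)

def make_samples(rng, n, label_fn):
    """Build n (direction, color) pairs labeled by label_fn."""
    samples = []
    for _ in range(n):
        direction = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        samples.append((direction, label_fn(direction)))
    return samples

class LinearStudent:
    """Deliberately tiny student: color = W @ direction + b."""
    def __init__(self):
        self.W = [[0.0] * 3 for _ in range(3)]
        self.b = [0.0] * 3

    def predict(self, x):
        return [sum(w * xi for w, xi in zip(row, x)) + b
                for row, b in zip(self.W, self.b)]

    def sgd_step(self, x, target, lr):
        pred = self.predict(x)
        for i in range(3):
            err = pred[i] - target[i]  # gradient of 0.5 * squared error
            for j in range(3):
                self.W[i][j] -= lr * err * x[j]
            self.b[i] -= lr * err

def mse(student, data):
    """Mean squared error of the student over all color channels."""
    total = 0.0
    for x, y in data:
        total += sum((p - t) ** 2 for p, t in zip(student.predict(x), y))
    return total / (3 * len(data))

rng = random.Random(0)
original_data = make_samples(rng, 100, scene_color)  # stands in for captured images
pseudo_data = make_samples(rng, 1000, teacher)       # labels produced by the teacher
combined = original_data + pseudo_data

student = LinearStudent()
loss_before = mse(student, combined)
for epoch in range(20):
    rng.shuffle(combined)
    for x, y in combined:
        student.sgd_step(x, y, lr=0.1)
loss_after = mse(student, combined)
```

Comparing `loss_before` with `loss_after` shows the student converging toward the teacher's outputs. In this toy setup the pseudo data outnumbers the original data ten to one, which mirrors the idea that the teacher supplies most of the training signal.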
Running the MobileR2L (Student Model) Training:
The documentation recommends running this step in a terminal using tmux, which is a terminal multiplexer. This is because the training may take a long time, and using tmux ensures that the training continues even if the connection to the terminal is interrupted. The benchmarking_nerf.sh script is executed to begin training.
Exporting the MobileR2L Model:
After training completes, the MobileR2L model is converted to ONNX format, which is a portable model format that can be used across different deep learning frameworks.
The documentation provides two ways to do this:
The model automatically exports the ONNX files when it converges.
Alternatively, you can manually run a provided script to perform the export.
In either case, you get three ONNX files: sampler, embedder, and model. Once you have your three ONNX files, you are ready to deploy your Snap Lens. Instructions for that step are contained here.
Once you have completed that, you should be good to go! This seems like a no-brainer for brands looking to market on Snap. Now companies can leverage NeRFs to create interactive experiences. One of the big highlights that Snap has chosen to focus on is virtual try-ons, and it makes a lot of sense given their demographic. But I believe there are so many more applications for users.
Snap is the latest company to unveil a NeRF integration in its platform, and I highly doubt it will be the last. As companies begin to understand and implement the technology, I believe it will serve as both a sales driver and a tool for users to interact with and get closer to brands.