Unlocking the Spatial Age: How NeRF Technology is Redefining Our Reality, Memory, and Experience

Benjamin Glasser

Mario Santanilla

Apr 25, 2023

Spatial Age of Media

Franc Lucent shared a NeRF recording of his parents looking at a painting in a museum. In the description, Lucent mentions that during the recording a museum guard approached him and said, "You can't take videos in here," to which Lucent replied, "Oh, I'm not taking a video; it's a 3D photo…" What's most interesting about this encounter is the response he received: "Well, that's ok then!"

https://twitter.com/franclucent/status/1590559140598153216

In recent months, we've seen an influx of posts on social media featuring NeRF recordings, as well as AI tools. These new tools and technologies are shifting the goalposts on what reality, memory, and experience are. NeRFs expand the limits of human vision, letting us live through and remember new kinds of experiences. We were excited to explore their ability to generate a 3D representation of an object, including its shape, texture, lighting, and surface detail. Comparing the current state of the art with the history of the image, we came to a handful of conclusions and speculations on where this technology might lead, and we questioned what happens when these advancements get integrated into our smartphones. Would this redefine the notion of a selfie? Should we still call these devices phones?
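To ground the term: a NeRF (neural radiance field) represents a scene as a function that maps a 3D position and viewing direction to a color and a density, and images are rendered by sampling that function along camera rays and compositing the samples. As a rough, library-agnostic illustration of that compositing step (the numbers below are synthetic, not from any real capture):

```python
import numpy as np

def composite_ray(colors, densities, deltas):
    """Alpha-composite color/density samples taken along one camera ray.

    colors:    (N, 3) RGB values sampled along the ray
    densities: (N,)   volume densities (sigma) at each sample
    deltas:    (N,)   distances between consecutive samples
    """
    # Opacity contributed by each segment: alpha_i = 1 - exp(-sigma_i * delta_i)
    alphas = 1.0 - np.exp(-densities * deltas)
    # Transmittance: how much light survives to reach each sample
    transmittance = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    weights = transmittance * alphas
    # The pixel color is the transmittance-weighted sum of the sampled colors
    return (weights[:, None] * colors).sum(axis=0)

# Toy example: 64 samples through a reddish, semi-transparent volume
n = 64
colors = np.tile([0.8, 0.2, 0.2], (n, 1))
densities = np.full(n, 0.5)
deltas = np.full(n, 0.05)
print(composite_ray(colors, densities, deltas))
```

Training a NeRF amounts to adjusting the underlying function until rays rendered this way reproduce the captured photos from their known camera positions.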

When the earliest photographs were being made, the process could take hours, but as time passed and technology improved, cameras got smaller, faster, and more precise. Nowadays, we have cameras in our pockets that can take high-quality photos and even record video at 4K resolution and 60 frames per second. This has made photography accessible to everyone and has changed the way we interact with images, especially with the rise of social media. As we look to the future, the next frontier will inevitably integrate the third dimension, which is already being explored across several industries through XR. This means we'll be able to capture objects in space and create more realistic and immersive experiences. NeRFs give us insight into the ways this dimensional and temporal expansion may manifest.

Throughout the history of the image and of 3D modeling, we see commonalities and repetitions that give us a point of departure for imagining what this technology has done and will continue to do generationally. Among the crucial trends, we see:

  1. Image production trends toward real-time capture and processing.

  2. A pursuit of ever-higher fidelity.

  3. A desire to access the spatial information within a capture to serve various needs.

Currently, cameras are highly specialized and not intended for NeRFs, which contributes to how slowly NeRFs can be created today. DSLRs are useful for controlled depth of field, GoPros are durable and small for capturing action, RED cameras dominate the cinema market, and smartphones are the ultimate convenience. It would be no surprise if a camera designed specifically for recording NeRFs appears in the future.

This past year saw the rise and fall of a product that offered a glimpse of what a NeRF camera could be. Snap Inc. released a drone called Pixy, essentially a flying camera that functions as a moving tripod and can generate different cinematic portraits. One of its options is to fly around the subject it is capturing: the drone recognizes faces and basic surrounding elements in order to keep the subject centered while performing a 360-degree flyaround. This hardware would be useful for capturing a NeRF, not only because it rotates in a more precise circle around the subject but also because it focuses on both the subject and the surrounding environment, generating a high-definition record of the entire scene. Even though Snap shut down further development and production of Pixy, this type of flying device is starting to appear in other industries. Polestar, the car manufacturer, launched its O2 prototype, which features a drone in the rear section of the car that can automatically fly around the vehicle and along specific paths to aid the driver on the journey and record the trip in a very cinematic way.

Furthermore, it may be time to redefine our current paradigm. The implication of having these devices embedded in everyday objects is that a database spatially mapping the world around us keeps growing. Once it passes a certain threshold, tools can be easily built on top of that foundation. Just as the seemingly infinite storage of data and images eventually gave rise to the information age (culminating most recently in the advent of AI tools like ChatGPT), we may be entering a new paradigm, the spatial age, in which real-time access to high-fidelity virtual spatialization of all our surroundings ushers in a new foundation for technological advancement.

What happens when we are able to capture spatial moments in real time? Once the frictions of technological limitation are lifted and these tools are distributed to the masses, how will this affect future generations? Undoubtedly, the contemporary age was drastically defined by the ability to capture moments in real time and share two-dimensional representations of them. What if these representations aren't static but are in fact live spaces we can inhabit? What happens when we unleash this third dimension on the next generation, letting them experience or re-experience immersive moments spatially?

To organize our explorations of this emerging future, we isolated various areas of interest by imagining a virtual environment with four distinct portals that the user can inhabit in XR. We chose portals as our spatial layout because crossing a threshold enables the juxtaposition of different experiences in the same physical space.

Portal 1: A moment captured and adapted fully with NeRF technology.

For this scene, we captured a bench outside our office so that, in the future, we can both revisit the time and place where we started collaborating on this very article.

While installing all the libraries and repositories needed to produce this NeRF, we began to think of all the memories and anecdotes one could capture and return to. We also saw how each NeRF tool let us focus on different aspects of the scene.
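We won't walk through the full setup here, but for the curious, the shape of the workflow looks roughly like the sketch below. It assumes a nerfstudio-style toolchain driven from Python; the command names and flags are illustrative and vary by tool and version, and the file paths are hypothetical placeholders rather than our actual project files.

```python
# Illustrative capture-to-NeRF pipeline, assuming a nerfstudio-style toolchain.
# Command names/flags vary by version; the paths here are hypothetical.
import subprocess

capture_video = "bench_walkaround.mp4"  # hypothetical phone capture circling the bench
processed_dir = "data/bench"            # extracted frames + estimated camera poses

# 1. Extract frames from the video and estimate camera poses for each frame.
subprocess.run(
    ["ns-process-data", "video", "--data", capture_video, "--output-dir", processed_dir],
    check=True,
)

# 2. Train a radiance field on the processed capture.
subprocess.run(["ns-train", "nerfacto", "--data", processed_dir], check=True)
```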

However, creating this space highlighted two key shortcomings. The first was capture and render times. The time needed to take a NeRF can be narrowed down to about a minute, but even that drastically limits the types of scenes that can be captured: any moving object will be blurred in the final outcome, so a single moment cannot actually be captured. Currently, the scene must be constructed and directed to produce a reasonably desirable result, which limits the potential to truly capture a moment in 3D at the time of writing.

Furthermore, to get our scene into Unity for interaction within XR, we ran a meshing algorithm to turn the volumetric information into hard-surface data. Currently, all of the state-of-the-art libraries are limited in this step, which drastically lowers the quality (and thus the immersion) of the scene being experienced.
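Conceptually, meshing samples the learned density on a regular 3D grid and extracts an iso-surface from it. The toy sketch below stands in a synthetic density blob for a trained NeRF and uses marching cubes (one common choice, not necessarily the exact algorithm our export used) to produce a mesh file that Unity can import:

```python
import numpy as np
from skimage import measure   # pip install scikit-image
import trimesh                # pip install trimesh

# Stand-in for a trained NeRF: a density field sampled on a regular grid.
# Here we fake it with a sphere-shaped blob of density.
grid = np.linspace(-1.0, 1.0, 96)
x, y, z = np.meshgrid(grid, grid, grid, indexing="ij")
density = np.clip(1.0 - np.sqrt(x**2 + y**2 + z**2), 0.0, None) * 50.0

# Extract the iso-surface where the density crosses a chosen threshold.
verts, faces, normals, _ = measure.marching_cubes(density, level=5.0)

# Package the triangles as a mesh and write a file Unity can import.
mesh = trimesh.Trimesh(vertices=verts, faces=faces, vertex_normals=normals)
mesh.export("bench_scene.obj")
print(f"{len(verts)} vertices, {len(faces)} triangles")
```

The exported surface keeps the geometry but loses much of the view-dependent appearance that makes the raw NeRF look convincing, which is a large part of the quality drop described above.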

Portal 2: A space generated by a real-time video

For this space, the user enters a recording of a museum and can walk around the main space, exploring the different dinosaurs on display.

Although our prototype is limited to a 3D video, it isn't a large leap to speculate about the implications of a virtual space that constantly updates with real-time data. Instead of calling ahead to your restaurant of choice, you could visit it virtually to see how busy it is before leaving for dinner, or the restaurant could send you a VR experience of its space when it delivers your food. Alternatively, if you left some notes on your desk at work, just head into the virtual office to retrieve them.

These types of experiences will only become possible once capture and processing times drastically decrease. Currently, NeRFs cannot be created in real time. There are, however, a handful of experiments in which people stitch together multiple NeRFs to prototype a moving scene. This gives some insight into what a moving 3D scene might look and feel like, although the time it takes to create one is far from instant.

Future Portals

Portal 3: An adaptive environment

This space allows novel design tools and design language to emerge. If 3D environments become more widespread, our design systems must adapt accordingly. Here the experience centers on a dynamic billboard of information that updates as the user gets closer: it starts as a full-sized billboard when the user is far away and gradually shrinks to the size of a pamphlet as the user approaches, with the information and design system adapting at each step as more granular details appear.
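As a minimal sketch of the idea (the tiers, distances, scales, and copy below are invented for illustration, not taken from our prototype), the logic maps the viewer's distance to a level of detail and a physical scale:

```python
from dataclasses import dataclass

@dataclass
class DetailTier:
    name: str            # label for the tier
    max_distance: float  # tier applies while the viewer is within this range (meters)
    scale: float         # physical scale relative to the full billboard
    content: str         # what gets rendered at this tier

# Hypothetical tiers: far viewers get a headline, near viewers get pamphlet-level detail.
TIERS = [
    DetailTier("billboard", float("inf"), 1.00, "MUSEUM OPEN TODAY"),
    DetailTier("poster",    30.0,         0.25, "Museum open 9-17 · Dinosaur hall, level 2"),
    DetailTier("pamphlet",  5.0,          0.05, "Full program, map, and ticket prices..."),
]

def tier_for_distance(distance_m: float) -> DetailTier:
    """Pick the most detailed tier whose range still contains the viewer."""
    applicable = [t for t in TIERS if distance_m <= t.max_distance]
    return min(applicable, key=lambda t: t.max_distance)

# Example: as the user walks closer, the sign shrinks and the copy gets denser.
for d in (100.0, 20.0, 2.0):
    tier = tier_for_distance(d)
    print(f"{d:5.1f} m -> {tier.name:9s} scale={tier.scale:.2f} | {tier.content}")
```

In a real scene the same mapping would be evaluated every frame against the headset's position, and the transitions between tiers would be animated rather than discrete.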

In this realm, apps like typespace are popping up to build on this foundation. Responsive design is already integral to modern web applications, where CSS styling adapts to the screen. In that app, however, the type is fixed and only changes when the user opens the design tools. We want to explore the capabilities these new technologies bring to a dynamic, ever-changing world. Mixed reality introduces many more variables, so there is no reason an intelligent design language can't emerge that responds to lighting conditions, variable backgrounds, distance, color, weather, and more.

These types of tools can only be implemented once a proper spatialization of the surrounding environment exists. This exploration therefore plays with the possibilities of dynamic design systems once spatialized 3D information is more easily and widely accessible. Current systems of spatialization for XR are crude and limited, but they will inevitably improve with time.

Portal 4: A space completely generated by modern diffusion models and AI

Cameras, as we've seen in recent years, have become complex devices with their own agency, powered by large datasets. What if our cameras became the creators? This NeRF by Lucent, for example, exemplifies the possibilities of integrating diffusion models to generate new experiences. Instead of tourists flocking en masse to the Louvre to see the same Mona Lisa they could view in better quality and lighting on the internet, perhaps each individual could have a truly unique experience consisting of something never seen by any other human eyes.

This space would be created entirely by diffusion models and would provide novel ways for us to experience spaces. For this technology to exist, NeRF recording and its integration with other tools will have to be much more advanced: capture and processing will need to be effectively instantaneous, the output of higher fidelity, and the result responsive to the environment in which it is displayed.

Furthermore, the way 3D scenes are created in programs like Blender (the default scene starting with a cube) will inevitably change with current AI technology. The default starting point will likely become a text field, where a prompt generates a novel scene from which to begin a project, allowing 3D spaces to be created from a simple line of text. Maybe the input prompt is the emerging camera typology for NeRF technologies?

These speculations provide a jumping-off point for a conversation about how our spatial future may emerge. It will likely not be limited to the insights in this article but will rather be an amalgamation of them all, plus many we could not have imagined. However it takes form, it is essential to begin defining the language for how it might be designed. We begin that conversation here and look forward to exploring what these new tools allow as the emerging spatial age takes shape.
