Research

DUSt3R: Simplifying Geometric 3D Vision

Michael Rubloff

Mar 4, 2024


Researchers from Aalto University and Naver Labs Europe have introduced DUSt3R, a groundbreaking method for Dense Unconstrained Stereo 3D Reconstruction, propelling forward the field of geometric 3D vision.

Imagine trying to recreate the exact shape and texture of objects just from photographs, without knowing where the camera was when the photo was taken. DUSt3R makes this possible, enabling the extraction of complex geometric data from a collection of photographs of a scene that don't necessarily overlap. This leap in technology means that accurately mapping out 3D spaces from images is now easier and more accessible than ever before.

Understanding the camera's original location is pivotal for accurately generating radiance fields, a task at which traditional methods like COLMAP may falter. These methods often struggle to manage gaps in image sequences, a shortcoming that can result in reconstructions that are either incomplete or distorted. This limitation has spurred the quest for more robust solutions capable of adeptly navigating these gaps.

Radiance fields typically depend on Structure from Motion (SfM) pipelines such as COLMAP to determine camera positions. DUSt3R diverges from this tradition with a new approach to multi-view stereo reconstruction (MVS), one that handles gaps in image sequences directly and whose methodology sets it apart from both SfM and conventional MVS techniques.

Traditionally, MVS processes have been hindered by the cumbersome task of estimating camera parameters. DUSt3R sidesteps this issue entirely by utilizing pointmaps derived from image pairs. This strategy allows DUSt3R to reconstruct a complete 3D model without needing to know the camera's specifics upfront, operating directly from the visual content of the images.

DUSt3R thrives on its unique capability to process a set of images (as few as two!), generating dense 3D pointmaps from which essential geometric quantities, such as camera parameters, pixel correspondences, and depth maps, can all be recovered. This leads to a fully consistent 3D reconstruction. Its adaptability is remarkable: it is equally proficient whether given a single image (monocular) or an image pair (binocular), offering a comprehensive solution for reconstructing 3D spaces.
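To make the pointmap idea concrete, here is a toy numpy sketch (not DUSt3R's actual code, and with made-up image sizes): a pointmap is an H×W×3 array holding one 3D point per pixel, so a depth map is simply its z-channel, and quantities like the focal length can be read back out of the pointmap using the pinhole relation u − cx = f·X/Z.

```python
import numpy as np

# Hypothetical image size and focal length for illustration only.
H, W, f = 4, 6, 100.0
cx, cy = W / 2, H / 2

# Build a synthetic pointmap by back-projecting a constant-depth plane.
u, v = np.meshgrid(np.arange(W), np.arange(H))
Z = np.full((H, W), 2.0)              # every pixel 2 m from the camera
X = (u - cx) * Z / f
Y = (v - cy) * Z / f
pointmap = np.stack([X, Y, Z], axis=-1)   # shape (H, W, 3): one 3D point per pixel

# A depth map is just the z-channel of the pointmap.
depth = pointmap[..., 2]

# The focal length can be recovered from the pointmap alone,
# since u - cx = f * X / Z for a pinhole camera.
mask = np.abs(X) > 1e-6                   # skip the column where X == 0
f_est = np.median((u[mask] - cx) * Z[mask] / X[mask])
```

This is why regressing pointmaps is enough: the camera model does not need to be supplied up front, because it is implicitly encoded in the geometry of the points themselves.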

Unlike conventional MVS methods that depend on laboriously estimated camera parameters, DUSt3R regresses dense 3D pointmaps directly from pairs of images. No prior knowledge of the camera's specifications is needed; the geometric information inherent in the images themselves is enough, and the resulting 3D reconstructions are impressively accurate and detailed.

Central to DUSt3R's success is its reliance on a deep learning framework that incorporates Transformer encoders and decoders. This design choice leverages the power of pretrained models, significantly enhancing DUSt3R's capability to decipher and reconstruct complex 3D structures from a broad array of visual inputs. Upon processing pairs of RGB images, DUSt3R outputs pointmaps that meticulously map out the scene's 3D geometry.
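The key mechanism that lets the two decoder branches reason jointly about a pair of images is cross-attention, where tokens from one image attend to tokens from the other. The single-head numpy sketch below is a generic illustration of that mechanism, not DUSt3R's implementation; the token count and embedding size are made up.

```python
import numpy as np

def cross_attention(queries, context, Wq, Wk, Wv):
    """One branch's tokens (queries) attend to the other image's tokens (context)."""
    Q, K, V = queries @ Wq, context @ Wk, context @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])        # scaled dot-product scores
    scores -= scores.max(axis=1, keepdims=True)   # subtract row max for stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True) # softmax over context tokens
    return weights @ V, weights

rng = np.random.default_rng(0)
n_tokens, d = 8, 16                               # hypothetical sizes
tokens_img1 = rng.normal(size=(n_tokens, d))      # decoder branch for image 1
tokens_img2 = rng.normal(size=(n_tokens, d))      # decoder branch for image 2
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out, attn = cross_attention(tokens_img1, tokens_img2, Wq, Wk, Wv)
```

Each output row is a mixture of image 2's tokens weighted by relevance to an image 1 token, which is how information about one view flows into the reconstruction of the other.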

A key innovation in DUSt3R's methodology is its approach to the 3D reconstruction problem as a regression of pointmaps, a strategy that circumvents the limitations of traditional projective camera models. This flexibility allows for a seamless integration of monocular and binocular reconstruction scenarios into a singular, cohesive framework—a notable advancement over existing methods.

For image collections that span more than two views, DUSt3R employs a straightforward yet effective global alignment strategy. This method aligns all pointmaps within a unified reference frame, ensuring the coherence and consistency of the 3D reconstruction across multiple viewpoints. Such a comprehensive perspective is crucial for producing high-fidelity reconstructions in complex scenes.
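DUSt3R's actual global alignment jointly optimizes per-pair poses and scales, but the core operation, expressing one pointmap in another's frame, can be illustrated with a classic similarity-transform fit (the Umeyama/Procrustes solution). The numpy sketch below recovers the scale, rotation, and translation relating two noise-free point sets; treat it as a minimal building block, not DUSt3R's optimizer.

```python
import numpy as np

def umeyama_similarity(src, dst):
    """Least-squares similarity transform (scale s, rotation R, translation t)
    such that dst ≈ s * R @ src + t. Classic Umeyama closed-form solution."""
    mu_s, mu_d = src.mean(0), dst.mean(0)
    xs, xd = src - mu_s, dst - mu_d
    cov = xd.T @ xs / len(src)                    # cross-covariance of the sets
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:  # guard against reflections
        S[2, 2] = -1
    R = U @ S @ Vt
    var_s = (xs ** 2).sum() / len(src)            # variance of the source set
    s = np.trace(np.diag(D) @ S) / var_s
    t = mu_d - s * R @ mu_s
    return s, R, t

# Toy check: recover a known scale, rotation, and translation.
rng = np.random.default_rng(0)
src = rng.normal(size=(50, 3))
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
dst = 2.0 * src @ R_true.T + np.array([1.0, -2.0, 0.5])
s, R, t = umeyama_similarity(src, dst)
```

Aligning every pairwise pointmap into one world frame with transforms like this is what turns a pile of two-view predictions into a single coherent scene.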

https://twitter.com/JeromeRevaud/status/1764035510236758096

DUSt3R's architecture builds on standard Transformer encoders and decoders, inheriting the rich feature representations these models learn from extensive datasets. This pretraining phase is essential: it is what equips DUSt3R to accurately predict the 3D structure of scenes under widely varying conditions.

Extensive testing of DUSt3R across diverse 3D vision tasks has demonstrated its exceptional performance, particularly in monocular/multi-view depth estimation and relative pose estimation. By offering an integrated solution for 3D reconstruction from uncalibrated and unposed images, DUSt3R streamlines the process of geometric 3D vision, enhancing both its efficiency and accessibility.

The publication of DUSt3R's code has already spurred experimentation within the community, including applications like Gaussian Splatting.

https://twitter.com/janusch_patas/status/1764025964915302400

The license allows people to share and adapt the code with attribution, though commercial use is currently not allowed. There has been no indication whether a commercial license will be offered for an additional fee. Their GitHub has instructions for installing a demo, and you can also try DUSt3R on Replicate. Additionally, a PR has been opened to allow videos to be used in the pipeline, and CocktailPeanut has already created a Pinokio instance so you can try it yourself.

As the first end-to-end pipeline of its kind, DUSt3R represents a monumental leap in computer vision, providing a more straightforward alternative to traditional methodologies. With its broad potential for application and its ability to significantly advance 3D reconstruction, DUSt3R stands out as an extremely interesting contribution to the field. Things are moving fast, and as the project page states: DUSt3R makes geometric 3D vision tasks easy. It just might.

