TriplaneGaussian (TGS) Code Released

Michael Rubloff

Jan 18, 2024


TriplaneGaussian (TGS) joins a steadily growing group of methods that enable fast 3D reconstruction from a single-view image. It comes out of VAST AI Research and was previously available only through a Gradio demo in a Hugging Face Space. It rose quickly, appearing as one of Hugging Face's Spaces of the Week and on its trending page. The demo has several preloaded examples to try, or you can let your imagination run free and see what you can generate.

Originally, the team tried to have the method directly predict plain 3D Gaussians, but ran into trouble when generating models from a single image. With that in mind, they pivoted to a hybrid representation that combines the benefits of triplane and point cloud approaches.

To power it, they use two transformer networks: a point cloud decoder and a triplane decoder. The point cloud decoder gives a rough approximation of the object's geometry. Local image features are then integrated into that rough approximation through projection-aware conditioning. This step is critical, as it results in a high-quality point cloud that stays faithful to the original input image.
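As a rough illustration of what conditioning a point cloud on projected image features can look like, here is a minimal NumPy sketch. It assumes a pinhole camera and nearest-neighbor feature lookup, and the function names (`project_points`, `sample_features`, `attach_projection_features`) are hypothetical. TGS itself may use a different camera model, interpolation scheme, and network layers.

```python
import numpy as np

def project_points(points, K, cam_from_world):
    """Project (N, 3) world-space points to pixel coordinates with a pinhole camera.

    K is a 3x3 intrinsics matrix and cam_from_world a 4x4 rigid transform
    (both are illustrative assumptions, not taken from the TGS code).
    """
    R, t = cam_from_world[:3, :3], cam_from_world[:3, 3]
    cam = points @ R.T + t            # points in camera space
    uvw = cam @ K.T                   # homogeneous pixel coordinates
    return uvw[:, :2] / uvw[:, 2:3]   # perspective divide -> (N, 2) as (col, row)

def sample_features(feature_map, uv):
    """Nearest-neighbor lookup of an (H, W, C) feature map at (N, 2) pixel coords."""
    H, W, _ = feature_map.shape
    cols = np.clip(np.rint(uv[:, 0]).astype(int), 0, W - 1)
    rows = np.clip(np.rint(uv[:, 1]).astype(int), 0, H - 1)
    return feature_map[rows, cols]

def attach_projection_features(points, feature_map, K, cam_from_world):
    """Concatenate each point's xyz with the image feature it projects onto."""
    uv = project_points(points, K, cam_from_world)
    return np.concatenate([points, sample_features(feature_map, uv)], axis=1)
```

Each output row holds a point's coordinates followed by the image feature found at its projected pixel, so later layers can reason about geometry and appearance together.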

TriplaneGaussian then passes the point cloud to the triplane decoder.

This decoder employs a deep, ten-layer structure, enabling it to extract more nuanced features from the data. It works by analyzing the positional relationships within the image data, effectively learning how different segments correlate to specific 3D coordinates.

A key element of this process is the integration of point cloud data into the model. By encoding this data into the system’s learnable positional embeddings, the model achieves a heightened level of geometric awareness. This means it can better understand the shapes and contours of the 3D space it's representing.

This spatial understanding is enhanced by augmenting the point cloud data with projection features derived from the input image. The output is further refined by applying PointNet, a neural network designed for processing point cloud data, combined with local pooling. Local pooling distills the large amount of data into more manageable, yet still meaningful, representations.
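To make the PointNet-plus-local-pooling idea more concrete, the sketch below applies one shared per-point layer and then max-pools features among points that fall in the same grid cell. It is a simplified stand-in under stated assumptions: points are normalized to [-1, 1], the grid resolution is arbitrary, and `shared_mlp` and `local_max_pool` are hypothetical names rather than functions from the TGS repository.

```python
import numpy as np

def shared_mlp(features, weight, bias):
    """One PointNet-style layer: the same linear map + ReLU applied to every point."""
    return np.maximum(features @ weight + bias, 0.0)

def local_max_pool(points, features, resolution):
    """Max-pool features over points that share a voxel cell.

    points:   (N, 3) coordinates, assumed to lie in [-1, 1]
    features: (N, C) per-point features
    Returns (N, C): every point receives the pooled feature of its cell.
    """
    # Map coordinates to integer cell indices in [0, resolution - 1].
    idx = np.clip(((points + 1.0) * 0.5 * resolution).astype(int), 0, resolution - 1)
    flat = (idx[:, 0] * resolution + idx[:, 1]) * resolution + idx[:, 2]

    pooled = np.full((resolution ** 3, features.shape[1]), -np.inf)
    np.maximum.at(pooled, flat, features)  # unbuffered max per cell
    return pooled[flat]
```

Pooling locally instead of over the whole cloud keeps neighborhood detail that a single global max would discard, which is the usual motivation for this variant.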

The final, and perhaps most intriguing, step involves an orthographic projection of these features onto three axis-aligned planes. Orthographic projection, a method of displaying 3D objects in two dimensions, is employed here to align the 3D data with the respective X, Y, and Z axes. This alignment is critical for maintaining the integrity of the 3D structure in a 2D framework.

Once projected, the features that land on the same plane are pooled together and enhanced with the model's learnable positional embeddings. This step aligns the detailed image features with the structured point cloud data, resulting in a highly accurate 3D representation.
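The projection-and-pool step described above can be sketched in a few lines of NumPy. This version drops one coordinate per plane (the orthographic projection) and mean-pools the features of points that land in the same plane cell. The mean pooling, the [-1, 1] normalization, and the name `splat_to_triplane` are assumptions for illustration, and the learnable positional embeddings are omitted.

```python
import numpy as np

def splat_to_triplane(points, features, resolution):
    """Orthographically project per-point features onto the XY, XZ and YZ planes.

    points:   (N, 3) coordinates, assumed to lie in [-1, 1]
    features: (N, C) per-point features
    Returns a dict of three (resolution, resolution, C) feature planes.
    """
    idx = np.clip(((points + 1.0) * 0.5 * resolution).astype(int), 0, resolution - 1)
    channels = features.shape[1]
    planes = {}
    # Each plane keeps two axes and discards the third.
    for name, (a, b) in {"xy": (0, 1), "xz": (0, 2), "yz": (1, 2)}.items():
        total = np.zeros((resolution, resolution, channels))
        count = np.zeros((resolution, resolution, 1))
        np.add.at(total, (idx[:, a], idx[:, b]), features)  # sum features per cell
        np.add.at(count, (idx[:, a], idx[:, b]), 1.0)       # points per cell
        planes[name] = total / np.maximum(count, 1.0)       # mean; empty cells stay 0
    return planes
```

In a full model, a learned embedding of shape `(resolution, resolution, C)` per plane would typically be added to each result before it is passed on.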

It's a lot of fun to play around with, and if you download the 3D generation, it will be a .splat file, which gives you quite a few options for what to do with it. The download instructions are on their GitHub, or you can experiment directly on Hugging Face. At the time of posting, there is no licensing information.
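If you want to inspect a downloaded file programmatically, the sketch below parses the 32-byte-per-splat layout used by a popular open-source WebGL splat viewer: three float32 positions, three float32 scales, four uint8 RGBA values, and four uint8 quaternion components. This layout is an assumption, so check it against the file TGS actually produces. The names `SPLAT_DTYPE` and `parse_splat_bytes` are hypothetical.

```python
import numpy as np

# Assumed record layout (32 bytes per splat); not confirmed for TGS output.
SPLAT_DTYPE = np.dtype([
    ("position", "<f4", (3,)),  # x, y, z
    ("scale", "<f4", (3,)),     # per-axis Gaussian scale
    ("rgba", "u1", (4,)),       # color and opacity, 0-255
    ("rotation", "u1", (4,)),   # quantized quaternion
])

def parse_splat_bytes(raw):
    """Interpret raw bytes as an array of splat records under the assumed layout."""
    if len(raw) % SPLAT_DTYPE.itemsize != 0:
        raise ValueError("byte length is not a multiple of the assumed 32-byte record")
    return np.frombuffer(raw, dtype=SPLAT_DTYPE)
```

For a file on disk, `parse_splat_bytes(open(path, "rb").read())` returns a structured array, and `splats["position"]` gives an `(N, 3)` array of centers that can be plotted or exported to another format.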
