Poster

Dr. Splat: Directly Referring 3D Gaussian Splatting via Direct Language Embedding Registration

JUNSEONG KIM · GeonU Kim · Kim Yu-Ji · Yu-Chiang Frank Wang · Jaesung Choe · Tae-Hyun Oh

Highlight
Sat 14 Jun 8:30 a.m. PDT — 10:30 a.m. PDT

Abstract:

We introduce Dr. Splat, a novel approach to open-vocabulary 3D scene understanding that builds on 3D Gaussian Splatting (3DGS). Unlike existing language-embedded 3DGS methods, which rely on a rendering process, our method directly associates language-aligned CLIP embeddings with 3D Gaussians for holistic 3D scene understanding. The key to our method is a language feature registration technique in which CLIP embeddings are assigned to the dominant Gaussians intersected by each pixel ray. Moreover, we integrate Product Quantization (PQ) trained on general large-scale image data to represent embeddings compactly without per-scene optimization. Experiments demonstrate that our approach significantly outperforms existing approaches on 3D perception benchmarks, including open-vocabulary 3D semantic segmentation, 3D object localization, and 3D object selection. Code will be publicly available if accepted.
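
The two ideas in the abstract, registering pixel features onto dominant Gaussians and compressing them with Product Quantization, can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes that each pixel ray already comes with the indices and blending weights of the Gaussians it intersects, that "dominant" means the `top_k` highest-weight Gaussians, that features are combined by a weighted average and then renormalized, and that the PQ codebooks were trained elsewhere. The names `register_features`, `pq_encode`, `pq_decode`, and `top_k` are hypothetical.

```python
import numpy as np

def register_features(ray_ids, ray_weights, pixel_feats, num_gaussians, top_k=1):
    """Assign each pixel feature to the top_k highest-weight Gaussians on its ray.

    ray_ids[i]     : indices of the Gaussians intersected by ray i (assumed given)
    ray_weights[i] : blending weight of each of those Gaussians (assumed given)
    pixel_feats    : (num_rays, dim) array, e.g. CLIP embeddings per pixel
    """
    dim = pixel_feats.shape[1]
    feat_sum = np.zeros((num_gaussians, dim))
    weight_sum = np.zeros(num_gaussians)
    for ids, w, f in zip(ray_ids, ray_weights, pixel_feats):
        ids = np.asarray(ids)
        w = np.asarray(w, dtype=float)
        keep = np.argsort(w)[::-1][:top_k]  # dominant Gaussians on this ray
        np.add.at(feat_sum, ids[keep], w[keep, None] * f)
        np.add.at(weight_sum, ids[keep], w[keep])
    seen = weight_sum > 0  # Gaussians that received at least one feature
    feat_sum[seen] /= weight_sum[seen, None]
    norms = np.linalg.norm(feat_sum[seen], axis=1, keepdims=True)
    feat_sum[seen] /= np.maximum(norms, 1e-12)  # unit length, as for CLIP features
    return feat_sum, seen

def pq_encode(x, codebooks):
    """Encode (N, M * d_sub) vectors as (N, M) centroid indices.

    codebooks : (M, K, d_sub) array of pre-trained centroids (assumed given)
    """
    M, K, d_sub = codebooks.shape
    sub = x.reshape(len(x), M, d_sub)
    # Squared distance from every sub-vector to every centroid: (N, M, K)
    d2 = ((sub[:, :, None, :] - codebooks[None]) ** 2).sum(axis=-1)
    return d2.argmin(axis=-1)

def pq_decode(codes, codebooks):
    """Rebuild approximate vectors from (N, M) centroid indices."""
    M = codebooks.shape[0]
    return np.concatenate([codebooks[m, codes[:, m]] for m in range(M)], axis=1)

# Toy scene: 3 Gaussians, 2 rays, 2-D features standing in for CLIP embeddings.
ray_ids = [[0, 1, 2], [1, 2]]
ray_weights = [[0.7, 0.2, 0.1], [0.9, 0.1]]
pixel_feats = np.array([[1.0, 0.0], [0.0, 1.0]])
feats, seen = register_features(ray_ids, ray_weights, pixel_feats, num_gaussians=3)

# Toy codebooks: M=2 sub-spaces, K=2 centroids each, d_sub=1.
codebooks = np.array([[[0.0], [1.0]], [[0.0], [1.0]]])
codes = pq_encode(feats[seen], codebooks)
approx = pq_decode(codes, codebooks)
```

Because features live on the Gaussians themselves, a text query can then be compared against `approx` (or the stored codes) directly in 3D, with no rendering pass. Storing one small integer per sub-space instead of a full float vector is what keeps the per-Gaussian memory cost low.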
