Paper in Workshop: Open-World 3D Scene Understanding with Foundation Models
Segment Any Primitive: Zero-Shot 3D Primitive Segmentation from Point Cloud
Zhengtao Zhang
Primitive segmentation of 3D point clouds is pivotal for advancing 3D scene understanding; the primary challenge is handling unstructured, complex object environments in a zero-shot setting. Current methods often rely on real images or large training datasets, which limits their scalability and generalization. To overcome these limitations, we propose Segment Any Primitive (SAP), a zero-shot 3D primitive segmentation framework that does not rely on real images or training modules. SAP introduces a multiview rendering strategy with dual-feature fusion that establishes precise back-projection relationships between rendered images and the point cloud. Using Segment Anything 2 (SAM2) to segment the rendered images, SAP integrates the resulting mask features with geometric priors and performs fine-grained segmentation with a hybrid affinity graph-clustering algorithm. Unlike existing approaches, SAP eliminates the need for labor-intensive dataset preparation and parameter tuning, and it generalizes better to unknown objects and scenes. Experimental results, validated on a robust evaluation benchmark, demonstrate that SAP outperforms existing zero-shot methods on key metrics, offering a potential solution for real-world robotic applications.
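The back-projection and affinity-clustering idea can be illustrated with a minimal sketch. This is not the SAP implementation: the function names (`project_points`, `mask_affinity`, `cluster`), the pinhole camera model, the co-occurrence affinity, and the threshold of 0.5 are all assumptions made for illustration. Occlusion handling and the geometric priors that the abstract describes as part of the hybrid affinity are omitted, and the 2D masks are taken as given label images rather than produced by SAM2.

```python
import numpy as np

def project_points(points, K, R, t):
    """Pinhole projection (assumed camera model): pixel coords and depth per point."""
    cam = points @ R.T + t                 # world frame -> camera frame
    z = cam[:, 2]
    z_safe = np.where(z > 0, z, 1.0)       # avoid division by zero; filtered later
    uv = (cam @ K.T)[:, :2] / z_safe[:, None]
    return np.round(uv).astype(int), z

def mask_affinity(points, views):
    """Fraction of shared views in which two points land in the same 2D mask.

    views: list of (K, R, t, mask); mask is an (H, W) integer label image,
    0 = background, as could be produced by any 2D segmenter.
    """
    n = len(points)
    same = np.zeros((n, n))
    seen = np.zeros((n, n))
    for K, R, t, mask in views:
        uv, z = project_points(points, K, R, t)
        h, w = mask.shape
        ok = (z > 0) & (uv[:, 0] >= 0) & (uv[:, 0] < w) \
                     & (uv[:, 1] >= 0) & (uv[:, 1] < h)
        labels = np.zeros(n, dtype=int)
        labels[ok] = mask[uv[ok, 1], uv[ok, 0]]   # back-project mask ids to points
        vis = labels > 0
        both = np.outer(vis, vis)                 # pairs labeled in this view
        seen += both
        same += both & (labels[:, None] == labels[None, :])
    return np.divide(same, seen, out=np.zeros_like(same), where=seen > 0)

def cluster(affinity, threshold=0.5):
    """Connected components of the thresholded affinity graph (illustrative choice)."""
    n = len(affinity)
    labels = -np.ones(n, dtype=int)
    current = 0
    for seed in range(n):
        if labels[seed] >= 0:
            continue
        labels[seed] = current
        stack = [seed]
        while stack:
            i = stack.pop()
            for j in np.nonzero(affinity[i] >= threshold)[0]:
                if labels[j] < 0:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels
```

In a fuller pipeline, the mask-based affinity would presumably be combined with a geometric term (for example, normal or curvature similarity between neighboring points) before clustering; how SAP weights the two is not specified in the abstract.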