Paper Title

Panoptic NeRF: 3D-to-2D Label Transfer for Panoptic Urban Scene Segmentation

Paper Authors

Xiao Fu, Shangzhan Zhang, Tianrun Chen, Yichong Lu, Lanyun Zhu, Xiaowei Zhou, Andreas Geiger, Yiyi Liao

Paper Abstract

Large-scale training data with high-quality annotations is critical for training semantic and instance segmentation models. Unfortunately, pixel-wise annotation is labor-intensive and costly, raising the demand for more efficient labeling strategies. In this work, we present a novel 3D-to-2D label transfer method, Panoptic NeRF, which aims to obtain per-pixel 2D semantic and instance labels from easy-to-obtain coarse 3D bounding primitives. Our method utilizes NeRF as a differentiable tool to unify coarse 3D annotations and 2D semantic cues transferred from existing datasets. We demonstrate that this combination allows for improved geometry guided by semantic information, enabling rendering of accurate semantic maps across multiple views. Furthermore, this fusion process resolves label ambiguity of the coarse 3D annotations and filters noise in the 2D predictions. By inferring in 3D space and rendering to 2D labels, our 2D semantic and instance labels are multi-view consistent by design. Experimental results show that Panoptic NeRF outperforms existing label transfer methods in terms of accuracy and multi-view consistency on challenging urban scenes of the KITTI-360 dataset.
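The abstract's claim that labels are "multi-view consistent by design" follows from rendering every 2D label out of a single shared 3D field: the same NeRF volume-rendering weights that composite color along a camera ray can composite per-point class distributions instead. The following is a minimal NumPy sketch of that generic idea, not the authors' implementation; the function name and the `densities`/`semantic_logits`/`deltas` arrays are illustrative assumptions, while the alpha-compositing formula itself is the standard NeRF one.

```python
# Sketch (not the paper's code): render a pixel's semantic label by
# volume-rendering per-sample class distributions along one camera ray.
import numpy as np

def render_semantics_along_ray(densities, semantic_logits, deltas):
    """Composite per-sample semantics into one pixel label distribution.

    densities:       (N,)   volume density sigma_i at each of N ray samples
    semantic_logits: (N, C) per-sample class scores over C classes
    deltas:          (N,)   distances between consecutive samples
    """
    # Standard NeRF compositing: alpha_i = 1 - exp(-sigma_i * delta_i)
    alphas = 1.0 - np.exp(-densities * deltas)
    # Transmittance T_i = prod_{j < i} (1 - alpha_j)
    transmittance = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    weights = transmittance * alphas                              # (N,)

    # Softmax per sample, then accumulate over the ray with the weights
    probs = np.exp(semantic_logits)
    probs /= probs.sum(axis=1, keepdims=True)                     # (N, C)
    pixel_class_probs = (weights[:, None] * probs).sum(axis=0)    # (C,)
    return pixel_class_probs.argmax()

# Toy usage: 4 samples on a ray, 3 semantic classes
rng = np.random.default_rng(0)
label = render_semantics_along_ray(
    densities=rng.uniform(0.1, 2.0, size=4),
    semantic_logits=rng.normal(size=(4, 3)),
    deltas=np.full(4, 0.25),
)
print("rendered semantic label:", label)
```

Because every view queries the same 3D densities and semantics, two cameras looking at the same surface point receive the same label by construction, which is the consistency property the abstract highlights.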
