Paper Title

Image Understands Point Cloud: Weakly Supervised 3D Semantic Segmentation via Association Learning

Paper Authors

Tianfang Sun, Zhizhong Zhang, Xin Tan, Yanyun Qu, Yuan Xie, Lizhuang Ma

Paper Abstract

Weakly supervised point cloud semantic segmentation methods, which require 1\% or fewer labels while aiming to achieve almost the same performance as fully supervised approaches, have recently attracted extensive research attention. A typical solution in this setting is to use self-training or pseudo labeling to mine supervision from the point cloud itself, but this ignores the critical information available in images. In fact, cameras are widely present in LiDAR scenarios, and this complementary information is greatly important for 3D applications. In this paper, we propose a novel cross-modality weakly supervised method for 3D segmentation that incorporates complementary information from unlabeled images. Specifically, we design a dual-branch network equipped with an active labeling strategy to maximize the power of the tiny fraction of labels and to directly realize 2D-to-3D knowledge transfer. We then establish a cross-modal self-training framework from an Expectation-Maximization (EM) perspective, which iterates between pseudo-label estimation and parameter updating. In the M-step, we propose cross-modal association learning to mine complementary supervision from images by reinforcing the cycle-consistency between 3D points and 2D superpixels. In the E-step, a pseudo-label self-rectification mechanism is derived to filter noisy labels, providing more accurate labels for the networks to be fully trained. Extensive experimental results demonstrate that our method even outperforms state-of-the-art fully supervised competitors with less than 1\% actively selected annotations.
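
The core idea named in the abstract, cross-modal association learning that reinforces cycle-consistency between 3D points and 2D superpixels, can be pictured with a small sketch. The snippet below is only an illustration of that general idea under the assumption that each modality yields per-point and per-superpixel feature vectors; the function name cycle_consistency_loss and its arguments (point_feats, superpixel_feats, temperature) are hypothetical and not the authors' actual implementation.

```python
# Minimal sketch of a 3D-point <-> 2D-superpixel cycle-consistency loss in plain
# PyTorch. All names and the temperature value are illustrative assumptions.
import torch
import torch.nn.functional as F


def cycle_consistency_loss(point_feats: torch.Tensor,
                           superpixel_feats: torch.Tensor,
                           temperature: float = 0.07) -> torch.Tensor:
    """Penalize 3D -> 2D -> 3D round trips that do not return to the start point.

    point_feats:      (N, D) features of N 3D points.
    superpixel_feats: (M, D) features of M 2D superpixels.
    """
    p = F.normalize(point_feats, dim=1)        # (N, D)
    s = F.normalize(superpixel_feats, dim=1)   # (M, D)

    # Soft associations in both directions via feature similarity.
    p2s = F.softmax(p @ s.t() / temperature, dim=1)   # (N, M), rows sum to 1
    s2p = F.softmax(s @ p.t() / temperature, dim=1)   # (M, N), rows sum to 1

    # Round-trip transition matrix; a cycle-consistent model keeps it near identity.
    roundtrip = p2s @ s2p                              # (N, N), rows sum to 1
    targets = torch.arange(p.size(0), device=p.device)
    return F.nll_loss(torch.log(roundtrip + 1e-8), targets)


# Toy usage with random features, just to show the expected tensor shapes.
if __name__ == "__main__":
    loss = cycle_consistency_loss(torch.randn(1024, 64), torch.randn(256, 64))
    print(loss.item())
```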
