Paper Title

Robust Double-Encoder Network for RGB-D Panoptic Segmentation

Paper Authors

Matteo Sodano, Federico Magistri, Tiziano Guadagnino, Jens Behley, Cyrill Stachniss

Paper Abstract

Perception is crucial for robots that act in real-world environments, as autonomous systems need to see and understand the world around them to act properly. Panoptic segmentation provides an interpretation of the scene by computing a pixelwise semantic label together with instance IDs. In this paper, we address panoptic segmentation using RGB-D data of indoor scenes. We propose a novel encoder-decoder neural network that processes RGB and depth separately through two encoders. The features of the individual encoders are progressively merged at different resolutions, such that the RGB features are enhanced using complementary depth information. We propose a novel merging approach called ResidualExcite, which reweighs each entry of the feature map according to its importance. With our double-encoder architecture, we are robust to missing cues. In particular, the same model can train and infer on RGB-D, RGB-only, and depth-only input data, without the need to train specialized models. We evaluate our method on publicly available datasets and show that our approach achieves superior results compared to other common approaches for panoptic segmentation.
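
Since the abstract only sketches how ResidualExcite merges the two encoder streams, below is a minimal, hypothetical PyTorch sketch of one way to reweigh each entry of the depth feature map and add it residually to the RGB features at a single resolution. The class name ResidualExciteFusion, the 1x1-convolution gate, and the tensor shapes are assumptions made for illustration; they are not the authors' exact formulation.

```python
# Hypothetical sketch of a ResidualExcite-style fusion block (not the paper's
# exact implementation): depth features are reweighed per entry by a learned
# gate and added to the RGB features as a residual.
import torch
import torch.nn as nn


class ResidualExciteFusion(nn.Module):
    """Assumed fusion: per-entry excitation of depth features + residual add."""

    def __init__(self, channels: int):
        super().__init__()
        # A 1x1 convolution followed by a sigmoid yields an importance weight
        # in [0, 1] for every spatial position and channel of the depth map.
        self.gate = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, rgb_feat: torch.Tensor, depth_feat: torch.Tensor) -> torch.Tensor:
        # Element-wise reweighing of the depth features, then residual addition,
        # so the RGB features are enhanced by complementary depth information.
        weights = self.gate(depth_feat)
        return rgb_feat + weights * depth_feat


if __name__ == "__main__":
    fusion = ResidualExciteFusion(channels=64)
    rgb = torch.randn(1, 64, 60, 80)    # RGB encoder features at one resolution
    depth = torch.randn(1, 64, 60, 80)  # depth encoder features at the same resolution
    fused = fusion(rgb, depth)
    print(fused.shape)  # torch.Size([1, 64, 60, 80])
```

In such a design, zeroing out the depth input leaves the RGB features intact through the residual path, which is consistent with the paper's claim that the same model can run on RGB-only or depth-only data without retraining.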
