Paper Title

LiDAR-based 4D Panoptic Segmentation via Dynamic Shifting Network

Authors

Fangzhou Hong, Hui Zhou, Xinge Zhu, Hongsheng Li, Ziwei Liu

Abstract

With the rapid advances of autonomous driving, it becomes critical to equip its sensing system with more holistic 3D perception. However, existing works focus on parsing either the objects (e.g. cars and pedestrians) or scenes (e.g. trees and buildings) from the LiDAR sensor. In this work, we address the task of LiDAR-based panoptic segmentation, which aims to parse both objects and scenes in a unified manner. As one of the first endeavors towards this new challenging task, we propose the Dynamic Shifting Network (DS-Net), which serves as an effective panoptic segmentation framework in the point cloud realm. In particular, DS-Net has three appealing properties: 1) Strong backbone design. DS-Net adopts the cylinder convolution that is specifically designed for LiDAR point clouds. 2) Dynamic Shifting for complex point distributions. We observe that commonly-used clustering algorithms are incapable of handling complex autonomous driving scenes with non-uniform point cloud distributions and varying instance sizes. Thus, we present an efficient learnable clustering module, dynamic shifting, which adapts kernel functions on the fly for different instances. 3) Extension to 4D prediction. Furthermore, we extend DS-Net to 4D panoptic LiDAR segmentation by the temporally unified instance clustering on aligned LiDAR frames. To comprehensively evaluate the performance of LiDAR-based panoptic segmentation, we construct and curate benchmarks from two large-scale autonomous driving LiDAR datasets, SemanticKITTI and nuScenes. Extensive experiments demonstrate that our proposed DS-Net achieves superior accuracies over current state-of-the-art methods in both tasks. Notably, in the single frame version of the task, we outperform the SOTA method by 1.8% in terms of the PQ metric. In the 4D version of the task, we surpass 2nd place by 5.4% in terms of the LSTQ metric.
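The core idea behind the dynamic shifting module can be illustrated with a short sketch: instead of running mean shift with a single fixed kernel bandwidth, each point combines the shift targets produced by several candidate bandwidths using learned weights, so the effective bandwidth adapts per point. The snippet below is only a minimal illustration of that idea under these assumptions, not the authors' implementation; the bandwidth values, the `weights` tensor (assumed to come from a small MLP with a softmax), and all function names are hypothetical.

```python
import torch

def dynamic_shift_step(points, weights, bandwidths):
    """One illustrative dynamic-shifting iteration (sketch, not the official DS-Net code).

    points:     (N, 3) thing-class points (e.g. regressed instance centers)
    weights:    (N, K) learned per-point weights over K candidate bandwidths,
                assumed softmax-normalized over the last dimension
    bandwidths: K candidate kernel bandwidths in meters (illustrative values)
    """
    shifted_candidates = []
    for delta in bandwidths:
        # Flat (ball) kernel mean shift: move each point toward the mean of
        # all points within radius `delta` (every point is within its own radius,
        # so the denominator is never zero).
        dist = torch.cdist(points, points)                       # (N, N) pairwise distances
        kernel = (dist <= delta).float()                         # flat-kernel mask
        target = kernel @ points / kernel.sum(dim=1, keepdim=True)
        shifted_candidates.append(target)                        # (N, 3) per bandwidth

    stacked = torch.stack(shifted_candidates, dim=1)             # (N, K, 3)
    # The learned weights blend the candidate targets, giving an adaptive shift.
    return (weights.unsqueeze(-1) * stacked).sum(dim=1)          # (N, 3)

# Usage sketch (hypothetical names): iterate a few times, then group the
# converged points into instances, e.g. by merging points within a small radius.
# pts = dynamic_shift_step(pts, weight_mlp(point_features), (0.2, 1.7, 3.2))
```

After a few such iterations, points belonging to the same object collapse toward a common location regardless of its size or point density, which is why an adaptive bandwidth helps on the non-uniform LiDAR distributions described in the abstract.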
