Paper Title
LiDAR-based Panoptic Segmentation via Dynamic Shifting Network
Authors
Abstract
With the rapid advances of autonomous driving, it becomes critical to equip its sensing system with more holistic 3D perception. However, existing works focus on parsing either the objects (e.g., cars and pedestrians) or the scenes (e.g., trees and buildings) captured by the LiDAR sensor. In this work, we address the task of LiDAR-based panoptic segmentation, which aims to parse both objects and scenes in a unified manner. As one of the first endeavors towards this new and challenging task, we propose the Dynamic Shifting Network (DS-Net), which serves as an effective panoptic segmentation framework in the point cloud realm. In particular, DS-Net has three appealing properties: 1) Strong backbone design. DS-Net adopts the cylinder convolution that is specifically designed for LiDAR point clouds. The extracted features are shared by the semantic branch and the instance branch, the latter of which operates in a bottom-up clustering style. 2) Dynamic shifting for complex point distributions. We observe that commonly used clustering algorithms such as BFS or DBSCAN are incapable of handling complex autonomous driving scenes with non-uniform point cloud distributions and varying instance sizes. Thus, we present an efficient learnable clustering module, dynamic shifting, which adapts kernel functions on the fly for different instances. 3) Consensus-driven fusion. Finally, consensus-driven fusion is used to resolve disagreements between the semantic and instance predictions. To comprehensively evaluate the performance of LiDAR-based panoptic segmentation, we construct and curate benchmarks from two large-scale autonomous driving LiDAR datasets, SemanticKITTI and nuScenes. Extensive experiments demonstrate that our proposed DS-Net achieves higher accuracy than current state-of-the-art methods. Notably, we achieve 1st place on the public leaderboard of SemanticKITTI, outperforming 2nd place by 2.6% in terms of the PQ metric.
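To make the third property concrete, the following is a minimal sketch of one way a fusion step could reconcile per-point semantic labels with predicted instance ids. The abstract does not state the fusion rule, so the majority vote used here, the function name `consensus_fusion`, the `ignore_id` convention for points that belong to no instance, and the tie-breaking rule are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter
from typing import Dict, List


def consensus_fusion(semantic: List[int], instance: List[int],
                     ignore_id: int = 0) -> List[int]:
    """Make semantic labels consistent within each predicted instance.

    semantic[i] is the predicted class of point i.
    instance[i] is the predicted instance id of point i; ignore_id marks
    points that belong to no instance (an assumed convention).
    """
    if len(semantic) != len(instance):
        raise ValueError("semantic and instance must have the same length")

    # Count how often each class is predicted inside each instance.
    votes: Dict[int, Counter] = {}
    for sem, inst in zip(semantic, instance):
        if inst != ignore_id:
            votes.setdefault(inst, Counter())[sem] += 1

    # Pick the most frequent class per instance; ties go to the smallest
    # class id so the result is deterministic (an assumed tie-break rule).
    winner = {
        inst: min(counts.items(), key=lambda kv: (-kv[1], kv[0]))[0]
        for inst, counts in votes.items()
    }

    # Points inside an instance take the winning class; others keep theirs.
    return [winner.get(inst, sem) for sem, inst in zip(semantic, instance)]


if __name__ == "__main__":
    sem = [1, 1, 2, 3, 3, 5]
    inst = [7, 7, 7, 0, 9, 9]
    print(consensus_fusion(sem, inst))
```

In this toy input, the point labeled class 2 inside instance 7 is outvoted by the two points labeled class 1, while the point with instance id 0 keeps its own semantic label.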