Paper Title

PathFusion: Path-consistent Lidar-Camera Deep Feature Fusion

Authors

Lemeng Wu, Dilin Wang, Meng Li, Yunyang Xiong, Raghuraman Krishnamoorthi, Qiang Liu, Vikas Chandra

Abstract

Fusing 3D LiDAR features with 2D camera features is a promising technique for enhancing the accuracy of 3D detection, thanks to their complementary physical properties. While most existing methods focus on directly fusing camera features with raw LiDAR point clouds or shallow-level 3D features, directly combining 2D and 3D features in deeper layers is observed to reduce accuracy because of feature misalignment. This misalignment, which stems from the aggregation of features learned from large receptive fields, becomes increasingly severe in deeper layers. In this paper, we propose PathFusion as a solution that enables semantically coherent LiDAR-camera deep feature fusion. PathFusion introduces a path consistency loss at multiple stages within the network, encouraging the 2D backbone and its fusion path to transform 2D features in a way that aligns semantically with the transformation of the 3D backbone. This ensures semantic consistency between 2D and 3D features, even in deeper layers, and makes fuller use of the network's learning capacity. We apply PathFusion to improve a prior-art fusion baseline, Focals Conv, and observe an improvement of over 1.6% in mAP on the nuScenes test split, consistently with and without testing-time data augmentations. PathFusion also improves KITTI $\text{AP}_{\text{3D}}$ (R11) by about 0.6% on the moderate level.
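
The abstract does not give the exact form of the path consistency loss, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes that "path consistency" means two routes from shallow 2D features to deep 3D-aligned features should agree: fusing first and then applying a 3D stage, versus applying a 2D stage and then fusing. The function name `path_consistency_loss`, its arguments, and the squared-error penalty are all hypothetical choices made for this example.

```python
import numpy as np


def path_consistency_loss(feat_2d, stage_2d, stage_3d, fuse_shallow, fuse_deep):
    """Hypothetical path-consistency penalty (assumed form, not from the paper).

    feat_2d      : array of shallow 2D features, shape (N, C)
    stage_2d     : callable standing in for one 2D backbone stage
    stage_3d     : callable standing in for one 3D backbone stage
    fuse_shallow : callable mapping shallow 2D features into the 3D feature space
    fuse_deep    : callable mapping deep 2D features into the 3D feature space
    """
    # Path A: fuse at the shallow level, then transform with the 3D stage.
    path_a = stage_3d(fuse_shallow(feat_2d))
    # Path B: transform with the 2D stage, then fuse at the deep level.
    path_b = fuse_deep(stage_2d(feat_2d))
    # Assumed penalty: mean squared difference between the two paths.
    return float(np.mean((path_a - path_b) ** 2))


def linear(matrix):
    """Toy stand-in for a network stage: a plain linear map."""
    return lambda features: features @ matrix


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(4, 3))
    weights = rng.normal(size=(3, 3))
    identity = np.eye(3)

    # Identical 2D and 3D stages with identity fusion: the two paths coincide.
    aligned = path_consistency_loss(
        feats, linear(weights), linear(weights), linear(identity), linear(identity)
    )
    # A perturbed 2D stage makes the paths diverge, so the penalty is positive.
    misaligned = path_consistency_loss(
        feats, linear(weights + 1.0), linear(weights), linear(identity), linear(identity)
    )
    print(aligned, misaligned)
```

In this toy setting the penalty vanishes exactly when the two orders of "transform" and "fuse" commute, which is one plausible reading of the 2D path transforming features "in a way that aligns semantically with the transformation of the 3D backbone". In a real detector the stages would be convolutional blocks and the fusion steps would involve camera-to-LiDAR projection.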
