Paper Title
Attention-Enhanced Cross-modal Localization Between 360 Images and Point Clouds
Paper Authors
Paper Abstract
Visual localization plays an important role in intelligent robots and autonomous driving, especially when GNSS accuracy is unreliable. Recently, camera localization in LiDAR maps has attracted increasing attention for its low cost and potential robustness to illumination and weather changes. However, the commonly used pinhole camera has a narrow field of view and thus provides limited information compared with the omni-directional LiDAR data. To overcome this limitation, we focus on correlating the information of 360° equirectangular images with point clouds, proposing an end-to-end learnable network that performs cross-modal visual localization by establishing similarity in a high-dimensional feature space. Inspired by the attention mechanism, we optimize the network to capture the salient features for comparing images and point clouds. We construct several sequences containing 360° equirectangular images and corresponding point clouds based on the KITTI-360 dataset and conduct extensive experiments. The results demonstrate the effectiveness of our approach.
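The abstract describes matching the two modalities by attention-weighted features compared in a shared embedding space. A minimal NumPy sketch of that idea, with made-up toy descriptors and a hypothetical fixed query vector standing in for learned attention parameters (the paper's actual network architecture is not specified here):

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D score vector
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_pool(local_feats, query):
    """Collapse N local descriptors (N, D) into one global descriptor (D,)
    using attention weights derived from a (hypothetical) query vector."""
    scores = local_feats @ query      # (N,) saliency score per local feature
    weights = softmax(scores)         # attention weights, sum to 1
    return weights @ local_feats      # (D,) attention-weighted sum

def cosine_similarity(a, b):
    # similarity of the two global descriptors in the shared feature space
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
img_feats = rng.standard_normal((64, 16))   # toy 360° image branch descriptors
pc_feats = rng.standard_normal((128, 16))   # toy point-cloud branch descriptors
query = rng.standard_normal(16)             # stands in for learned parameters

g_img = attention_pool(img_feats, query)
g_pc = attention_pool(pc_feats, query)
print(cosine_similarity(g_img, g_pc))       # score used for place matching
```

In a real retrieval-style localization pipeline, the query point cloud (or image) would be embedded once and compared against a database of map embeddings, with the highest-similarity entry taken as the place hypothesis.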