Paper Title


Monocular Vision based Crowdsourced 3D Traffic Sign Positioning with Unknown Camera Intrinsics and Distortion Coefficients

Authors

Hemang Chawla, Matti Jukola, Elahe Arani, Bahram Zonooz

Abstract


Autonomous vehicles and driver assistance systems utilize maps of 3D semantic landmarks for improved decision making. However, scaling the mapping process as well as regularly updating such maps come with a huge cost. Crowdsourced mapping of these landmarks such as traffic sign positions provides an appealing alternative. The state-of-the-art approaches to crowdsourced mapping use ground truth camera parameters, which may not always be known or may change over time. In this work, we demonstrate an approach to computing 3D traffic sign positions without knowing the camera focal lengths, principal point, and distortion coefficients a priori. We validate our proposed approach on a public dataset of traffic signs in KITTI. Using only a monocular color camera and GPS, we achieve an average single journey relative and absolute positioning accuracy of 0.26 m and 1.38 m, respectively.
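The abstract describes recovering 3D sign positions from a monocular camera and GPS once the intrinsics have been estimated. As a minimal illustration of the final positioning step only (not the authors' full pipeline, whose self-calibration stage is not reproduced here), the sketch below triangulates a sign center from pixel detections in two frames with known camera poses; the intrinsic matrix `K` stands in for the self-calibrated estimate. All names and values are hypothetical.

```python
import numpy as np

# Hypothetical sketch: linear (DLT) triangulation of one 3D point from
# two views. In the paper's setting, K would come from self-calibration
# (focal lengths, principal point) rather than ground truth, and poses
# would come from GPS/odometry; here both are toy placeholders.

def triangulate(K, poses, pixels):
    """Triangulate one 3D point from >= 2 views.

    K      : 3x3 intrinsic matrix (an *estimate*, not ground truth)
    poses  : list of 3x4 [R|t] world-to-camera extrinsic matrices
    pixels : list of (u, v) observations of the sign center
    """
    rows = []
    for R_t, (u, v) in zip(poses, pixels):
        P = K @ R_t  # 3x4 projection matrix for this view
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.array(rows)
    _, _, Vt = np.linalg.svd(A)      # least-squares null vector of A
    X = Vt[-1]
    return X[:3] / X[3]              # dehomogenize

def project(K, P, X):
    """Project a 3D point X into pixel coordinates."""
    x = K @ (P @ np.append(X, 1.0))
    return x[:2] / x[2]

# Toy example: a sign at (2, 0, 10) m seen from two poses 1 m apart.
K = np.array([[700.0, 0.0, 320.0],
              [0.0, 700.0, 240.0],
              [0.0, 0.0, 1.0]])
pose0 = np.hstack([np.eye(3), np.zeros((3, 1))])
pose1 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])  # camera shifted +1 m in x

X_true = np.array([2.0, 0.0, 10.0])
pix = [project(K, P, X_true) for P in (pose0, pose1)]
X_est = triangulate(K, [pose0, pose1], pix)  # recovers X_true (noise-free)
```

With noise-free synthetic detections the estimate matches the true position exactly; in practice, errors in the estimated intrinsics and GPS poses propagate into the triangulated position, which is what the reported 0.26 m relative and 1.38 m absolute accuracies quantify.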
