Unsupervised Learning of 3D Scene Flow with 3D Odometry Assistance

Paper title

Unsupervised Learning of 3D Scene Flow with 3D Odometry Assistance

Authors

Guangming Wang, Zhiheng Feng, Chaokang Jiang, Hesheng Wang

Abstract

Scene flow represents the 3D motion of each point in a scene, explicitly describing the distance and direction of each point's movement. Scene flow estimation is used in applications such as autonomous driving, activity recognition, and virtual reality. Because annotating real-world data with ground-truth scene flow is challenging, no real-world dataset provides a large amount of data with ground truth for scene flow estimation. Many works therefore pre-train their networks on synthesized data and fine-tune them on real-world LiDAR data. Unlike previous unsupervised learning of scene flow in point clouds, we propose to use odometry information to assist the unsupervised learning of scene flow, and we train our network on real-world LiDAR data. Supervised odometry provides a more accurate shared cost volume for scene flow. In addition, the proposed network has mask-weighted warp layers to obtain a more accurate predicted point cloud. The warp operation applies an estimated pose transformation or scene flow to a source point cloud to obtain a predicted point cloud, and it is the key to refining scene flow from coarse to fine. When performing warp operations, points in different states use different weights for the pose transformation and the scene flow transformation. We classify the states of points as static, dynamic, and occluded: static masks separate static points from dynamic points, and occlusion masks identify occluded points. In the mask-weighted warp layer, the static masks and occlusion masks serve as weights during the warp operation. Ablation experiments demonstrate the effectiveness of our designs. The experimental results show the promise of odometry-assisted unsupervised learning of 3D scene flow on real-world data.
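
The warp operation described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the exact weighting used in the paper, so the blend below is an assumption made only for illustration: a point that is static or occluded leans on the pose transformation, and a visible dynamic point leans on its own scene flow. The function names (`warp_with_pose`, `warp_with_flow`, `mask_weighted_warp`) and the soft-mask convention (values in [0, 1]) are hypothetical and do not come from the authors' code.

```python
import numpy as np


def warp_with_pose(points, rotation, translation):
    """Apply a rigid transform (R, t) to an (N, 3) point cloud."""
    return points @ rotation.T + translation


def warp_with_flow(points, flow):
    """Move every point by its own (N, 3) scene flow vector."""
    return points + flow


def mask_weighted_warp(points, rotation, translation, flow,
                       static_mask, occlusion_mask):
    """Blend the pose warp and the flow warp per point.

    static_mask and occlusion_mask are (N,) soft weights in [0, 1].
    ASSUMPTION (not from the paper): the pose weight is the larger of
    the two masks, so static or occluded points follow the pose and
    visible dynamic points follow the scene flow.
    """
    pose_weight = np.maximum(static_mask, occlusion_mask)[:, None]
    rigid = warp_with_pose(points, rotation, translation)
    moved = warp_with_flow(points, flow)
    return pose_weight * rigid + (1.0 - pose_weight) * moved


# Toy example: one static, one dynamic, and one occluded point.
points = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0]])
rotation = np.eye(3)                       # no rotation between frames
translation = np.array([0.5, 0.0, 0.0])    # sensor motion expressed as a shift
flow = np.array([[0.0, 0.0, 0.0],
                 [0.0, 2.0, 0.0],
                 [0.0, 0.0, 3.0]])
static_mask = np.array([1.0, 0.0, 0.0])
occlusion_mask = np.array([0.0, 0.0, 1.0])

predicted = mask_weighted_warp(points, rotation, translation, flow,
                               static_mask, occlusion_mask)
print(predicted)
```

In this toy case the static and occluded points are moved by the pose only, while the dynamic point is moved by its flow vector. In a coarse-to-fine network, a warp of this kind would be repeated at each refinement level, with the predicted point cloud compared against the target frame.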
