Paper Title
MonoRec: Semi-Supervised Dense Reconstruction in Dynamic Environments from a Single Moving Camera
Paper Authors
Abstract
In this paper, we propose MonoRec, a semi-supervised monocular dense reconstruction architecture that predicts depth maps from a single moving camera in dynamic environments. MonoRec is based on a multi-view stereo setting which encodes the information of multiple consecutive images in a cost volume. To deal with dynamic objects in the scene, we introduce a MaskModule that predicts moving object masks by leveraging the photometric inconsistencies encoded in the cost volumes. Unlike other multi-view stereo methods, MonoRec is able to reconstruct both static and moving objects by leveraging the predicted masks. Furthermore, we present a novel multi-stage training scheme with a semi-supervised loss formulation that does not require LiDAR depth values. We carefully evaluate MonoRec on the KITTI dataset and show that it achieves state-of-the-art performance compared to both multi-view and single-view methods. With the model trained on KITTI, we further demonstrate that MonoRec is able to generalize well to both the Oxford RobotCar dataset and the more challenging TUM-Mono dataset recorded by a handheld camera. Code and related materials will be available at https://vision.in.tum.de/research/monorec.
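The abstract describes encoding multiple consecutive images into a cost volume whose photometric inconsistencies reveal both depth and moving objects. As a rough illustration of the underlying idea (not the authors' implementation), the sketch below builds a classic plane-sweep photometric cost volume: for each depth hypothesis, the source image is warped into the reference view via the homography induced by a fronto-parallel plane, and the per-pixel absolute photometric error is recorded. The function name, the nearest-neighbor sampling, and the fixed out-of-bounds cost are all simplifying assumptions.

```python
import numpy as np

def plane_sweep_cost_volume(ref, src, K, R, t, depths):
    """Hypothetical sketch of a photometric plane-sweep cost volume.

    ref, src : (H, W) grayscale images (reference and source views)
    K        : (3, 3) camera intrinsics
    R, t     : rotation and translation from reference to source frame
    depths   : iterable of depth hypotheses
    Returns an (len(depths), H, W) array of photometric errors.
    """
    H_img, W_img = ref.shape
    K_inv = np.linalg.inv(K)
    n = np.array([0.0, 0.0, 1.0])              # fronto-parallel plane normal
    ys, xs = np.mgrid[0:H_img, 0:W_img]
    pix = np.stack([xs.ravel(), ys.ravel(), np.ones(H_img * W_img)])
    volume = np.empty((len(depths), H_img, W_img))
    for i, d in enumerate(depths):
        # Homography induced by the plane z = d:  H = K (R + t n^T / d) K^-1
        Hmat = K @ (R + np.outer(t, n) / d) @ K_inv
        warped = Hmat @ pix
        u = np.round(warped[0] / warped[2]).astype(int)   # nearest-neighbor
        v = np.round(warped[1] / warped[2]).astype(int)
        valid = (u >= 0) & (u < W_img) & (v >= 0) & (v < H_img)
        err = np.full(H_img * W_img, 1.0)      # max cost where warp leaves the image
        err[valid] = np.abs(ref.ravel()[valid] - src[v[valid], u[valid]])
        volume[i] = err.reshape(H_img, W_img)
    return volume
```

For a static pixel, the error is low at the hypothesis matching its true depth; a moving object violates this photometric consistency at every hypothesis, which is the cue the paper's MaskModule exploits to predict moving-object masks.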