Paper Title
MonoRec: Semi-Supervised Dense Reconstruction in Dynamic Environments from a Single Moving Camera
Paper Authors
Abstract
In this paper, we propose MonoRec, a semi-supervised monocular dense reconstruction architecture that predicts depth maps from a single moving camera in dynamic environments. MonoRec is based on a multi-view stereo setting which encodes the information of multiple consecutive images in a cost volume. To deal with dynamic objects in the scene, we introduce a MaskModule that predicts moving object masks by leveraging the photometric inconsistencies encoded in the cost volumes. Unlike other multi-view stereo methods, MonoRec is able to reconstruct both static and moving objects by leveraging the predicted masks. Furthermore, we present a novel multi-stage training scheme with a semi-supervised loss formulation that does not require LiDAR depth values. We carefully evaluate MonoRec on the KITTI dataset and show that it achieves state-of-the-art performance compared to both multi-view and single-view methods. With the model trained on KITTI, we further demonstrate that MonoRec is able to generalize well to both the Oxford RobotCar dataset and the more challenging TUM-Mono dataset recorded by a handheld camera. Code and related materials will be available at https://vision.in.tum.de/research/monorec.
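The abstract describes encoding multiple consecutive images into a cost volume whose photometric inconsistencies reveal both depth and moving objects. As a rough illustration of the underlying idea (not the authors' implementation), the sketch below builds a classic plane-sweep photometric cost volume: for each depth hypothesis, the source image is warped into the reference view via the homography induced by a fronto-parallel plane, and the per-pixel absolute photometric error is recorded. The function name, the nearest-neighbor sampling, and the fixed out-of-bounds cost are all simplifying assumptions.

```python
import numpy as np

def plane_sweep_cost_volume(ref, src, K, R, t, depths):
    """Hypothetical sketch of a photometric plane-sweep cost volume.

    ref, src : (H, W) grayscale images (reference and source views)
    K        : (3, 3) camera intrinsics
    R, t     : rotation and translation from reference to source frame
    depths   : iterable of depth hypotheses
    Returns an (len(depths), H, W) array of photometric errors.
    """
    H_img, W_img = ref.shape
    K_inv = np.linalg.inv(K)
    n = np.array([0.0, 0.0, 1.0])              # fronto-parallel plane normal
    ys, xs = np.mgrid[0:H_img, 0:W_img]
    pix = np.stack([xs.ravel(), ys.ravel(), np.ones(H_img * W_img)])
    volume = np.empty((len(depths), H_img, W_img))
    for i, d in enumerate(depths):
        # Homography induced by the plane z = d:  H = K (R + t n^T / d) K^-1
        Hmat = K @ (R + np.outer(t, n) / d) @ K_inv
        warped = Hmat @ pix
        u = np.round(warped[0] / warped[2]).astype(int)   # nearest-neighbor
        v = np.round(warped[1] / warped[2]).astype(int)
        valid = (u >= 0) & (u < W_img) & (v >= 0) & (v < H_img)
        err = np.full(H_img * W_img, 1.0)      # max cost where warp leaves the image
        err[valid] = np.abs(ref.ravel()[valid] - src[v[valid], u[valid]])
        volume[i] = err.reshape(H_img, W_img)
    return volume
```

For a static pixel, the error is low at the hypothesis matching its true depth; a moving object violates this photometric consistency at every hypothesis, which is the cue the paper's MaskModule exploits to predict moving-object masks.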