Paper Title
Self-Supervised Monocular Depth Estimation: Solving the Dynamic Object Problem by Semantic Guidance
Paper Authors
Paper Abstract
Self-supervised monocular depth estimation presents a powerful method to obtain 3D scene information from single camera images, which is trainable on arbitrary image sequences without requiring depth labels, e.g., from a LiDAR sensor. In this work we present a new self-supervised semantically-guided depth estimation (SGDepth) method to deal with moving dynamic-class (DC) objects, such as moving cars and pedestrians, which violate the static-world assumptions typically made during training of such models. Specifically, we propose (i) mutually beneficial cross-domain training of (supervised) semantic segmentation and self-supervised depth estimation with task-specific network heads, (ii) a semantic masking scheme providing guidance to prevent moving DC objects from contaminating the photometric loss, and (iii) a detection method for frames with non-moving DC objects, from which the depth of DC objects can be learned. We demonstrate the performance of our method on several benchmarks, in particular on the Eigen split, where we exceed all baselines without test-time refinement.
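The core of contribution (ii), the semantic masking scheme, can be illustrated with a minimal sketch: pixels whose semantic label belongs to a dynamic class (cars, pedestrians, etc.) are excluded from the photometric loss so that moving objects cannot contaminate it. Note that the class IDs, the function name, and the use of a simple L1 photometric term here are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

# Hypothetical dynamic-class (DC) label IDs, e.g., car and pedestrian;
# the paper's actual label set and loss are not reproduced here.
DC_CLASS_IDS = [11, 12]

def masked_photometric_loss(target, reconstruction, seg_labels):
    """Mean L1 photometric error over pixels NOT labeled as a
    dynamic-class object, so moving cars/pedestrians do not
    contribute to the self-supervised training signal.

    target, reconstruction: HxWx3 float arrays (target frame and
    the frame warped/reconstructed from a neighboring view).
    seg_labels: HxW integer array of semantic class IDs.
    """
    per_pixel = np.abs(target - reconstruction).mean(axis=-1)  # HxW
    static_mask = ~np.isin(seg_labels, DC_CLASS_IDS)           # HxW bool
    return per_pixel[static_mask].mean()
```

In this sketch a large reconstruction error on a moving-object pixel is simply ignored, whereas without the mask it would dominate the loss and corrupt the depth gradients.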