Paper Title
DS-MVSNet: Unsupervised Multi-view Stereo via Depth Synthesis
Paper Authors
Abstract
In recent years, supervised and unsupervised learning-based MVS methods have achieved excellent performance compared with traditional methods. However, these methods use the probability volume computed by cost volume regularization only to predict reference depths, a strategy that fails to mine sufficient information from the probability volume. Furthermore, unsupervised methods usually rely on two-step training or additional inputs, which makes the procedure more complicated. In this paper, we propose DS-MVSNet, an end-to-end unsupervised MVS architecture with source depth synthesis. To mine the information in the probability volume, we synthesize the source depths by splatting the probability volume and depth hypotheses to the source views. Meanwhile, we propose adaptive Gaussian sampling and an improved adaptive bins sampling approach to improve the accuracy of the depth hypotheses. In addition, we utilize the source depths to render the reference images and propose a depth consistency loss and a depth smoothness loss. These provide additional guidance based on photometric and geometric consistency across views without requiring extra inputs. Finally, we conduct a series of experiments on the DTU and Tanks & Temples datasets that demonstrate the efficiency and robustness of our DS-MVSNet compared with state-of-the-art methods.
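The abstract refers to predicting reference depths from a probability volume over depth hypotheses. The standard mechanism in learning-based MVS pipelines is a soft-argmin: the predicted depth at each pixel is the expectation of the depth hypotheses under the per-pixel probability distribution. A minimal numpy sketch (shapes and values are hypothetical, not from the paper):

```python
import numpy as np

# Hypothetical dimensions: D depth hypotheses over an H x W reference image.
D, H, W = 4, 2, 3
depth_hypotheses = np.linspace(1.0, 4.0, D)   # candidate depths, e.g. in meters

# Stand-in for the network's regularized cost volume output.
rng = np.random.default_rng(0)
logits = rng.standard_normal((D, H, W))

# Softmax over the depth dimension yields the probability volume P(d | pixel).
prob_volume = np.exp(logits) / np.exp(logits).sum(axis=0, keepdims=True)

# Soft-argmin: expected depth per pixel, sum_d P(d) * d.
depth_map = (prob_volume * depth_hypotheses[:, None, None]).sum(axis=0)
```

Each pixel's regressed depth lies inside the hypothesis range [1.0, 4.0], which is why the paper's adaptive sampling of the hypotheses themselves directly affects depth accuracy.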