Fine-tuning deep learning models for stereo matching using results from semi-global matching

Paper title

Fine-tuning deep learning models for stereo matching using results from semi-global matching

Authors

Albanwan, Hessah; Qin, Rongjun

Abstract

Deep learning (DL) methods are widely investigated for stereo image matching tasks due to their reported high accuracy. However, their transferability/generalization capabilities are limited by the instances seen in the training data. With satellite images covering large-scale areas that vary in location, content, land cover, and spatial pattern, we expect their performance to be impacted. Increasing the number and diversity of training data is always an option, but because ground-truth disparity is limited in remote sensing due to its high cost, it is almost impossible to obtain the ground truth for all locations. Knowing that classical stereo matching methods such as Census-based semi-global matching (SGM) are widely adopted to process different types of stereo data, we propose a fine-tuning method that takes advantage of disparity maps derived from SGM on target stereo data. Our proposed method adopts a simple scheme that uses the energy map derived from the SGM algorithm to select high-confidence disparity measurements, while at the same time using the images to restrict these selected disparity measurements to texture-rich regions. Our approach aims to investigate the possibility of improving the transferability of current DL methods to unseen target data without requiring their ground truth. To perform a comprehensive study, we select 20 study sites around the world to cover a variety of complexities and densities. We choose well-established DL methods such as the geometric and context network (GCNet), the pyramid stereo matching network (PSMNet), and LEAStereo for evaluation. Our results indicate an improvement in the transferability of the DL methods across different regions, both visually and numerically.
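The selection scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a lower SGM energy means higher confidence, that texture can be approximated by a local gradient magnitude, and that fine-tuning uses a masked L1 loss; the function names, thresholds (`energy_max`, `texture_min`), and toy data are all hypothetical.

```python
def texture_strength(image, r, c):
    """L1 gradient magnitude at pixel (r, c) from central differences.

    Border pixels are handled by clamping indices to the image bounds.
    This texture measure is an assumption, not taken from the paper.
    """
    rows, cols = len(image), len(image[0])
    up, down = image[max(r - 1, 0)][c], image[min(r + 1, rows - 1)][c]
    left, right = image[r][max(c - 1, 0)], image[r][min(c + 1, cols - 1)]
    return abs(down - up) + abs(right - left)


def select_pseudo_labels(energy, image, energy_max, texture_min):
    """Boolean mask: True where the SGM disparity is kept as a training label.

    A pixel is kept when its SGM energy is low (assumed to indicate high
    confidence) and the image is locally textured. Both thresholds are
    hypothetical parameters.
    """
    rows, cols = len(energy), len(energy[0])
    return [
        [
            energy[r][c] <= energy_max
            and texture_strength(image, r, c) >= texture_min
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def masked_l1_loss(predicted, sgm_disparity, mask):
    """Mean absolute disparity error over the selected pixels only."""
    errors = [
        abs(p - d)
        for p_row, d_row, m_row in zip(predicted, sgm_disparity, mask)
        for p, d, m in zip(p_row, d_row, m_row)
        if m
    ]
    return sum(errors) / len(errors) if errors else 0.0


# Toy data: one bright pixel creates texture around it, and one pixel
# has a high SGM energy (low confidence).
image = [[10, 10, 10, 10], [10, 10, 80, 10], [10, 10, 10, 10]]
energy = [[5, 5, 5, 5], [5, 5, 5, 50], [5, 5, 5, 5]]
sgm_disparity = [[1.0] * 4, [2.0] * 4, [3.0] * 4]
predicted = [[1.5] * 4, [2.0] * 4, [4.0] * 4]

mask = select_pseudo_labels(energy, image, energy_max=10, texture_min=20)
loss = masked_l1_loss(predicted, sgm_disparity, mask)
```

In a fine-tuning loop, a loss of this kind would be computed on the network's predicted disparity for the target stereo pair, so that only the selected SGM measurements contribute to the gradient.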

Paper title