An Improved RaftStereo Trained with A Mixed Dataset for the Robust Vision Challenge 2022

Paper Title

An Improved RaftStereo Trained with A Mixed Dataset for the Robust Vision Challenge 2022

Authors

Hualie Jiang, Rui Xu, Wenjie Jiang

Abstract

Stereo matching is a fundamental problem in computer vision. Despite recent progress driven by deep learning, improving robustness remains unavoidable when deploying stereo matching models in real-world applications. In contrast to the common practice of developing an elaborate model to achieve robustness, we argue that collecting multiple available datasets for training is a cheaper way to increase generalization ability. Specifically, this report presents an improved RaftStereo trained with a mixture of seven public datasets for the Robust Vision Challenge (denoted as iRaftStereo_RVC). When evaluated on the training sets of Middlebury, KITTI-2015, and ETH3D, the model outperforms its counterparts trained with only one dataset, such as the popular SceneFlow. After fine-tuning the pre-trained model on the three datasets of the challenge, it ranks 2nd on the stereo leaderboard, demonstrating the benefits of mixed dataset pre-training.
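The abstract does not say how the seven datasets are combined, so the following is only a minimal sketch of one common way to build a mixed training set: concatenate the datasets into a single shuffled index and repeat the smaller ones so they are not drowned out by the larger ones. The dataset names, sizes, and repeat factors are hypothetical placeholders, not values from the report, and the helpers `build_mixed_index` and `iterate_epoch` are illustrative names rather than part of any released code.

```python
import random
from typing import Dict, Iterator, List, Sequence, Tuple

# A sample is addressed by (dataset name, index inside that dataset).
SampleRef = Tuple[str, int]


def build_mixed_index(
    datasets: Dict[str, Sequence],
    repeats: Dict[str, int],
) -> List[SampleRef]:
    """Flatten several datasets into one index, repeating small ones.

    The repeat factors are an assumption of this sketch; the report
    does not state whether or how datasets are re-weighted.
    """
    index: List[SampleRef] = []
    for name, samples in datasets.items():
        for _ in range(repeats.get(name, 1)):
            index.extend((name, i) for i in range(len(samples)))
    return index


def iterate_epoch(
    index: List[SampleRef], batch_size: int, seed: int = 0
) -> Iterator[List[SampleRef]]:
    """Yield shuffled batches so each batch can mix several datasets."""
    rng = random.Random(seed)
    shuffled = list(index)
    rng.shuffle(shuffled)
    for start in range(0, len(shuffled), batch_size):
        yield shuffled[start:start + batch_size]


# Hypothetical stand-ins for real stereo datasets (lists of sample ids).
datasets = {
    "synthetic_large": list(range(1000)),
    "real_small_a": list(range(20)),
    "real_small_b": list(range(30)),
}
repeats = {"synthetic_large": 1, "real_small_a": 10, "real_small_b": 5}

mixed_index = build_mixed_index(datasets, repeats)
first_batch = next(iterate_epoch(mixed_index, batch_size=8))
```

In a real training pipeline each `(name, i)` pair would be resolved to a stereo image pair and its disparity map before being passed to the network; that loading step is omitted here.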
