Paper Title
StructVPR: Distill Structural Knowledge with Weighting Samples for Visual Place Recognition
Paper Authors
Paper Abstract
Visual place recognition (VPR) is usually considered a specific image retrieval problem. Limited by existing training frameworks, most deep learning-based works cannot extract sufficiently stable global features from RGB images and rely on a time-consuming re-ranking step to exploit spatial structural information for better performance. In this paper, we propose StructVPR, a novel training architecture for VPR, to enhance structural knowledge in RGB global features and thus improve feature stability in a constantly changing environment. Specifically, StructVPR uses segmentation images as a more definitive source of structural knowledge input into a CNN network and applies knowledge distillation to avoid online segmentation and inference of the seg-branch at test time. Considering that not all samples contain high-quality and helpful knowledge, and some even hurt the performance of distillation, we partition samples and weigh each sample's distillation loss to enhance the expected knowledge precisely. Finally, StructVPR achieves impressive performance on several benchmarks using only global retrieval and even outperforms many two-stage approaches by a large margin. After adding re-ranking, our method achieves state-of-the-art performance while maintaining a low computational cost.
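As a rough illustration of the per-sample weighted distillation described in the abstract, the sketch below shows one way a seg-branch teacher embedding could supervise the RGB-branch student embedding with a per-sample weight. This is a minimal assumption-laden sketch in PyTorch, not the authors' released implementation; all names (`rgb_feat`, `seg_feat`, `sample_weight`) and the choice of a squared-error feature loss are illustrative.

```python
# Minimal sketch (assumed, not the authors' code) of per-sample weighted
# knowledge distillation: the segmentation (teacher) branch is precomputed
# offline, so no segmentation or seg-branch inference is needed at test time.
import torch
import torch.nn.functional as F

def weighted_distill_loss(rgb_feat: torch.Tensor,
                          seg_feat: torch.Tensor,
                          sample_weight: torch.Tensor) -> torch.Tensor:
    """Per-sample weighted feature distillation.

    rgb_feat:      (B, D) global features from the RGB (student) branch.
    seg_feat:      (B, D) global features from the seg (teacher) branch.
    sample_weight: (B,)   non-negative weights; samples judged to carry
                          little or harmful structural knowledge get ~0.
    """
    # L2-normalize both embeddings so the loss compares feature directions.
    rgb_feat = F.normalize(rgb_feat, dim=1)
    seg_feat = F.normalize(seg_feat, dim=1).detach()  # teacher is frozen
    # Per-sample squared error, weighted and averaged over the batch.
    per_sample = ((rgb_feat - seg_feat) ** 2).sum(dim=1)
    return (sample_weight * per_sample).sum() / sample_weight.sum().clamp_min(1e-8)

# Hypothetical usage alongside the usual retrieval loss:
#   total_loss = retrieval_loss + lambda_kd * weighted_distill_loss(r, s, w)
```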