QDTrack: Quasi-Dense Similarity Learning for Appearance-Only Multiple Object Tracking

Paper title

QDTrack: Quasi-Dense Similarity Learning for Appearance-Only Multiple Object Tracking

Authors

Tobias Fischer, Thomas E. Huang, Jiangmiao Pang, Linlu Qiu, Haofeng Chen, Trevor Darrell, Fisher Yu

Abstract

Similarity learning has been recognized as a crucial step for object tracking. However, existing multiple object tracking methods only use sparse ground truth matching as the training objective, while ignoring the majority of the informative regions in images. In this paper, we present Quasi-Dense Similarity Learning, which densely samples hundreds of object regions on a pair of images for contrastive learning. We combine this similarity learning with multiple existing object detectors to build Quasi-Dense Tracking (QDTrack), which does not require displacement regression or motion priors. We find that the resulting distinctive feature space admits a simple nearest neighbor search at inference time for object association. In addition, we show that our similarity learning scheme is not limited to video data, but can learn effective instance similarity even from static input, enabling a competitive tracking performance without training on videos or using tracking supervision. We conduct extensive experiments on a wide variety of popular MOT benchmarks. We find that, despite its simplicity, QDTrack rivals the performance of state-of-the-art tracking methods on all benchmarks and sets a new state-of-the-art on the large-scale BDD100K MOT benchmark, while introducing negligible computational overhead to the detector.
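
The abstract states that the learned feature space admits a simple nearest neighbor search for object association at inference time. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes each existing track and each new detection already has an embedding vector, scores pairs by cosine similarity, and matches greedily. The function names, the dictionary layout, and the `min_similarity` threshold are all hypothetical choices made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def associate(track_embeddings, detection_embeddings, min_similarity=0.5):
    """Match each detection to its most similar track in embedding space.

    track_embeddings: dict mapping a track id to its embedding (list of floats).
    detection_embeddings: list of embeddings for the current frame.
    min_similarity: hypothetical threshold below which a pair is never matched.
    Returns a list with one entry per detection: a track id, or None if unmatched.
    """
    # Score every (detection, track) pair and keep those above the threshold.
    candidates = []
    for det_idx, det in enumerate(detection_embeddings):
        for track_id, trk in track_embeddings.items():
            sim = cosine_similarity(det, trk)
            if sim >= min_similarity:
                candidates.append((sim, det_idx, track_id))

    # Greedy matching: take the most similar pairs first, and use each
    # track and each detection at most once.
    candidates.sort(key=lambda c: c[0], reverse=True)
    assignment = [None] * len(detection_embeddings)
    used_tracks = set()
    for sim, det_idx, track_id in candidates:
        if assignment[det_idx] is None and track_id not in used_tracks:
            assignment[det_idx] = track_id
            used_tracks.add(track_id)
    return assignment


# Toy example: two existing tracks and three detections with 2-D embeddings.
tracks = {7: [1.0, 0.0], 9: [0.0, 1.0]}
detections = [[0.1, 0.9], [0.9, 0.1], [-1.0, -1.0]]
matches = associate(tracks, detections)
print(matches)  # [9, 7, None]
```

In this toy run the third detection resembles neither track, so it stays unmatched; a tracker would typically start a new track for it. Real embeddings would come from the similarity head trained with the quasi-dense contrastive objective described above.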
