Quasi-Dense Similarity Learning for Multiple Object Tracking

Paper Title

Quasi-Dense Similarity Learning for Multiple Object Tracking

Authors

Jiangmiao Pang, Linlu Qiu, Xia Li, Haofeng Chen, Qi Li, Trevor Darrell, Fisher Yu

Abstract

Similarity learning has been recognized as a crucial step for object tracking. However, existing multiple object tracking methods only use sparse ground truth matching as the training objective, while ignoring the majority of the informative regions on the images. In this paper, we present Quasi-Dense Similarity Learning, which densely samples hundreds of region proposals on a pair of images for contrastive learning. We can directly combine this similarity learning with existing detection methods to build Quasi-Dense Tracking (QDTrack) without turning to displacement regression or motion priors. We also find that the resulting distinctive feature space admits a simple nearest neighbor search at the inference time. Despite its simplicity, QDTrack outperforms all existing methods on MOT, BDD100K, Waymo, and TAO tracking benchmarks. It achieves 68.7 MOTA at 20.3 FPS on MOT17 without using external training data. Compared to methods with similar detectors, it boosts almost 10 points of MOTA and significantly decreases the number of ID switches on BDD100K and Waymo datasets. Our code and trained models are available at http://vis.xyz/pub/qdtrack.
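
The abstract notes that the learned feature space admits a simple nearest neighbor search at inference time. The following is a minimal sketch of that idea, not the authors' implementation: it assumes each tracked object and each new detection already has an embedding vector, compares them by cosine similarity, and greedily gives each detection its nearest track. The function name `match_nearest_neighbor`, the `min_sim` threshold of 0.5, and the greedy one-to-one rule are illustrative assumptions and do not come from the paper.

```python
import numpy as np


def match_nearest_neighbor(track_embs, det_embs, min_sim=0.5):
    """Return, for each detection, the index of its matched track or -1.

    Hypothetical helper for illustration; min_sim is an assumed threshold.
    """
    assignment = np.full(len(det_embs), -1, dtype=int)
    if len(track_embs) == 0 or len(det_embs) == 0:
        return assignment  # nothing to match against

    # L2-normalize so that the dot product equals cosine similarity.
    t = track_embs / np.linalg.norm(track_embs, axis=1, keepdims=True)
    d = det_embs / np.linalg.norm(det_embs, axis=1, keepdims=True)
    sim = d @ t.T  # shape: (num_detections, num_tracks)

    used_tracks = set()
    # Visit detections from most to least confident match.
    for i in np.argsort(-sim.max(axis=1)):
        j = int(np.argmax(sim[i]))  # nearest track for detection i
        if sim[i, j] >= min_sim and j not in used_tracks:
            assignment[i] = j
            used_tracks.add(j)
    return assignment
```

A detection left at -1 would start a new track in this sketch. For example, with two orthogonal track embeddings, a detection close to the first track is assigned index 0, while a detection pointing away from both stays unmatched.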
