Paper title
An Empirical Analysis of Visual Features for Multiple Object Tracking in Urban Scenes
Paper authors
Paper abstract
This paper addresses the problem of selecting appearance features for multiple object tracking (MOT) in urban scenes. Over the years, a large number of features have been used for MOT. However, it is not clear whether some of them are better than others. Commonly used features are color histograms, histograms of oriented gradients, deep features from convolutional neural networks, and re-identification (ReID) features. In this study, we assess how well these features discriminate objects enclosed by a bounding box in urban scene tracking scenarios. Several affinity measures, namely the $\mathrm{L}_1$, $\mathrm{L}_2$ and Bhattacharyya distances, Rank-1 counts, and the cosine similarity, are also assessed for their impact on the discriminative power of the features. Results on several datasets show that features from ReID networks are the best for discriminating instances from one another, regardless of the quality of the detector. If a ReID model is not available, color histograms may be selected if the detector has good recall and there are few occlusions; otherwise, deep features are more robust to detectors with lower recall. The project page is http://www.mehdimiah.com/visual_features.
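To make the affinity measures named in the abstract concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation. The function names are hypothetical, the Bhattacharyya distance uses one common definition (the negative log of the Bhattacharyya coefficient of two normalized histograms), and `rank1_count` is one plausible reading of a Rank-1 count: the number of query features whose nearest gallery feature carries the same identity. The paper itself may define these differently.

```python
import math

def l1_distance(p, q):
    """Sum of absolute differences between two feature vectors."""
    return sum(abs(a - b) for a, b in zip(p, q))

def l2_distance(p, q):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def bhattacharyya_distance(p, q):
    """-ln(BC) for two non-negative histograms (assumed definition).

    Both histograms are normalized to sum to 1 before comparison.
    """
    sp, sq = sum(p), sum(q)
    bc = sum(math.sqrt((a / sp) * (b / sq)) for a, b in zip(p, q))
    return -math.log(bc) if bc > 0 else math.inf

def cosine_similarity(p, q):
    """Cosine of the angle between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return dot / (norm_p * norm_q)

def rank1_count(queries, query_ids, gallery, gallery_ids, distance):
    """Count queries whose closest gallery entry has the same identity."""
    hits = 0
    for feat, qid in zip(queries, query_ids):
        dists = [distance(feat, g) for g in gallery]
        best = dists.index(min(dists))  # index of the nearest gallery feature
        if gallery_ids[best] == qid:
            hits += 1
    return hits
```

The three distances are smaller for more similar features, whereas the cosine similarity is larger, so a tracker that mixes them has to flip the sign or otherwise convert one convention into the other before comparing scores.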