Leveraging Tacit Information Embedded in CNN Layers for Visual Tracking

Paper Title

Leveraging Tacit Information Embedded in CNN Layers for Visual Tracking

Authors

Meshgi, Kourosh; Mirzaei, Maryam Sadat; Oba, Shigeyuki

Abstract

Different layers in CNNs not only provide different levels of abstraction for describing the objects in the input but also encode various implicit information about them. The activation patterns of different features contain valuable information about the stream of incoming images: spatial relations, temporal patterns, and co-occurrence of spatial and spatiotemporal (ST) features. Studies in the visual tracking literature have so far used only one of the CNN layers, a pre-fixed combination of them, or an ensemble of trackers built upon individual layers. In this study, we employ an adaptive combination of several CNN layers in a single DCF tracker to address variations in target appearance, and propose the use of style statistics on both spatial and temporal properties of the target, directly extracted from CNN layers, for visual tracking. Experiments demonstrate that using the additional implicit data of CNNs significantly improves the performance of the tracker. Results demonstrate the effectiveness of using style similarity and activation consistency regularization in improving its localization and scale accuracy.
