Paper title
TRAT: Tracking by Attention Using Spatio-Temporal Features
Paper authors
Paper abstract
Robust object tracking requires knowledge of the tracked objects' appearance and motion, and of how both evolve over time. Although motion provides distinctive and complementary information, especially for fast-moving objects, most recent tracking architectures focus primarily on the objects' appearance. In this paper, we propose a two-stream deep neural network tracker that uses both spatial and temporal features. Our architecture is built on the ATOM tracker and contains two backbones: (i) a 2D-CNN to capture appearance features and (ii) a 3D-CNN to capture motion features. The features returned by the two networks are then fused by an attention-based Feature Aggregation Module (FAM). Since the whole architecture is unified, it can be trained end-to-end. The experimental results show that the proposed tracker, TRAT (TRacking by ATtention), achieves state-of-the-art performance on most of the benchmarks and significantly outperforms the baseline ATOM tracker.
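The abstract does not describe the internals of the FAM, so the following is only a minimal sketch of one plausible form of attention-based fusion, not the authors' module. It assumes both backbones emit feature maps with the same number of channels and the same spatial size, and it weights each channel of the two streams with a softmax over their globally pooled responses. The function names `global_avg_pool`, `softmax_pair`, and `fuse_streams` are hypothetical, and the weighting here has no learned parameters, unlike a trained network.

```python
import math


def global_avg_pool(feature):
    """Average each channel over its spatial positions: C x N -> C."""
    return [sum(channel) / len(channel) for channel in feature]


def softmax_pair(a, b):
    """Numerically stable softmax over two scalars."""
    m = max(a, b)
    ea, eb = math.exp(a - m), math.exp(b - m)
    return ea / (ea + eb), eb / (ea + eb)


def fuse_streams(appearance, motion):
    """Fuse two feature maps with per-channel attention weights.

    ASSUMPTION: this weighting scheme is illustrative only; the
    abstract does not specify how the FAM computes attention.
    Each feature map is a list of C channels, and each channel is a
    flat list of spatial values.
    """
    if len(appearance) != len(motion):
        raise ValueError("both streams must have the same channel count")

    app_desc = global_avg_pool(appearance)
    mot_desc = global_avg_pool(motion)

    fused, weights = [], []
    for app_ch, mot_ch, a, m in zip(appearance, motion, app_desc, mot_desc):
        if len(app_ch) != len(mot_ch):
            raise ValueError("channels must have the same spatial size")
        w_app, w_mot = softmax_pair(a, m)
        weights.append((w_app, w_mot))
        # Weighted sum of the two streams at every spatial position.
        fused.append([w_app * x + w_mot * y for x, y in zip(app_ch, mot_ch)])
    return fused, weights


# Toy input: 2 channels, 2 spatial positions per channel.
appearance = [[1.0, 1.0], [0.0, 0.0]]
motion = [[1.0, 1.0], [2.0, 2.0]]
fused, weights = fuse_streams(appearance, motion)
```

In this toy input the first channel responds equally in both streams, so it receives equal weights, while the second channel responds more strongly in the motion stream and is weighted toward it. A real module would typically learn the attention weights jointly with the two backbones, which is what end-to-end training of the unified architecture makes possible.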