Paper Title

Spatial-Temporal Alignment Network for Action Recognition and Detection

Paper Authors

Junwei Liang, Liangliang Cao, Xuehan Xiong, Ting Yu, Alexander Hauptmann

Abstract

This paper studies how to introduce viewpoint-invariant feature representations that can help action recognition and detection. Although we have witnessed great progress in action recognition over the past decade, how to efficiently model geometric variations in large-scale datasets remains challenging yet interesting. This paper proposes a novel Spatial-Temporal Alignment Network (STAN) that aims to learn geometrically invariant representations for action recognition and action detection. The STAN model is lightweight and generic, and can be plugged into existing action recognition models such as ResNet3D and SlowFast at very low extra computational cost. We test our STAN model extensively on the AVA, Kinetics-400, AVA-Kinetics, Charades, and Charades-Ego datasets. The experimental results show that the STAN model can consistently improve the state of the art in both action detection and action recognition tasks. We will release our data, models, and code.
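The abstract does not detail STAN's internal design, but the core idea of a plug-in alignment module learning geometric invariance is in the spirit of a spatial-transformer-style block: predict a transform and warp the backbone's feature maps before further processing. A minimal NumPy sketch of that general mechanism (the function name, tensor layout, and per-frame affine warp are illustrative assumptions, not the paper's actual architecture) might look like:

```python
import numpy as np

def affine_align(feat, theta):
    """Warp each frame of a (T, H, W, C) feature map by a 2x3 affine
    matrix `theta` using bilinear sampling (STN-style alignment).
    Illustrative sketch only; STAN's real module may differ."""
    T, H, W, C = feat.shape
    # Normalized target grid in [-1, 1] (homogeneous coordinates).
    ys, xs = np.meshgrid(np.linspace(-1, 1, H), np.linspace(-1, 1, W),
                         indexing="ij")
    grid = np.stack([xs, ys, np.ones_like(xs)], axis=-1)   # (H, W, 3)
    src = grid @ theta.T                                   # (H, W, 2) source coords
    # Map normalized source coordinates back to pixel indices.
    sx = (src[..., 0] + 1) * (W - 1) / 2
    sy = (src[..., 1] + 1) * (H - 1) / 2
    x0 = np.clip(np.floor(sx).astype(int), 0, W - 2)
    y0 = np.clip(np.floor(sy).astype(int), 0, H - 2)
    wx = np.clip(sx - x0, 0, 1)[..., None]                 # bilinear weights
    wy = np.clip(sy - y0, 0, 1)[..., None]
    out = np.empty_like(feat)
    for t in range(T):                                     # same warp per frame
        f = feat[t]
        top = f[y0, x0] * (1 - wx) + f[y0, x0 + 1] * wx
        bot = f[y0 + 1, x0] * (1 - wx) + f[y0 + 1, x0 + 1] * wx
        out[t] = top * (1 - wy) + bot * wy
    return out

# Identity transform should leave the features unchanged.
theta_id = np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0]])
feat = np.random.rand(4, 8, 8, 16).astype(np.float32)
aligned = affine_align(feat, theta_id)
```

In a learned setting, `theta` would come from a small regression head on the pooled features, which is what keeps such a module lightweight relative to the backbone.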
