On the spatial attention in Spatio-Temporal Graph Convolutional Networks for skeleton-based human action recognition

Paper title

On the spatial attention in Spatio-Temporal Graph Convolutional Networks for skeleton-based human action recognition

Authors

Negar Heidari, Alexandros Iosifidis

Abstract

Graph convolutional networks (GCNs) have achieved promising performance in skeleton-based human action recognition by modeling a sequence of skeletons as a spatio-temporal graph. Most recently proposed GCN-based methods improve performance by learning the graph structure at each layer of the network: a spatial attention is applied to a predefined graph adjacency matrix and optimized jointly with the model's parameters in an end-to-end manner. In this paper, we analyze the spatial attention used in spatio-temporal GCN layers and propose a symmetric spatial attention that better reflects the symmetric property of the relative positions of the human body joints when executing actions. We also highlight the connection between spatio-temporal GCN layers employing additive spatial attention and bilinear layers, and we propose the spatio-temporal bilinear network (ST-BLN), which does not require predefined adjacency matrices and allows for a more flexible model design. Experimental results show that the three models achieve effectively the same performance. Moreover, by exploiting the flexibility provided by the proposed ST-BLN, one can increase the efficiency of the model.
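
The abstract does not give the layer equations, so the following NumPy sketch is only an illustration of the three spatial aggregation variants it names, under stated assumptions. It assumes a per-frame spatial layer of the form ReLU(G X W), where G is the joint-mixing matrix. The additive variant takes G = A + M (normalized adjacency plus a learned mask), the symmetric variant is assumed here to symmetrize the mask as (M + M^T) / 2, and the bilinear variant replaces G with a free matrix B that needs no predefined adjacency. All function names, the toy 5-joint chain skeleton, and the random values standing in for learned parameters are hypothetical; the temporal convolution of a full spatio-temporal layer is omitted.

```python
import numpy as np

def normalize_adjacency(A):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} (a common GCN choice, assumed here)."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def spatial_layer(X, G, W):
    """Apply ReLU(G @ X_t @ W) to every frame.

    X: (T, V, C_in) joint features, G: (V, V) joint mixing, W: (C_in, C_out).
    """
    return np.maximum(np.einsum("uv,tvc,cd->tud", G, X, W), 0.0)

def additive_attention(A_norm, M):
    """Additive spatial attention: predefined adjacency plus an unconstrained mask."""
    return A_norm + M

def symmetric_attention(A_norm, M):
    """Assumed symmetric variant: only the symmetric part of the mask is added."""
    return A_norm + 0.5 * (M + M.T)

# Toy setup: 4 frames, 5 joints connected in a chain, 3 input and 8 output channels.
rng = np.random.default_rng(0)
T, V, C_in, C_out = 4, 5, 3, 8
A = np.zeros((V, V))
for i in range(V - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0
A_norm = normalize_adjacency(A)

X = rng.standard_normal((T, V, C_in))
W = rng.standard_normal((C_in, C_out))
M = 0.1 * rng.standard_normal((V, V))  # stands in for a learned attention mask
B = rng.standard_normal((V, V))        # stands in for a learned bilinear mixing matrix

G_add = additive_attention(A_norm, M)
G_sym = symmetric_attention(A_norm, M)

out_add = spatial_layer(X, G_add, W)   # GCN layer with additive attention
out_sym = spatial_layer(X, G_sym, W)   # GCN layer with symmetric attention
out_bln = spatial_layer(X, B, W)       # bilinear-style layer, no adjacency needed
```

In this sketch the bilinear variant reuses the same layer function with B in place of A + M, which is one way to read the connection to bilinear layers that the abstract mentions: once the mask is unconstrained, the predefined adjacency is no longer structurally required.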
