Paper Title
StepNet: Spatial-temporal Part-aware Network for Isolated Sign Language Recognition
Paper Authors
Paper Abstract
The goal of sign language recognition (SLR) is to help people who are deaf or hard of hearing overcome communication barriers. Most existing approaches fall into two lines, skeleton-based and RGB-based methods, and both have limitations. Skeleton-based methods do not consider facial expressions, while RGB-based approaches usually ignore the fine-grained hand structure. To overcome both limitations, we propose a new framework based on RGB parts, called Spatial-temporal Part-aware Network (StepNet). As its name suggests, it is made up of two modules: Part-level Spatial Modeling and Part-level Temporal Modeling. Part-level Spatial Modeling automatically captures appearance-based properties, such as hands and faces, in the feature space without using any keypoint-level annotations. Part-level Temporal Modeling, in turn, implicitly mines long- and short-term context to capture the relevant attributes over time. Extensive experiments demonstrate that StepNet, thanks to its spatial-temporal modules, achieves competitive top-1 per-instance accuracy on three commonly used SLR benchmarks: 56.89% on WLASL, 77.2% on NMFs-CSL, and 77.1% on BOBSL. Additionally, the proposed method is compatible with optical flow input and can achieve higher performance when the two are fused. We hope that our work can serve as a preliminary step for people who are hard of hearing.
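To make the reported metric and the fusion claim concrete, here is a minimal sketch; it is not the authors' implementation. It assumes that fusion is a weighted average of per-class scores from an RGB stream and an optical flow stream (the abstract does not state the fusion scheme), and that top-1 per-instance accuracy is the fraction of test clips whose highest-scoring class matches the label. The function names, the weight `w_rgb`, and the toy scores are all hypothetical.

```python
from typing import List, Sequence


def fuse_scores(rgb: Sequence[float], flow: Sequence[float],
                w_rgb: float = 0.5) -> List[float]:
    """Weighted average of two per-class score vectors.

    Assumption: simple late fusion; the paper may fuse differently.
    """
    if len(rgb) != len(flow):
        raise ValueError("score vectors must cover the same classes")
    return [w_rgb * r + (1.0 - w_rgb) * f for r, f in zip(rgb, flow)]


def top1_per_instance_accuracy(scores: Sequence[Sequence[float]],
                               labels: Sequence[int]) -> float:
    """Fraction of instances whose arg-max class equals the label."""
    if len(scores) != len(labels) or len(labels) == 0:
        raise ValueError("need one non-empty score vector per label")
    correct = 0
    for class_scores, label in zip(scores, labels):
        # Index of the highest-scoring class for this clip.
        predicted = max(range(len(class_scores)), key=lambda c: class_scores[c])
        correct += int(predicted == label)
    return correct / len(labels)


# Made-up scores for three clips and three sign classes (illustration only).
rgb_scores = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.5, 0.1, 0.4]]
flow_scores = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6], [0.2, 0.1, 0.7]]
labels = [0, 2, 2]

fused_scores = [fuse_scores(r, f) for r, f in zip(rgb_scores, flow_scores)]
rgb_acc = top1_per_instance_accuracy(rgb_scores, labels)
fused_acc = top1_per_instance_accuracy(fused_scores, labels)
print(f"RGB only: {rgb_acc:.3f}, fused: {fused_acc:.3f}")
```

The toy numbers are chosen only to show how the two streams can complement each other; they say nothing about the actual gains reported for StepNet. Per-instance accuracy weights every clip equally, so frequent sign classes influence it more than rare ones.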