Paper Title
Complex Human Action Recognition in Live Videos Using Hybrid FR-DL Method
Paper Authors
Paper Abstract
Automated human action recognition is one of the most attractive and practical research fields in computer vision, despite its high computational cost. In such systems, human actions are labelled based on the appearance and motion patterns in video sequences; however, conventional methodologies and classic neural networks cannot use temporal information to predict actions in the upcoming frames of a video sequence. Moreover, the computational cost of the preprocessing stage is high. In this paper, we address the challenges of the preprocessing phase by automatically selecting representative frames from the input sequences. Furthermore, we extract only the key features of each representative frame rather than the entire feature set. We propose a hybrid technique using background subtraction and HOG (Histogram of Oriented Gradients), followed by a deep neural network and a skeletal modelling method. A combination of a CNN and an LSTM recurrent network is used for feature selection and for retaining previous information, and finally a Softmax-KNN classifier labels the human activities. We name our model the Feature Reduction and Deep Learning based action recognition method, or FR-DL for short. To evaluate the proposed method, we use the UCF101 dataset, a benchmark widely used in action recognition research, which includes 101 complex activities captured in the wild. Experimental results show a significant improvement in both accuracy and speed in comparison with six state-of-the-art approaches.
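The preprocessing idea described above — ranking frames by foreground activity via background subtraction, keeping only representative frames, and reducing each to a compact HOG-style descriptor — can be sketched roughly as follows. This is a minimal illustration with hypothetical helper names (`background_mask`, `hog_features`, `select_representative_frames`), not the paper's implementation: it uses simple absolute-difference subtraction and a single whole-frame orientation histogram, whereas full HOG uses cells, blocks, and local normalization.

```python
import numpy as np

def background_mask(frame, background, threshold=25):
    """Foreground mask via naive background subtraction (absolute difference)."""
    diff = np.abs(frame.astype(np.int32) - background.astype(np.int32))
    return diff > threshold

def hog_features(frame, n_bins=9):
    """Coarse HOG-style descriptor: one unsigned-orientation histogram over the
    whole frame, weighted by gradient magnitude, then L2-normalized."""
    gy, gx = np.gradient(frame.astype(np.float64))
    magnitude = np.hypot(gx, gy)
    orientation = np.degrees(np.arctan2(gy, gx)) % 180.0  # unsigned gradients
    bins = np.minimum((orientation / (180.0 / n_bins)).astype(int), n_bins - 1)
    hist = np.bincount(bins.ravel(), weights=magnitude.ravel(), minlength=n_bins)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist

def select_representative_frames(frames, background, top_k=2, threshold=25):
    """Rank frames by fraction of foreground pixels; keep the top_k most active."""
    activity = [background_mask(f, background, threshold).mean() for f in frames]
    order = np.argsort(activity)[::-1][:top_k]
    return sorted(order.tolist())

# Toy "video": a static background plus a bright moving square in two frames.
rng = np.random.default_rng(0)
background = rng.integers(0, 30, size=(32, 32)).astype(np.uint8)
frames = [background.copy() for _ in range(4)]
frames[1][5:15, 5:15] = 200    # motion in frame 1
frames[2][10:20, 10:20] = 200  # motion in frame 2

keep = select_representative_frames(frames, background, top_k=2)
features = [hog_features(frames[i]) for i in keep]
print(keep)               # [1, 2] — the frames with foreground motion
print(features[0].shape)  # (9,) — one 9-bin descriptor per kept frame
```

Only the kept frames' short descriptors would then be fed to the downstream CNN–LSTM stage, which is where the preprocessing cost saving comes from.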