Paper Title

Few-shot Action Recognition with Implicit Temporal Alignment and Pair Similarity Optimization

Authors

Congqi Cao, Yajuan Li, Qinyi Lv, Peng Wang, Yanning Zhang

Abstract

Few-shot learning aims to recognize instances from novel classes given only a few labeled samples, which is valuable in both research and application. Although there has been much recent work in this area, most of it targets image classification. Video-based few-shot action recognition has not been well explored and remains challenging: 1) differences in implementation details among papers make fair comparison difficult; 2) the wide variation and misalignment of temporal sequences make video-level similarity comparison difficult; 3) the scarcity of labeled data makes optimization difficult. To solve these problems, this paper presents 1) a specific setting for evaluating the performance of few-shot action recognition algorithms; 2) an implicit sequence-alignment algorithm for better video-level similarity comparison; 3) an advanced loss for few-shot learning that optimizes pair similarity with limited data. Specifically, we propose a novel few-shot action recognition framework that uses long short-term memory following 3D convolutional layers for sequence modeling and alignment. Circle loss is introduced to maximize the within-class similarity and minimize the between-class similarity flexibly towards a more definite convergence target. Instead of using random or ambiguous experimental settings, we set a concrete criterion, analogous to the standard image-based few-shot learning setting, for few-shot action recognition evaluation. Extensive experiments on two datasets demonstrate the effectiveness of our proposed method.
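
The pair-similarity objective mentioned in the abstract can be illustrated with a minimal sketch of Circle loss in its commonly published form. The abstract does not give the paper's exact formulation or hyperparameters, so the function name `circle_loss`, the helper `_logsumexp`, and the values `margin=0.25` and `gamma=32.0` are illustrative assumptions, not the authors' implementation. Inputs are assumed to be similarity scores (for example, cosine similarities) between a query video embedding and support embeddings.

```python
import math


def _logsumexp(values):
    """Numerically stable log(sum(exp(v))); returns -inf for an empty list."""
    if not values:
        return float("-inf")
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def circle_loss(sim_pos, sim_neg, margin=0.25, gamma=32.0):
    """Circle loss over within-class (sim_pos) and between-class (sim_neg) scores.

    margin and gamma are assumed, illustrative hyperparameters.
    """
    # Optima and decision margins of the standard Circle loss formulation.
    o_p, o_n = 1.0 + margin, -margin
    delta_p, delta_n = 1.0 - margin, margin

    # Each score gets its own weight: scores far from their optimum are
    # penalized more strongly, which is what makes the optimization flexible.
    logit_p = [-gamma * max(o_p - s, 0.0) * (s - delta_p) for s in sim_pos]
    logit_n = [gamma * max(s - o_n, 0.0) * (s - delta_n) for s in sim_neg]

    # loss = log(1 + sum(exp(logit_n)) * sum(exp(logit_p))), computed stably
    # as softplus(z) with z = logsumexp(logit_n) + logsumexp(logit_p).
    z = _logsumexp(logit_n) + _logsumexp(logit_p)
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


if __name__ == "__main__":
    well_separated = circle_loss([0.9], [0.1])
    poorly_separated = circle_loss([0.2], [0.8])
    print(well_separated, poorly_separated)
```

In this sketch, a query whose within-class similarity is high and between-class similarity is low yields a loss near zero, while the reversed case yields a large loss. In a few-shot episode, `sim_pos` would hold scores against support videos of the query's class and `sim_neg` scores against the other classes.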
