Paper Title


Motion-Focused Contrastive Learning of Video Representations

Authors

Rui Li, Yiheng Zhang, Zhaofan Qiu, Ting Yao, Dong Liu, Tao Mei

Abstract


Motion, as the most distinct phenomenon in a video, involving the changes over time, has been unique and critical to the development of video representation learning. In this paper, we ask the question: how important is motion, particularly for self-supervised video representation learning? To this end, we compose a duet of exploiting motion for data augmentation and feature learning in the regime of contrastive learning. Specifically, we present a Motion-focused Contrastive Learning (MCL) method that regards this duet as its foundation. On one hand, MCL capitalizes on the optical flow of each frame in a video to temporally and spatially sample tubelets (i.e., sequences of associated frame patches across time) as data augmentations. On the other hand, MCL further aligns gradient maps of the convolutional layers to optical flow maps from spatial, temporal, and spatio-temporal perspectives, in order to ground motion information in feature learning. Extensive experiments conducted on the R(2+1)D backbone demonstrate the effectiveness of our MCL. On UCF101, a linear classifier trained on the representations learnt by MCL achieves 81.91% top-1 accuracy, outperforming ImageNet supervised pre-training by 6.78%. On Kinetics-400, MCL achieves 66.62% top-1 accuracy under the linear protocol. Code is available at https://github.com/YihengZhang-CV/MCL-Motion-Focused-Contrastive-Learning.
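To make the tubelet-sampling idea concrete, the following is a minimal, hypothetical sketch (not the authors' released implementation) of motion-guided spatial sampling: given per-frame optical-flow magnitude maps, it accumulates motion over time, scores every patch-sized window with an integral image, and crops the tubelet at the highest-motion location. The function name, patch size, and synthetic flow are all illustrative assumptions.

```python
# Hypothetical sketch of motion-guided tubelet sampling; the real MCL
# pipeline additionally samples temporally and uses estimated optical flow.
import numpy as np

def sample_tubelet(flow_mag, frames, patch=32):
    """flow_mag: (T, H, W) optical-flow magnitudes; frames: (T, H, W, C) video.
    Returns the (T, patch, patch, C) tubelet at the highest-motion location."""
    T, H, W = flow_mag.shape
    # Accumulate motion over time, then score every patch-sized window
    # using a box filter implemented via a 2D integral image.
    acc = flow_mag.sum(axis=0)
    ii = np.zeros((H + 1, W + 1))
    ii[1:, 1:] = acc.cumsum(axis=0).cumsum(axis=1)
    scores = (ii[patch:, patch:] - ii[:-patch, patch:]
              - ii[patch:, :-patch] + ii[:-patch, :-patch])
    y, x = np.unravel_index(scores.argmax(), scores.shape)
    return frames[:, y:y + patch, x:x + patch]

# Toy example: a single "moving" region dominates the sampling.
rng = np.random.default_rng(0)
frames = rng.random((8, 112, 112, 3)).astype(np.float32)
flow = np.zeros((8, 112, 112), dtype=np.float32)
flow[:, 40:72, 40:72] = 1.0  # synthetic motion concentrated in one patch
tubelet = sample_tubelet(flow, frames)
print(tubelet.shape)  # (8, 32, 32, 3)
```

Sampling the positive pair from high-motion regions in this way biases contrastive learning toward the moving content rather than static background, which is the intuition behind MCL's augmentation design.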
