Video Anomaly Detection by Solving Decoupled Spatio-Temporal Jigsaw Puzzles

Paper Title

Video Anomaly Detection by Solving Decoupled Spatio-Temporal Jigsaw Puzzles

Authors

Guodong Wang, Yunhong Wang, Jie Qin, Dongming Zhang, Xiuguo Bao, Di Huang

Abstract

Video Anomaly Detection (VAD) is an important topic in computer vision. Motivated by the recent advances in self-supervised learning, this paper addresses VAD by solving an intuitive yet challenging pretext task, i.e., spatio-temporal jigsaw puzzles, which is cast as a multi-label fine-grained classification problem. Our method exhibits several advantages over existing works: 1) the spatio-temporal jigsaw puzzles are decoupled in terms of spatial and temporal dimensions, responsible for capturing highly discriminative appearance and motion features, respectively; 2) full permutations are used to provide abundant jigsaw puzzles covering various difficulty levels, allowing the network to distinguish subtle spatio-temporal differences between normal and abnormal events; and 3) the pretext task is tackled in an end-to-end manner without relying on any pre-trained models. Our method outperforms state-of-the-art counterparts on three public benchmarks. Especially on ShanghaiTech Campus, the result is superior to reconstruction and prediction-based methods by a large margin.
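
The decoupled puzzles described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `make_temporal_puzzle` and `make_spatial_puzzle`, the grayscale clip layout `(T, H, W)`, and the grid size are all assumptions made for illustration. The temporal puzzle shuffles frame order only, the spatial puzzle shuffles patch positions only, and each returns the permutation as a multi-label target (one class per position). Permutations are drawn uniformly from the full set, in the spirit of the "full permutations" mentioned above.

```python
import numpy as np


def make_temporal_puzzle(clip, rng):
    """Shuffle the frames of a clip with shape (T, H, W) along time.

    Returns the shuffled clip and labels, where labels[i] is the original
    index of the frame now at position i (T simultaneous T-way targets).
    """
    num_frames = clip.shape[0]
    perm = rng.permutation(num_frames)  # uniform over all T! orderings
    return clip[perm], perm


def make_spatial_puzzle(clip, grid, rng):
    """Shuffle grid x grid patches, using the same permutation in every frame.

    Returns the shuffled clip and labels, where labels[k] is the original
    index (row-major) of the patch now at position k.
    """
    t, h, w = clip.shape
    ph, pw = h // grid, w // grid
    # Cut each frame into patches: (T, grid*grid, ph, pw), row-major order.
    patches = (
        clip[:, : grid * ph, : grid * pw]
        .reshape(t, grid, ph, grid, pw)
        .transpose(0, 1, 3, 2, 4)
        .reshape(t, grid * grid, ph, pw)
    )
    perm = rng.permutation(grid * grid)  # uniform over all (grid^2)! layouts
    shuffled = patches[:, perm]
    # Stitch the shuffled patches back into full frames.
    out = (
        shuffled.reshape(t, grid, grid, ph, pw)
        .transpose(0, 1, 3, 2, 4)
        .reshape(t, grid * ph, grid * pw)
    )
    return out, perm
```

In this sketch, a network trained on such pairs would predict the label vector for each puzzle type separately. How prediction quality is turned into an anomaly score is not covered by the abstract and is left out here.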
