Paper title
TransNet V2: An effective deep network architecture for fast shot transition detection
Paper authors
Paper abstract
Although automatic shot transition detection approaches have been investigated for more than two decades, an effective universal human-level model has not yet been proposed. Even for common shot transitions like hard cuts or simple gradual changes, the potential diversity of analyzed video content may still lead to both false hits and false dismissals. Recently, deep learning-based approaches have significantly improved the accuracy of shot transition detection using 3D convolutional architectures and artificially created training data. Nevertheless, one hundred percent accuracy is still an unreachable ideal. In this paper, we share the current version of our deep network TransNet V2, which reaches state-of-the-art performance on respected benchmarks. A trained instance of the model is provided so that it can be instantly utilized by the community for highly efficient analysis of large video archives. Furthermore, the network architecture, as well as our experience with the training process, is detailed, including simple code snippets for convenient usage of the proposed model and visualization of results.