Multi-encoder Network for Parameter Reduction of a Kernel-based Interpolation Architecture

Paper title

Multi-encoder Network for Parameter Reduction of a Kernel-based Interpolation Architecture

Authors

Issa Khalifeh, Marc Gorriz Blanch, Ebroul Izquierdo, Marta Mrak

Abstract

Video frame interpolation synthesizes new frames from existing ones. Convolutional neural networks (CNNs) have been at the forefront of recent advances in this field. One popular CNN-based approach applies generated kernels to the input frames to obtain an interpolated frame. Despite the benefits these interpolation methods offer, many of the networks require a large number of parameters, and more parameters mean a heavier computational burden. Reducing the size of the model typically degrades performance. This paper presents a parameter reduction method for a popular flow-less kernel-based network (Adaptive Collaboration of Flows). By removing the layers that require the most parameters and replacing them with smaller encoders, we reduce the number of parameters of the network and even achieve better performance than the original method. This is achieved by deploying rotation to force each individual encoder to learn different features from the input images. Ablations are conducted to justify design choices, and an evaluation of how our method performs on full-length videos is presented.
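The kernel-based idea mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' network: the function names `apply_per_pixel_kernels` and `rotated_views`, the grayscale single-channel frames, and the use of 90-degree rotations are all assumptions made for illustration. In a real system the kernels would be predicted by a CNN; here they are simply passed in.

```python
import numpy as np

def apply_per_pixel_kernels(frame, kernels):
    """Apply a separate K x K kernel at every pixel of a grayscale frame.

    frame:   array of shape (H, W)
    kernels: array of shape (H, W, K, K), K odd (hypothetical layout)
    Returns an array of shape (H, W).
    """
    h, w = frame.shape
    k = kernels.shape[-1]
    pad = k // 2
    # Replicate border pixels so every output pixel has a full neighbourhood.
    padded = np.pad(frame, pad, mode="edge")
    out = np.zeros((h, w), dtype=np.float64)
    # Accumulate one kernel tap at a time over the whole image.
    for dy in range(k):
        for dx in range(k):
            out += kernels[:, :, dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def interpolate(frame0, frame1, kernels0, kernels1):
    """Sum the kernel-filtered versions of the two input frames."""
    return (apply_per_pixel_kernels(frame0, kernels0)
            + apply_per_pixel_kernels(frame1, kernels1))

def rotated_views(frame, n=4):
    """Assumed illustration of 'deploying rotation': one rotated copy per encoder."""
    return [np.rot90(frame, k=i) for i in range(n)]
```

With kernels whose only non-zero tap is a central weight of 0.5 for each input, `interpolate` reduces to averaging the two frames, which is the trivial baseline that learned kernels are meant to improve on.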
