Dilated convolutional neural network-based deep reference picture generation for video compression

Paper title

Dilated convolutional neural network-based deep reference picture generation for video compression

Authors

Haoyue Tian, Pan Gao, Ran Wei, Manoranjan Paul

Abstract

Motion estimation and motion compensation are indispensable parts of inter prediction in video coding. Because the motion vectors of objects are mostly in fractional-pixel units, the original reference pictures may not provide a sufficiently accurate reference for motion compensation. In this paper, we propose a deep reference picture generator that creates a picture more relevant to the frame currently being encoded, thereby further reducing temporal redundancy and improving video compression efficiency. Inspired by recent progress in convolutional neural networks (CNNs), we build the generator from a dilated CNN. We then insert the generated deep picture into Versatile Video Coding (VVC) as a reference picture and run a comprehensive set of experiments to evaluate the effectiveness of our network on VTM, the latest VVC test model. The experimental results show that the proposed method achieves an average bit saving of 9.7% compared with VVC under the low-delay P configuration.
