Reference-based Image and Video Super-Resolution via C2-Matching

Paper Title

Reference-based Image and Video Super-Resolution via C2-Matching

Authors

Yuming Jiang, Kelvin C. K. Chan, Xintao Wang, Chen Change Loy, Ziwei Liu

Abstract

Reference-based Super-Resolution (Ref-SR) has recently emerged as a promising paradigm to enhance a low-resolution (LR) input image or video by introducing an additional high-resolution (HR) reference image. Existing Ref-SR methods mostly rely on implicit correspondence matching to borrow HR textures from reference images to compensate for the information loss in input images. However, performing local transfer is difficult because of two gaps between input and reference images: the transformation gap (e.g., scale and rotation) and the resolution gap (e.g., HR and LR). To tackle these challenges, we propose C2-Matching in this work, which performs explicit robust matching crossing transformation and resolution. 1) To bridge the transformation gap, we propose a contrastive correspondence network, which learns transformation-robust correspondences using augmented views of the input image. 2) To address the resolution gap, we adopt teacher-student correlation distillation, which distills knowledge from the easier HR-HR matching to guide the more ambiguous LR-HR matching. 3) Finally, we design a dynamic aggregation module to address the potential misalignment issue between input images and reference images. In addition, to faithfully evaluate the performance of Reference-based Image Super-Resolution under a realistic setting, we contribute the Webly-Referenced SR (WR-SR) dataset, mimicking the practical usage scenario. We also extend C2-Matching to the Reference-based Video Super-Resolution task, where an image taken in a similar scene serves as the HR reference image. Extensive experiments demonstrate that our proposed C2-Matching significantly outperforms state-of-the-art methods on the standard CUFED5 benchmark and also boosts the performance of video SR by incorporating the C2-Matching component into video SR pipelines.
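
The teacher-student correlation distillation idea in point 2 can be illustrated with a small NumPy sketch. This is not the authors' implementation: the cosine-similarity correlation, the softmax temperature value, the KL-divergence loss, and the function names `correlation` and `distillation_loss` are all assumptions chosen for illustration, and random vectors stand in for learned features.

```python
import numpy as np

def correlation(query, ref):
    """Cosine-similarity matrix between query features (N, C) and reference features (M, C)."""
    q = query / (np.linalg.norm(query, axis=1, keepdims=True) + 1e-8)
    r = ref / (np.linalg.norm(ref, axis=1, keepdims=True) + 1e-8)
    return q @ r.T

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def distillation_loss(teacher_corr, student_corr, temperature=0.15):
    """Mean KL divergence from the teacher's matching distribution to the student's.

    The temperature is a hypothetical value, not one taken from the paper.
    """
    p = softmax(teacher_corr / temperature)  # teacher: HR-HR matching
    q = softmax(student_corr / temperature)  # student: LR-HR matching
    kl_per_query = np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=1)
    return float(np.mean(kl_per_query))

# Toy data: random vectors stand in for learned feature descriptors.
rng = np.random.default_rng(0)
hr_input = rng.normal(size=(16, 8))                   # features of the HR input (teacher side)
lr_input = hr_input + 0.5 * rng.normal(size=(16, 8))  # degraded features (student side)
ref = rng.normal(size=(32, 8))                        # features of the HR reference

teacher = correlation(hr_input, ref)
student = correlation(lr_input, ref)
loss = distillation_loss(teacher, student)
```

In this sketch the loss is zero when the student reproduces the teacher's correlation map exactly and grows as the LR-HR matching distribution drifts away from the HR-HR one, which is the guidance signal the abstract describes.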
