Lightweight Modules for Efficient Deep Learning based Image Restoration

Paper Title

Lightweight Modules for Efficient Deep Learning based Image Restoration

Authors

Lahiri, Avisek; Bairagya, Sourav; Bera, Sutanu; Haldar, Siddhant; Biswas, Prabir Kumar

Abstract

Low-level image restoration is an integral component of modern artificial intelligence (AI) driven camera pipelines. Most of these frameworks are based on deep neural networks, which impose a massive computational overhead on resource-constrained platforms such as mobile phones. In this paper, we propose several lightweight low-level modules that can be used to create a computationally low-cost variant of a given baseline model. Recent work on efficient neural network design has mainly focused on classification. However, low-level image processing falls under the image-to-image translation genre, which requires some additional computational modules not present in classification. This paper seeks to bridge this gap by designing generic efficient modules that can replace essential components used in contemporary deep learning based image restoration networks. We also present and analyse results highlighting the drawbacks of applying depthwise separable convolutional kernels (a popular method for efficient classification networks) to sub-pixel convolution based upsampling (a popular upsampling strategy for low-level vision applications). This shows that concepts from the classification domain cannot always be seamlessly integrated into image-to-image translation tasks. We extensively validate our findings on three popular tasks: image inpainting, denoising and super-resolution. Our results show that the proposed networks consistently output reconstructions visually similar to those of full-capacity baselines, with a significant reduction in parameters, memory footprint and execution time on contemporary mobile devices.
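As a rough illustration of the two building blocks the abstract names, the sketch below counts the weights of a standard convolution against a depthwise separable one when either feeds a sub-pixel (pixel shuffle) upsampling step. It is not the authors' module and does not reproduce their analysis of the drawbacks; the function names and the layer sizes (64 channels, 3x3 kernel, 2x upsampling) are assumptions chosen only for the arithmetic, and biases are ignored.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    return k * k * c_in + c_in * c_out


def pixel_shuffle(x, r):
    """Rearrange a [C*r*r][H][W] nested list into [C][H*r][W*r].

    Uses the common sub-pixel convention:
    out[c][h*r + i][w*r + j] = x[c*r*r + i*r + j][h][w]
    """
    channels = len(x) // (r * r)
    height, width = len(x[0]), len(x[0][0])
    out = [[[0] * (width * r) for _ in range(height * r)] for _ in range(channels)]
    for c in range(channels):
        for i in range(r):
            for j in range(r):
                src = x[c * r * r + i * r + j]
                for h in range(height):
                    for w in range(width):
                        out[c][h * r + i][w * r + j] = src[h][w]
    return out


# Hypothetical layer sizes (assumptions, not taken from the paper).
c_in, c_out, k, r = 64, 64, 3, 2

# Sub-pixel upsampling needs c_out * r * r channels before the shuffle.
standard = conv_params(c_in, c_out * r * r, k)
separable = depthwise_separable_params(c_in, c_out * r * r, k)

# Four 1x1 channels become one 2x2 channel.
shuffled = pixel_shuffle([[[0]], [[1]], [[2]], [[3]]], 2)
```

With these assumed sizes the separable variant needs far fewer weights, which is the appeal of the combination; the abstract's point is that this saving does not by itself guarantee comparable restoration quality.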
