Paper Title

Rate-Distortion Optimized Post-Training Quantization for Learned Image Compression

Paper Authors

Junqi Shi, Ming Lu, Zhan Ma

Paper Abstract

Quantizing a floating-point neural network to its fixed-point representation is crucial for Learned Image Compression (LIC) because it improves decoding consistency for interoperability and reduces space-time complexity for implementation. Existing solutions often have to retrain the network for model quantization, which is time-consuming and impractical to some extent. This work suggests using Post-Training Quantization (PTQ) to process pretrained, off-the-shelf LIC models. We theoretically prove that minimizing quantization-induced mean square error (MSE) of model parameters (e.g., weight, bias, and activation) in PTQ is sub-optimal for compression tasks and thus develop a novel Rate-Distortion (R-D) Optimized PTQ (RDO-PTQ) to best retain the compression performance. Given a LIC model, RDO-PTQ layer-wisely determines the quantization parameters to transform the original floating-point parameters in 32-bit precision (FP32) to fixed-point ones at 8-bit precision (INT8), for which a tiny calibration image set is compressed in optimization to minimize R-D loss. Experiments reveal the outstanding efficiency of the proposed method on different LICs, showing the closest coding performance to their floating-point counterparts. Our method is a lightweight and plug-and-play approach without retraining model parameters but just adjusting quantization parameters, which is attractive to practitioners. Such an RDO-PTQ is a task-oriented PTQ scheme, which is then extended to quantize popular super-resolution and image classification models with negligible performance loss, further evidencing the generalization of our methodology. Related materials will be released at https://njuvision.github.io/RDO-PTQ.
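For readers who want the abstract's core idea in concrete form, below is a minimal sketch of rate-distortion-optimized post-training quantization under stated assumptions: the pretrained weights stay frozen, and only per-layer quantization scales are optimized against a rate-distortion loss (R + λ·D) on a tiny calibration set, rather than against parameter-quantization MSE. The `ToyLIC` stand-in model, its proxy rate term, the LSQ-style learnable scales, and the `rdo_calibrate` routine are all illustrative assumptions, not the authors' released implementation (see the project page above for that).

```python
# Minimal sketch of R-D optimized PTQ: freeze pretrained weights, learn only
# per-layer quantization scales by minimizing R + lambda*D on calibration data.
# ToyLIC, the proxy rate term, and rdo_calibrate are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


def round_ste(x: torch.Tensor) -> torch.Tensor:
    """round() with a straight-through gradient, so scales stay trainable."""
    return x + (x.round() - x).detach()


def fake_quantize(x: torch.Tensor, scale: torch.Tensor, n_bits: int = 8) -> torch.Tensor:
    """Uniform symmetric INT8-style fake quantization of a float tensor."""
    qmax = 2 ** (n_bits - 1) - 1
    q = torch.clamp(round_ste(x / scale.clamp(min=1e-8)), -qmax - 1, qmax)
    return q * scale


class QuantConv2d(nn.Module):
    """Wraps a pretrained conv; only the two quantization scales are learnable."""

    def __init__(self, conv: nn.Conv2d):
        super().__init__()
        self.conv = conv
        for p in self.conv.parameters():
            p.requires_grad_(False)  # PTQ: model weights stay frozen
        w_init = conv.weight.detach().abs().max() / 127.0
        self.w_scale = nn.Parameter(w_init)                      # weight scale
        self.a_scale = nn.Parameter(torch.tensor(1.0 / 127.0))   # activation scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = fake_quantize(x, self.a_scale)
        w = fake_quantize(self.conv.weight, self.w_scale)
        return F.conv2d(x, w, self.conv.bias, self.conv.stride,
                        self.conv.padding, self.conv.dilation, self.conv.groups)


class ToyLIC(nn.Module):
    """Tiny stand-in for a pretrained LIC model; the rate term is a proxy."""

    def __init__(self):
        super().__init__()
        self.enc = QuantConv2d(nn.Conv2d(3, 8, 3, stride=2, padding=1))
        self.dec = nn.ConvTranspose2d(8, 3, 4, stride=2, padding=1)

    def forward(self, x):
        y = self.enc(x)
        rate_proxy = y.abs().mean()  # differentiable surrogate for coded bits
        return torch.sigmoid(self.dec(y)), rate_proxy


def rdo_calibrate(model: nn.Module, calib_images, lam: float = 100.0, steps: int = 200):
    """Tune only the quantization scales to minimize R + lambda * D."""
    scales = [p for n, p in model.named_parameters() if n.endswith("scale")]
    opt = torch.optim.Adam(scales, lr=1e-3)
    for _ in range(steps):
        for x in calib_images:
            x_hat, rate = model(x)
            dist = F.mse_loss(x_hat, x)
            loss = rate + lam * dist  # R-D objective, not parameter-quantization MSE
            opt.zero_grad()
            loss.backward()
            opt.step()


if __name__ == "__main__":
    torch.manual_seed(0)
    model = ToyLIC()
    calib = [torch.rand(1, 3, 64, 64) for _ in range(4)]  # tiny calibration set
    rdo_calibrate(model, calib, steps=20)
    print("a_scale:", model.enc.a_scale.item(), "w_scale:", model.enc.w_scale.item())
```

The design point this sketch mirrors from the abstract is the objective: the loss driving the scale updates is the compression task's R-D cost, so the quantizer is tuned for coding performance rather than for parameter fidelity.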
