Paper Title
Standard Deviation-Based Quantization for Deep Neural Networks
Paper Authors
Paper Abstract
Quantization of deep neural networks is a promising approach to reducing inference cost, making it feasible to run deep networks on resource-constrained devices. Inspired by existing methods, we propose a new framework that learns the quantization intervals (discrete values) using knowledge of the network's weight and activation distributions, i.e., their standard deviations. Furthermore, we propose a novel base-2 logarithmic quantization scheme that quantizes weights to power-of-two discrete values. Our proposed scheme allows us to replace resource-hungry high-precision multipliers with simple shift-add operations. In our evaluations, our method outperforms existing work on the CIFAR10 and ImageNet datasets and even achieves higher accuracy than the full-precision models with 3-bit weights and activations. Moreover, our scheme simultaneously prunes the network's parameters and allows the pruning ratio to be adjusted flexibly during quantization.
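Code Sketch
Below is a minimal Python sketch of the two mechanisms the abstract names: deriving a tensor's clipping range from its standard deviation, and snapping weights to signed powers of two so that each multiplication reduces to a bit shift. The scale multiple `alpha`, the nearest-exponent rounding rule, and the pruning threshold are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def pow2_quantize(w, bits=3, alpha=3.0):
    """Quantize weights to {0, +/-2^e}, with the range tied to alpha * std(w).

    A sketch under assumptions: alpha and the rounding/pruning rules are
    placeholders, not the method published in the paper.
    """
    clip = alpha * w.std()                        # clipping threshold from the weight distribution
    levels = 2 ** (bits - 1) - 1                  # exponent levels per sign (one bit for the sign)
    max_e = int(np.floor(np.log2(clip)))          # largest representable exponent
    sign = np.sign(w)
    mag = np.clip(np.abs(w), 1e-12, clip)
    e = np.round(np.log2(mag))                    # nearest exponent in the log domain
    e = np.clip(e, max_e - levels + 1, max_e)     # keep exactly `levels` exponent values
    q = sign * 2.0 ** e
    q[np.abs(w) < 2.0 ** (max_e - levels)] = 0.0  # weights below the smallest level prune to zero
    return q, e.astype(int)

def shift_mul(x_int, e):
    """Multiply an integer activation by 2^e with a shift instead of a multiplier.

    Negative exponents become (truncating) right shifts.
    """
    return x_int << e if e >= 0 else x_int >> (-e)
```

With 3-bit weights this yields seven values, {0, ±2^(max_e-2), ±2^(max_e-1), ±2^max_e}, so each multiply-accumulate becomes a shift followed by an add, and the zeroed weights illustrate the pruning behavior the abstract mentions.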