Paper Title

A Greedy Algorithm for Quantizing Neural Networks

Paper Authors

Eric Lybrand and Rayan Saab

Paper Abstract

We propose a new computationally efficient method for quantizing the weights of pre-trained neural networks that is general enough to handle both multi-layer perceptrons and convolutional neural networks. Our method deterministically quantizes layers in an iterative fashion with no complicated re-training required. Specifically, we quantize each neuron, or hidden unit, using a greedy path-following algorithm. This simple algorithm is equivalent to running a dynamical system, which we prove is stable for quantizing a single-layer neural network (or, alternatively, for quantizing the first layer of a multi-layer network) when the training data are Gaussian. We show that under these assumptions, the quantization error decays with the width of the layer, i.e., its level of over-parametrization. We provide numerical experiments on multi-layer networks to illustrate the performance of our method on MNIST and CIFAR10 data, as well as for quantizing the VGG16 network using ImageNet data.
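The abstract does not spell out the update rule, so the following is only a minimal sketch of a greedy path-following quantizer of the kind described: each weight of a neuron is replaced, one at a time, by the alphabet element that keeps a running pre-activation residual small, and that residual is exactly the state of the dynamical system whose stability the paper analyzes. The function name, the ternary alphabet, and all parameter choices below are illustrative assumptions, not the authors' implementation.

```python
# Sketch (assumed details) of a greedy path-following quantizer for one neuron.
import numpy as np

def greedy_quantize_neuron(w, X, alphabet):
    """Quantize weights w (shape (N,)) of a single neuron, given training
    data X (shape (m, N), one column per weight) and a finite alphabet.

    Greedily chooses each q[t] so the residual between the analog
    pre-activations X @ w and the quantized ones X @ q stays small.
    """
    m, N = X.shape
    q = np.zeros(N)
    u = np.zeros(m)  # running residual in pre-activation space (the system state)
    for t in range(N):
        x_t = X[:, t]
        # Target: this column's analog contribution plus the accumulated error.
        v = u + w[t] * x_t
        denom = x_t @ x_t
        # Greedy choice: project the target onto x_t, then round the scalar
        # coefficient to the nearest alphabet element.
        c = (x_t @ v) / denom if denom > 0 else 0.0
        q[t] = alphabet[np.argmin(np.abs(alphabet - c))]
        u = v - q[t] * x_t  # update the residual
    return q, u

# Illustrative usage with Gaussian data, matching the setting of the theory,
# and a ternary alphabet (an assumed choice):
rng = np.random.default_rng(0)
X = rng.standard_normal((512, 64))
w = rng.standard_normal(64) / np.sqrt(64)
alphabet = np.array([-1.0, 0.0, 1.0]) * np.max(np.abs(w))
q, u = greedy_quantize_neuron(w, X, alphabet)
print("relative pre-activation error:", np.linalg.norm(u) / np.linalg.norm(X @ w))
```

Under the abstract's claims, the norm of the residual u stays controlled, and the relative error printed above should shrink as the layer width (here 64) grows relative to the data; running each neuron of a layer through this loop, then moving to the next layer, matches the iterative, retraining-free procedure the abstract describes.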
