Paper Title

A Greedy Algorithm for Quantizing Neural Networks

Paper Authors

Eric Lybrand and Rayan Saab

Paper Abstract

We propose a new computationally efficient method for quantizing the weights of pre-trained neural networks that is general enough to handle both multi-layer perceptrons and convolutional neural networks. Our method deterministically quantizes layers in an iterative fashion with no complicated re-training required. Specifically, we quantize each neuron, or hidden unit, using a greedy path-following algorithm. This simple algorithm is equivalent to running a dynamical system, which we prove is stable for quantizing a single-layer neural network (or, alternatively, for quantizing the first layer of a multi-layer network) when the training data are Gaussian. We show that under these assumptions, the quantization error decays with the width of the layer, i.e., its level of over-parametrization. We provide numerical experiments on multi-layer networks to illustrate the performance of our method on MNIST and CIFAR10 data, as well as for quantizing the VGG16 network using ImageNet data.
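The abstract does not spell out the update rule, so the following is only a minimal sketch of a greedy path-following quantizer of the kind described: each weight of a neuron is replaced, one at a time, by the alphabet element that keeps a running pre-activation residual small, and that residual is exactly the state of the dynamical system whose stability the paper analyzes. The function name, the ternary alphabet, and all parameter choices below are illustrative assumptions, not the authors' implementation.

```python
# Sketch (assumed details) of a greedy path-following quantizer for one neuron.
import numpy as np

def greedy_quantize_neuron(w, X, alphabet):
    """Quantize weights w (shape (N,)) of a single neuron, given training
    data X (shape (m, N), one column per weight) and a finite alphabet.

    Greedily chooses each q[t] so the residual between the analog
    pre-activations X @ w and the quantized ones X @ q stays small.
    """
    m, N = X.shape
    q = np.zeros(N)
    u = np.zeros(m)  # running residual in pre-activation space (the system state)
    for t in range(N):
        x_t = X[:, t]
        # Target: this column's analog contribution plus the accumulated error.
        v = u + w[t] * x_t
        denom = x_t @ x_t
        # Greedy choice: project the target onto x_t, then round the scalar
        # coefficient to the nearest alphabet element.
        c = (x_t @ v) / denom if denom > 0 else 0.0
        q[t] = alphabet[np.argmin(np.abs(alphabet - c))]
        u = v - q[t] * x_t  # update the residual
    return q, u

# Illustrative usage with Gaussian data, matching the setting of the theory,
# and a ternary alphabet (an assumed choice):
rng = np.random.default_rng(0)
X = rng.standard_normal((512, 64))
w = rng.standard_normal(64) / np.sqrt(64)
alphabet = np.array([-1.0, 0.0, 1.0]) * np.max(np.abs(w))
q, u = greedy_quantize_neuron(w, X, alphabet)
print("relative pre-activation error:", np.linalg.norm(u) / np.linalg.norm(X @ w))
```

Under the abstract's claims, the norm of the residual u stays controlled, and the relative error printed above should shrink as the layer width (here 64) grows relative to the data; running each neuron of a layer through this loop, then moving to the next layer, matches the iterative, retraining-free procedure the abstract describes.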
