Paper Title


DenseShift: Towards Accurate and Efficient Low-Bit Power-of-Two Quantization

Paper Authors

Xinlin Li, Bang Liu, Rui Heng Yang, Vanessa Courville, Chao Xing, Vahid Partovi Nia

Paper Abstract


Efficiently deploying deep neural networks on low-resource edge devices is challenging due to their ever-increasing resource requirements. To address this issue, researchers have proposed multiplication-free neural networks, such as Power-of-Two quantization, also known as Shift networks, which aim to reduce memory usage and simplify computation. However, existing low-bit Shift networks are not as accurate as their full-precision counterparts, typically suffering from limited weight range encoding schemes and quantization loss. In this paper, we propose the DenseShift network, which significantly improves the accuracy of Shift networks, achieving performance competitive with full-precision networks for vision and speech applications. In addition, we introduce a method to deploy an efficient DenseShift network using non-quantized floating-point activations, while obtaining a 1.6× speed-up over existing methods. To achieve this, we demonstrate that zero-weight values in low-bit Shift networks do not contribute to model capacity and negatively impact inference computation. To address this issue, we propose a zero-free shifting mechanism that simplifies inference and increases model capacity. We further propose a sign-scale decomposition design to enhance training efficiency and a low-variance random initialization strategy to improve the model's transfer learning performance. Our extensive experiments on various computer vision and speech tasks demonstrate that DenseShift outperforms existing low-bit multiplication-free networks and achieves competitive performance compared to full-precision networks. Furthermore, our proposed approach exhibits strong transfer learning performance without a drop in accuracy. Our code is released on GitHub.
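The Power-of-Two (Shift) quantization idea the abstract refers to can be illustrated with a minimal sketch: each weight is rounded to a signed power of two, so a multiplication becomes a sign flip plus a bit shift. This is a generic illustration under assumed conventions (the function name `quantize_power_of_two`, the nearest-exponent rounding rule, and the 3-bit budget are hypothetical), not the paper's exact DenseShift encoding.

```python
import numpy as np

def quantize_power_of_two(w, num_bits=3):
    """Round each weight to the nearest signed power of two.

    With num_bits bits, one bit encodes the sign and the rest encode a
    non-positive exponent, so the codebook is
    {±2^0, ±2^-1, ..., ±2^-(2^(num_bits-1) - 1)}.
    Note the codebook contains no zero: every weight keeps a sign and a
    shift, mirroring the zero-free design the abstract motivates.
    """
    sign = np.where(w >= 0, 1.0, -1.0)
    max_exp = 0
    min_exp = -(2 ** (num_bits - 1) - 1)
    # Round the log-magnitude to the nearest representable exponent.
    log_mag = np.log2(np.abs(w) + 1e-12)
    exp = np.clip(np.round(log_mag), min_exp, max_exp)
    return sign * (2.0 ** exp)

w = np.array([0.8, -0.3, 0.05, -1.2])
wq = quantize_power_of_two(w)  # [1.0, -0.25, 0.125, -1.0]
```

Because every quantized value is ±2^e, the inner products of inference reduce to integer exponent additions and sign flips instead of floating-point multiplies, which is the source of the efficiency gains Shift networks target.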
