Paper Title
可切换精度神经网络
Switchable Precision Neural Networks
Paper Authors
Paper Abstract
Instantaneous, on-demand accuracy-efficiency trade-offs have recently been explored in the context of neural network slimming. In this paper, we propose a flexible quantization strategy, termed Switchable Precision Neural Networks (SP-Nets), to train a shared network capable of operating at multiple quantization levels. At runtime, the network can adjust its precision on the fly according to instant memory, latency, power-consumption and accuracy demands. For example, by constraining the network weights to 1 bit with switchable-precision activations, our shared network spans from BinaryConnect to Binarized Neural Networks, allowing dot-products to be performed using only summations or bit operations. In addition, a self-distillation scheme is proposed to increase the performance of the quantized switches. We test our approach with three different quantizers and demonstrate the performance of SP-Nets against independently trained quantized models in classification accuracy on the Tiny ImageNet and ImageNet datasets using ResNet-18 and MobileNet architectures.
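To make the idea concrete, the sketch below illustrates one possible reading of a precision "switch": the shared weights are binarized to 1 bit (as in BinaryConnect), while the activation bit-width is chosen at runtime. The function names and the simple uniform activation quantizer are illustrative assumptions for this note, not the paper's actual quantizers or code.

```python
import numpy as np

def binarize_weights(w):
    # 1-bit weights in the BinaryConnect style: sign of each weight,
    # with zeros mapped to +1 so outputs stay in {-1, +1}
    return np.where(w >= 0, 1.0, -1.0)

def quantize_activations(x, bits):
    # Illustrative uniform quantizer: clip activations to [0, 1] and
    # round to 2**bits - 1 levels; bits=1 gives binary activations,
    # recovering a Binarized-Neural-Network-style setting
    levels = 2 ** bits - 1
    x = np.clip(x, 0.0, 1.0)
    return np.round(x * levels) / levels

def switchable_dot(w, x, bits):
    # One runtime "switch": the same shared weights are used at every
    # precision; only the activation bit-width changes on the fly
    return binarize_weights(w) @ quantize_activations(x, bits)
```

With 1-bit weights, the dot-product reduces to adding and subtracting activation values (summations only); with 1-bit activations as well, it can be realized with XNOR/popcount bit operations, which is the efficiency range the abstract refers to.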