Paper Title

Q-ViT: Accurate and Fully Quantized Low-bit Vision Transformer

Paper Authors

Yanjing Li, Sheng Xu, Baochang Zhang, Xianbin Cao, Peng Gao, Guodong Guo

Paper Abstract

Large pre-trained vision transformers (ViTs) have demonstrated remarkable performance on various visual tasks, but suffer from expensive computational and memory costs when deployed on resource-constrained devices. Among the powerful compression approaches, quantization dramatically reduces computation and memory consumption through low-bit parameters and bit-wise operations. However, low-bit ViTs remain largely unexplored and usually suffer a significant performance drop compared with their real-valued counterparts. In this work, through extensive empirical analysis, we first identify that the bottleneck behind the severe performance drop is the information distortion of the low-bit quantized self-attention map. We then develop an information rectification module (IRM) and a distribution-guided distillation (DGD) scheme for fully quantized vision transformers (Q-ViT) to effectively eliminate such distortion, leading to fully quantized ViTs. We evaluate our methods on popular DeiT and Swin backbones. Extensive experimental results show that our method achieves much better performance than the prior arts. For example, our Q-ViT can theoretically accelerate ViT-S by 6.14x and achieves about 80.9% Top-1 accuracy, even surpassing the full-precision counterpart by 1.0% on the ImageNet dataset. Our code and models are available at https://github.com/YanjingLi0202/Q-ViT
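To make the notion of a "low-bit quantized self-attention map" concrete, below is a minimal PyTorch sketch of generic symmetric uniform (fake) quantization applied inside attention. The quantizer, bit-width, and function names are illustrative assumptions for this page only; they do not reproduce the paper's IRM or DGD modules, which are designed precisely to correct the distortion that this naive scheme introduces.

```python
import torch

def uniform_quantize(x: torch.Tensor, num_bits: int = 4) -> torch.Tensor:
    """Generic symmetric uniform quantizer (illustration only, not the paper's IRM).

    Maps a real-valued tensor to num_bits signed integer levels and back
    ("fake quantization"), which is the kind of low-bit representation the
    abstract refers to.
    """
    qmax = 2 ** (num_bits - 1) - 1                # e.g. 7 for 4-bit signed
    scale = x.abs().max().clamp(min=1e-8) / qmax  # simple per-tensor scale
    x_int = torch.clamp(torch.round(x / scale), -qmax - 1, qmax)
    return x_int * scale                          # dequantized values

def quantized_attention(q, k, v, num_bits: int = 4):
    """Self-attention with queries, keys, values and the attention map all
    fake-quantized to a low bit-width. Naively quantizing the softmax output
    like this distorts the attention distribution, which is the bottleneck
    the abstract analyzes. (An unsigned quantizer would suit the non-negative
    attention map better; kept symmetric here for brevity.)"""
    d = q.shape[-1]
    q, k, v = (uniform_quantize(t, num_bits) for t in (q, k, v))
    attn = torch.softmax(q @ k.transpose(-2, -1) / d ** 0.5, dim=-1)
    attn = uniform_quantize(attn, num_bits)       # low-bit attention map
    return attn @ v

# Example: 4-bit fake-quantized attention on random tensors (ViT-S-like shapes)
q = torch.randn(1, 6, 197, 64)
k = torch.randn(1, 6, 197, 64)
v = torch.randn(1, 6, 197, 64)
out = quantized_attention(q, k, v, num_bits=4)    # shape (1, 6, 197, 64)
```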
