Paper Title
An Investigation on Different Underlying Quantization Schemes for Pre-trained Language Models
Paper Authors
Paper Abstract
Recently, pre-trained language models like BERT have shown promising performance on multiple natural language processing tasks. However, the application of these models has been limited by their huge size. A popular and efficient way to reduce their size is quantization. Nevertheless, most works on BERT quantization adopt plain linear clustering as the quantization scheme, and few attempt to upgrade it, which significantly limits the performance of quantization. In this paper, we implement k-means quantization and compare it with linear quantization on fixed-precision quantization of BERT. Through this comparison, we verify that the effect of upgrading the underlying quantization scheme is underestimated and that k-means quantization has great development potential. Besides, we also compare the two quantization schemes on ALBERT models to explore the robustness differences between different pre-trained models.
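As a rough illustration (not taken from the paper), the sketch below contrasts the two underlying schemes on a single weight tensor: linear quantization maps values to uniformly spaced levels between the minimum and maximum weight, while k-means quantization clusters the values and replaces each weight with its nearest centroid. The helper function names, bit widths, and the use of NumPy/scikit-learn are assumptions made for the sake of the example, not the paper's actual implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

def linear_quantize(weights, num_bits=8):
    """Map weights onto 2**num_bits uniformly spaced levels, then dequantize."""
    w_min, w_max = weights.min(), weights.max()
    levels = 2 ** num_bits
    scale = (w_max - w_min) / (levels - 1)
    codes = np.round((weights - w_min) / scale)
    return codes * scale + w_min

def kmeans_quantize(weights, num_bits=8, seed=0):
    """Cluster weight values with k-means; each weight becomes its centroid."""
    k = 2 ** num_bits
    flat = weights.reshape(-1, 1)
    km = KMeans(n_clusters=k, n_init=1, random_state=seed).fit(flat)
    return km.cluster_centers_[km.labels_].reshape(weights.shape)

# Toy comparison on a random "weight matrix" (stands in for a BERT layer)
w = np.random.randn(128, 128).astype(np.float32)
w_linear = linear_quantize(w, num_bits=4)
w_kmeans = kmeans_quantize(w, num_bits=4)
print("linear quantization MSE :", np.mean((w - w_linear) ** 2))
print("k-means quantization MSE:", np.mean((w - w_kmeans) ** 2))
```

Because k-means places its centroids according to the actual distribution of the weights rather than on a uniform grid, it typically yields a lower reconstruction error at the same bit width, which is the intuition behind comparing the two schemes as underlying quantizers.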