Paper Title

Exploring the Limits of Domain-Adaptive Training for Detoxifying Large-Scale Language Models

Paper Authors

Boxin Wang, Wei Ping, Chaowei Xiao, Peng Xu, Mostofa Patwary, Mohammad Shoeybi, Bo Li, Anima Anandkumar, Bryan Catanzaro

Paper Abstract

Pre-trained language models (LMs) are shown to easily generate toxic language. In this work, we systematically explore domain-adaptive training to reduce the toxicity of language models. We conduct this study on three dimensions: training corpus, model size, and parameter efficiency. For the training corpus, we propose to leverage the generative power of LMs and generate nontoxic datasets for domain-adaptive training, which mitigates the exposure bias and is shown to be more data-efficient than using a curated pre-training corpus. We demonstrate that the self-generation method consistently outperforms the existing baselines across various model sizes on both automatic and human evaluations, even when it uses a 1/3 smaller training corpus. We then comprehensively study detoxifying LMs with parameter sizes ranging from 126M up to 530B (3x larger than GPT-3), a scale that has never been studied before. We find that i) large LMs have similar toxicity levels as smaller ones given the same pre-training corpus, and ii) large LMs require more endeavor to detoxify. We also explore parameter-efficient training methods for detoxification. We demonstrate that adding and training adapter-only layers in LMs not only saves a lot of parameters but also achieves a better trade-off between toxicity and perplexity than whole model adaptation for the large-scale models.
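The abstract does not specify the adapter architecture used for the parameter-efficient setting, so the following is only a minimal sketch, assuming a standard bottleneck adapter added to a frozen base LM (PyTorch); `BottleneckAdapter` and `adapter_parameters` are illustrative names, not from the paper.

```python
import torch
import torch.nn as nn


class BottleneckAdapter(nn.Module):
    """A small bottleneck adapter: down-project, nonlinearity, up-project, residual."""

    def __init__(self, hidden_size: int, bottleneck_size: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck_size)
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck_size, hidden_size)
        # Zero-init the up-projection so each adapter starts as an identity mapping
        # and the frozen LM's behavior (and perplexity) is unchanged before training.
        nn.init.zeros_(self.up.weight)
        nn.init.zeros_(self.up.bias)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return hidden_states + self.up(self.act(self.down(hidden_states)))


def adapter_parameters(model: nn.Module):
    """Freeze all original LM weights and return only adapter parameters for the optimizer."""
    params = []
    for name, param in model.named_parameters():
        if "adapter" in name:
            param.requires_grad = True
            params.append(param)
        else:
            param.requires_grad = False
    return params


# Hypothetical usage: attach one adapter after each transformer block, then run
# standard LM fine-tuning on the (self-generated) nontoxic corpus while updating
# only torch.optim.Adam(adapter_parameters(model), lr=1e-4).
```

Starting from an identity mapping means domain-adaptive training on the nontoxic corpus departs from the unmodified LM only as far as the data pushes it, which is one way such an adapter-only scheme can trade off toxicity reduction against perplexity degradation.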
