Paper Title
Inducer-tuning: Connecting Prefix-tuning and Adapter-tuning
Paper Authors
Paper Abstract
Prefix-tuning, or more generally continuous prompt tuning, has become an essential paradigm of parameter-efficient transfer learning. Using a large pre-trained language model (PLM), prefix-tuning can obtain strong performance by training only a small portion of parameters. In this paper, we propose to understand and further develop prefix-tuning through the kernel lens. Specifically, we draw an analogy between prefixes and inducing variables in kernel methods and hypothesize that prefixes serving as inducing variables would improve the overall mechanism. From the kernel-estimator perspective, we suggest a new variant of prefix-tuning, inducer-tuning, which shares the same mechanism as prefix-tuning while leveraging the residual form found in adapter-tuning. This mitigates the initialization issue in prefix-tuning. Through comprehensive empirical experiments on natural language understanding and generation tasks, we demonstrate that inducer-tuning can close the performance gap between prefix-tuning and fine-tuning.
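For context, here is a minimal, hypothetical sketch of the combination the abstract describes: prefix-style key/value vectors prepended in attention, but produced through an adapter-style residual bottleneck. It is not the authors' implementation; the class name `InducerAttention` and the parameters `num_inducers` and `bottleneck_dim` are illustrative assumptions, and the exact placement of the residual branch is our reading of the abstract.

```python
# Hypothetical sketch, not the paper's released code: single-head attention where
# the prefix-style key/value vectors ("inducers") are computed with an adapter-like
# residual bottleneck instead of being free parameters.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F


class InducerAttention(nn.Module):
    def __init__(self, d_model: int, num_inducers: int = 8, bottleneck_dim: int = 32):
        super().__init__()
        # Frozen PLM projections (would be loaded from the pre-trained model).
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        # Trainable inducer parameters: a small base plus an adapter-style bottleneck.
        self.inducer_base = nn.Parameter(torch.randn(num_inducers, d_model) * 0.02)
        self.down = nn.Linear(d_model, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, d_model)
        # Zero-init the up-projection so the residual branch starts as a no-op,
        # the kind of benign starting point the residual form is meant to provide.
        nn.init.zeros_(self.up.weight)
        nn.init.zeros_(self.up.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)

        # Residual (adapter-style) form for the inducing key/value vectors.
        inducers = self.inducer_base + self.up(F.relu(self.down(self.inducer_base)))
        prefix = inducers.unsqueeze(0).expand(x.size(0), -1, -1)

        # Same mechanism as prefix-tuning: prepend the inducers to keys and values.
        k = torch.cat([prefix, k], dim=1)
        v = torch.cat([prefix, v], dim=1)
        scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
        return torch.softmax(scores, dim=-1) @ v
```

Usage would look like `InducerAttention(768)(torch.randn(2, 16, 768))`, which returns a `(2, 16, 768)` tensor; in a parameter-efficient setup only the inducer base and the bottleneck layers would be marked trainable while the PLM projections stay frozen.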