Paper Title
Exploring and Predicting Transferability across NLP Tasks
Paper Authors
Paper Abstract
Recent advances in NLP demonstrate the effectiveness of training large-scale language models and transferring them to downstream tasks. Can fine-tuning these models on tasks other than language modeling further improve performance? In this paper, we conduct an extensive study of the transferability between 33 NLP tasks across three broad classes of problems (text classification, question answering, and sequence labeling). Our results show that transfer learning is more beneficial than previously thought, especially when target task data is scarce, and can improve performance even when the source task is small or differs substantially from the target task (e.g., part-of-speech tagging transfers well to the DROP QA dataset). We also develop task embeddings that can be used to predict the most transferable source tasks for a given target task, and we validate their effectiveness in experiments controlled for source and target data size. Overall, our experiments reveal that factors such as source data size, task and domain similarity, and task complexity all play a role in determining transferability.
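The abstract mentions task embeddings used to rank candidate source tasks by predicted transferability to a target task. As a rough illustration only, a minimal sketch of such a ranking step is shown below, assuming tasks have already been mapped to fixed-size vectors and that similarity is measured by cosine similarity; the function name, the toy 3-dimensional vectors, and the task names used here are hypothetical and are not taken from the paper.

```python
import numpy as np

def rank_source_tasks(target_emb, source_embs):
    """Rank candidate source tasks by cosine similarity between their
    task embeddings and the target task's embedding (higher similarity
    is taken as a proxy for higher transferability)."""
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    scores = {name: cos(target_emb, emb) for name, emb in source_embs.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy example with made-up 3-d embeddings (real task embeddings would be
# derived from trained models, not hand-written like these):
target = np.array([0.9, 0.1, 0.2])
sources = {
    "MNLI": np.array([0.8, 0.2, 0.1]),
    "POS tagging": np.array([0.1, 0.9, 0.3]),
}
ranking = rank_source_tasks(target, sources)
print(ranking[0][0])  # name of the most similar (predicted most transferable) source task
```

In this toy setup the embedding closest in direction to the target's ranks first; the paper's actual embedding construction and validation protocol (controlling for source and target data size) are described in the full text.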