Paper Title
Fine-tuning BERT for Low-Resource Natural Language Understanding via Active Learning
Paper Authors
Paper Abstract
Recently, leveraging pre-trained Transformer-based language models in downstream, task-specific models has advanced the state of the art in natural language understanding tasks. However, little research has explored the suitability of this approach in low-resource settings with fewer than 1,000 training data points. In this work, we explore fine-tuning methods for BERT, a pre-trained Transformer-based language model, by utilizing pool-based active learning to speed up training while keeping the cost of labeling new data constant. Our experimental results on the GLUE dataset show an advantage in model performance from maximizing the approximate knowledge gain of the model when querying from the pool of unlabeled data. Finally, we demonstrate and analyze the benefits of freezing layers of the language model during fine-tuning to reduce the number of trainable parameters, making the approach more suitable for low-resource settings.
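Since the abstract only sketches the approach, the following is a minimal illustrative sketch rather than the authors' implementation: it assumes the Hugging Face transformers library and PyTorch, uses predictive entropy as a stand-in for the paper's approximate-knowledge-gain query criterion, and freezes the embeddings plus the lower encoder layers as one example of the layer-freezing strategy. The helper names (`entropy_scores`, `query_pool`), the number of frozen layers, and the query size are assumptions for illustration only.

```python
# Hypothetical sketch: pool-based active learning for fine-tuning BERT
# with frozen lower layers. Entropy is used here as a stand-in for the
# paper's approximate-knowledge-gain criterion (an assumption).
import torch
from transformers import BertForSequenceClassification, BertTokenizerFast

model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")

# Freeze the embeddings and the lower encoder layers to reduce the
# number of trainable parameters (the cutoff of 8 layers is illustrative).
for param in model.bert.embeddings.parameters():
    param.requires_grad = False
for layer in model.bert.encoder.layer[:8]:
    for param in layer.parameters():
        param.requires_grad = False

def entropy_scores(texts, batch_size=32):
    """Score each unlabeled text by predictive entropy (higher = more uncertain)."""
    model.eval()
    scores = []
    with torch.no_grad():
        for i in range(0, len(texts), batch_size):
            batch = tokenizer(texts[i:i + batch_size], padding=True,
                              truncation=True, return_tensors="pt")
            probs = torch.softmax(model(**batch).logits, dim=-1)
            ent = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
            scores.extend(ent.tolist())
    return scores

def query_pool(unlabeled_texts, k=50):
    """Select the k pool examples the current model is most uncertain about."""
    scores = entropy_scores(unlabeled_texts)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]
```

In each active-learning round one would query the pool with `query_pool`, have the selected examples labeled, add them to the training set, and fine-tune only the unfrozen parameters, so the per-round labeling budget stays constant while the model is trained on the examples it is expected to learn the most from.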