Paper Title
Weakly Supervised Data Augmentation Through Prompting for Dialogue Understanding
Paper Authors
Paper Abstract
Dialogue understanding tasks often necessitate abundant annotated data to achieve good performance, which presents challenges in low-resource settings. To alleviate this barrier, we explore few-shot data augmentation for dialogue understanding by prompting large pre-trained language models, and present a novel approach that iterates on augmentation quality by applying weakly-supervised filters. We evaluate our methods on the emotion and act classification tasks in DailyDialog and the intent classification task in Facebook Multilingual Task-Oriented Dialogue. Models fine-tuned on our augmented data mixed with few-shot ground truth data are able to approach or surpass existing state-of-the-art performance on both datasets. For DailyDialog specifically, using 10% of the ground truth data, we outperform the current state-of-the-art model, which uses 100% of the data.
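The generate-then-filter loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: here the language-model prompt is replaced by a stub (`prompt_lm`) and the weakly-supervised filter by a simple keyword heuristic (`weak_filter`); in the paper, candidates would come from prompting a large pre-trained LM and the filter would be learned from the few-shot ground truth data.

```python
import random

def prompt_lm(label, seed_examples):
    """Hypothetical stand-in for a large LM prompted with few-shot
    examples of `label`; returns one candidate utterance."""
    templates = {
        "happiness": ["That sounds wonderful!", "I'm so glad to hear it."],
        "anger": ["This is completely unacceptable.", "I can't believe you did that."],
    }
    return random.choice(templates[label])

def weak_filter(utterance, label):
    """Weak supervision stand-in: keep a candidate only if a cheap
    heuristic agrees with the label it was generated for."""
    keywords = {"happiness": {"glad", "wonderful"}, "anger": {"unacceptable", "can't"}}
    return any(k in utterance.lower() for k in keywords[label])

def augment(labels, seed, rounds=3, per_round=5):
    """Iteratively generate candidates per label and keep only those
    passing the filter, growing the pool used as few-shot context."""
    kept = {label: list(seed.get(label, [])) for label in labels}
    for _ in range(rounds):
        for label in labels:
            for _ in range(per_round):
                cand = prompt_lm(label, kept[label])
                if weak_filter(cand, label):
                    kept[label].append(cand)
    return kept

data = augment(["happiness", "anger"], seed={"happiness": ["Great news!"]})
```

The filtered pool `data`, mixed with the few-shot ground truth examples, would then be used to fine-tune the classifier.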