Paper Title
A Semi-supervised Learning Approach with Two Teachers to Improve Breakdown Identification in Dialogues
Paper Authors
Paper Abstract
Identifying breakdowns in ongoing dialogues helps to improve communication effectiveness. Most prior work on this topic relies on human-annotated data and data augmentation to learn a classification model. While high-quality labeled dialogue data requires human annotation and is usually expensive to obtain, unlabeled data is easier to collect from various sources. In this paper, we propose a novel semi-supervised teacher-student learning framework to tackle this task. We introduce two teachers, trained on labeled data and perturbed labeled data, respectively. We leverage unlabeled data to improve classification during student training, where the two teachers refine the labeling of unlabeled data through teacher-student learning in a bootstrapping manner. With our proposed training approach, the student achieves improvements over single-teacher performance. Experimental results on the Dialogue Breakdown Detection Challenge dataset DBDC5 and the Learning to Identify Follow-Up Questions dataset LIF show that our approach outperforms all previously published approaches as well as other supervised and semi-supervised baseline methods.
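The two-teacher idea in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: the abstract does not specify the models, the perturbation, or how the teachers' outputs are combined. The sketch below assumes a nearest-centroid classifier over numeric feature vectors, Gaussian noise as the perturbation, averaging of the two teachers' probabilities, a confidence threshold for keeping pseudo-labels, and a bootstrapping step in which the student replaces the first teacher after each round. All function names and parameters (`fit_centroids`, `perturb`, `train_student`, `threshold`, `rounds`) are hypothetical.

```python
import math
import random


def fit_centroids(X, y):
    """Toy stand-in for a classifier: the mean feature vector of each class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}


def predict_proba(model, x):
    """Softmax over negative squared distances to each class centroid."""
    scores = {
        c: -sum((a - b) ** 2 for a, b in zip(x, mu)) for c, mu in model.items()
    }
    top = max(scores.values())
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}


def predict(model, x):
    """Most probable class for a single example."""
    probs = predict_proba(model, x)
    return max(probs, key=probs.get)


def perturb(X, rng, scale=0.3):
    """Assumed perturbation: add Gaussian noise to every feature."""
    return [[v + rng.gauss(0.0, scale) for v in x] for x in X]


def train_student(X_lab, y_lab, X_unlab, rounds=3, threshold=0.8, seed=0):
    """Train a student from labeled data plus teacher-refined pseudo-labels."""
    rng = random.Random(seed)
    # Teacher A sees the clean labeled data, teacher B the perturbed copy.
    teacher_a = fit_centroids(X_lab, y_lab)
    teacher_b = fit_centroids(perturb(X_lab, rng), y_lab)
    student = teacher_a
    for _ in range(rounds):
        X_pseudo, y_pseudo = [], []
        for x in X_unlab:
            pa = predict_proba(teacher_a, x)
            pb = predict_proba(teacher_b, x)
            # Assumed combination rule: average the two teachers.
            avg = {c: (pa[c] + pb[c]) / 2 for c in pa}
            label = max(avg, key=avg.get)
            # Keep only pseudo-labels the teachers are jointly confident in.
            if avg[label] >= threshold:
                X_pseudo.append(x)
                y_pseudo.append(label)
        student = fit_centroids(X_lab + X_pseudo, y_lab + y_pseudo)
        # Assumed bootstrapping step: the student becomes the next teacher A.
        teacher_a = student
    return student


# Toy data: label 0 = no breakdown, label 1 = breakdown (made-up features).
X_lab = [[0.0, 0.2], [0.3, -0.1], [3.0, 3.1], [2.8, 3.3]]
y_lab = [0, 0, 1, 1]
X_unlab = [[0.1, 0.1], [2.9, 2.9], [0.4, 0.3], [3.2, 2.7], [1.5, 1.5]]

student = train_student(X_lab, y_lab, X_unlab)
print([predict(student, x) for x in [[0.2, 0.0], [3.1, 3.0]]])
```

In a realistic setting the centroid model would be replaced by a neural dialogue classifier and the features by encoded dialogue turns; the loop structure is the only part this sketch is meant to convey.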