Automated Utterance Labeling of Conversations Using Natural Language Processing

Title

Automated Utterance Labeling of Conversations Using Natural Language Processing

Authors

Maria Laricheva, Chiyu Zhang, Yan Liu, Guanyu Chen, Terence Tracey, Richard Young, Giuseppe Carenini

Abstract

Conversational data is essential in psychology because it can help researchers understand individuals' cognitive processes, emotions, and behaviors. Utterance labelling is a common strategy for analyzing this type of data. The development of NLP algorithms allows researchers to automate this task. However, psychological conversational data present some challenges to NLP researchers, including multilabel classification, a large number of classes, and limited available data. This study explored how well automated labels generated by NLP methods compare with human labels in the context of conversations on adulthood transition. We proposed strategies to handle three common challenges raised in psychological studies. Our findings showed that the deep learning method with domain adaptation (RoBERTa-CON) outperformed all other machine learning methods, and that the hierarchical labelling system we proposed helps researchers strategically analyze conversational data. Our Python code and NLP model are available at https://github.com/mlaricheva/automated_labeling.

Title