Paper Title
Sentiment-Aware Word and Sentence Level Pre-training for Sentiment Analysis
Paper Authors
Paper Abstract
Most existing pre-trained language representation models (PLMs) are sub-optimal for sentiment analysis tasks, as they capture sentiment information at the word level while under-considering sentence-level information. In this paper, we propose SentiWSP, a novel Sentiment-aware pre-trained language model with combined Word-level and Sentence-level Pre-training tasks. The word-level pre-training task detects replaced sentiment words via a generator-discriminator framework, enhancing the PLM's knowledge of sentiment words. The sentence-level pre-training task further strengthens the discriminator via a contrastive learning framework, using similar sentences as negative samples, so that sentence-level sentiment is encoded. Extensive experimental results show that SentiWSP achieves new state-of-the-art performance on various sentence-level and aspect-level sentiment classification benchmarks. We have made our code and model publicly available at https://github.com/XMUDM/SentiWSP.
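The word-level task can be illustrated with a minimal sketch. This is a toy illustration only, not the paper's implementation: the actual method uses a small masked-language-model generator to propose replacements, whereas the `SENTIMENT_WORDS` lexicon and the `make_detection_example` helper below are hypothetical stand-ins to show the ELECTRA-style replaced-word detection setup (corrupt sentiment words, label each token as replaced or original).

```python
import random

# Toy sentiment lexicon (hypothetical; the paper's generator is a small MLM).
SENTIMENT_WORDS = {"good", "great", "terrible", "awful"}

def make_detection_example(tokens, rng):
    """Replace each sentiment word with a different lexicon word and
    label every token 1 if replaced, else 0, producing a training
    example for a replaced-sentiment-word discriminator."""
    corrupted, labels = [], []
    for tok in tokens:
        if tok in SENTIMENT_WORDS:
            # Pick a replacement that differs from the original word.
            choices = sorted(SENTIMENT_WORDS - {tok})
            corrupted.append(rng.choice(choices))
            labels.append(1)
        else:
            corrupted.append(tok)
            labels.append(0)
    return corrupted, labels

rng = random.Random(0)
corrupted, labels = make_detection_example(
    ["the", "movie", "was", "great"], rng)
# Only the sentiment word "great" is replaced and labeled 1.
```

The discriminator is then trained to predict the per-token labels, which forces it to learn which words carry sentiment; the sentence-level task builds on this discriminator with contrastive learning.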