Paper Title
A Better Choice: Entire-space Datasets for Aspect Sentiment Triplet Extraction
Authors
Abstract
Aspect sentiment triplet extraction (ASTE) aims to extract triplets of aspect term, sentiment, and opinion term from sentences. Because the initial datasets used to evaluate ASTE models had flaws, several studies later corrected them and independently released new versions. As a result, different studies evaluate their methods on different versions of the datasets, which makes ASTE-related work hard to follow. In this paper, we analyze the relation between the different versions and suggest that the entire-space version should be used for ASTE. Besides the sentences that contain triplets and the triplets in those sentences, the entire-space version also includes sentences without triplets and aspect terms that do not belong to any triplet. Hence, the entire-space version is consistent with real-world scenarios, and evaluating models on it better reflects their real-world performance. In addition, experimental results show that evaluating models on non-entire-space datasets inflates the performance of existing models, and that models trained on the entire-space version can obtain better performance.
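To make the distinction between the two kinds of dataset concrete, the following is a minimal sketch rather than the paper's actual data format or preprocessing. The `Sentence` class, its field names, the `to_non_entire_space` helper, the (aspect, sentiment, opinion) tuple order, and the example sentences are all assumptions introduced for illustration. It shows how a non-entire-space view drops sentences without triplets and aspect terms that belong to no triplet.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical triplet layout: (aspect term, sentiment, opinion term).
Triplet = Tuple[str, str, str]


@dataclass
class Sentence:
    """Illustrative record; not the format used by any released dataset."""
    text: str
    aspects: List[str] = field(default_factory=list)
    triplets: List[Triplet] = field(default_factory=list)


def to_non_entire_space(sentences: List[Sentence]) -> List[Sentence]:
    """Keep only sentences with triplets and aspects that occur in a triplet."""
    reduced = []
    for s in sentences:
        if not s.triplets:
            continue  # sentences without triplets are dropped
        in_triplet = {aspect for aspect, _, _ in s.triplets}
        kept = [a for a in s.aspects if a in in_triplet]
        reduced.append(Sentence(s.text, kept, list(s.triplets)))
    return reduced


# Made-up examples covering the cases named in the abstract.
entire_space = [
    Sentence(
        "The pizza was great and we also had the soup.",
        aspects=["pizza", "soup"],  # "soup" belongs to no triplet
        triplets=[("pizza", "positive", "great")],
    ),
    Sentence("We ordered the pasta.", aspects=["pasta"]),  # aspect, no triplet
    Sentence("We went there on Friday."),  # no aspect, no triplet
]

non_entire_space = to_non_entire_space(entire_space)
```

Under these assumptions, the entire-space list keeps all three sentences, while the reduced view keeps only the first one and loses the aspect term "soup".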