Paper Title


DocAsRef: An Empirical Study on Repurposing Reference-Based Summary Quality Metrics Reference-Freely

Authors

Forrest Sheng Bao, Ruixuan Tu, Ge Luo, Yinfei Yang, Hebi Li, Minghui Qiu, Youbiao He, Cen Chen

Abstract


Automated summary quality assessment falls into two categories: reference-based and reference-free. Reference-based metrics, historically deemed more accurate due to the additional information provided by human-written references, are limited by their reliance on human input. In this paper, we hypothesize that the comparison methodologies used by some reference-based metrics to evaluate a system summary against its corresponding reference can be effectively adapted to assess it against its source document, thereby transforming these metrics into reference-free ones. Experimental results support this hypothesis. After being repurposed reference-freely, the zero-shot BERTScore using the pretrained DeBERTa-large-MNLI model of <0.5B parameters consistently outperforms its original reference-based version across various aspects on the SummEval and Newsroom datasets. It also excels in comparison to most existing reference-free metrics and closely competes with zero-shot summary evaluators based on GPT-3.5.
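The repurposing described above amounts to feeding the metric the source document in place of the human-written reference. A minimal NumPy sketch of BERTScore's greedy token matching illustrates the swap; the toy random vectors below stand in for DeBERTa-large-MNLI token embeddings, and the function name is ours, not the paper's:

```python
import numpy as np

def bertscore_f1(cand_emb, ref_emb):
    """Greedy-matching BERTScore F1 between two token-embedding matrices."""
    # Normalize rows to unit length so dot products are cosine similarities.
    c = cand_emb / np.linalg.norm(cand_emb, axis=1, keepdims=True)
    r = ref_emb / np.linalg.norm(ref_emb, axis=1, keepdims=True)
    sim = c @ r.T                       # pairwise token cosine similarities
    precision = sim.max(axis=1).mean()  # each candidate token -> best match
    recall = sim.max(axis=0).mean()     # each "reference" token -> best match
    return 2 * precision * recall / (precision + recall)

# Toy stand-ins for contextual token embeddings (hypothetical dimensions).
rng = np.random.default_rng(0)
summary_emb = rng.normal(size=(5, 8))   # system summary tokens
source_emb = rng.normal(size=(40, 8))   # source document replaces the reference
score = bertscore_f1(summary_emb, source_emb)
```

The reference-free variant changes nothing in the matching itself; only the second argument switches from the reference summary's embeddings to the source document's.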
