Paper Title


Document-Level Abstractive Summarization

Authors

Gonçalo Raposo, Afonso Raposo, Ana Sofia Carmo

Abstract


The task of automatic text summarization produces a concise and fluent text summary while preserving key information and overall meaning. Recent approaches to document-level summarization have achieved significant improvements by using models based on the Transformer architecture. However, their quadratic memory and time complexity with respect to sequence length makes them very expensive to use, especially with the long sequences required by document-level summarization. Our work addresses the problem of document-level summarization by studying how efficient Transformer techniques can be used to improve the automatic summarization of very long texts. In particular, we use the arXiv dataset, consisting of several scientific papers and the corresponding abstracts, as the benchmark for this work. We then propose a novel retrieval-enhanced approach based on the architecture, which reduces the cost of generating a summary of the entire document by processing smaller chunks. The results were below the baselines but suggest more efficient memory consumption and improved truthfulness.
