Paper Title

Neural Abstractive Summarization with Structural Attention

Authors

Tanya Chowdhury, Sachin Kumar, Tanmoy Chakraborty

Abstract

Attentional, RNN-based encoder-decoder architectures have achieved impressive performance on abstractive summarization of news articles. However, these methods fail to account for long-term dependencies among the sentences of a document. This problem is exacerbated in multi-document summarization tasks such as summarizing the popular opinion in threads on community question answering (CQA) websites such as Yahoo! Answers and Quora, where the answers in a thread often overlap with or contradict each other. In this work, we present a hierarchical encoder based on structural attention to model such inter-sentence and inter-document dependencies. We set the popular pointer-generator architecture and several architectures derived from it as our baselines and show that they fail to generate good summaries in a multi-document setting. We further show that our proposed model achieves significant improvements over the baselines in both single- and multi-document summarization settings: in the former, it beats the best baseline by 1.31 and 7.8 ROUGE-1 points on the CNN and CQA datasets, respectively; in the latter, performance improves by a further 1.6 ROUGE-1 points on the CQA dataset.
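For a concrete picture of what "structural attention" computes, below is a minimal NumPy sketch of matrix-tree structured attention in the style of Liu and Lapata (2018), which induces soft non-projective dependency trees over text units (sentences or documents). This is an illustration of the general mechanism the abstract refers to, not the authors' implementation; the function name, array shapes, and scoring setup are assumptions for the sake of the example.

```python
# Minimal sketch (not the paper's released code) of matrix-tree structural
# attention: soft dependency-tree marginals over n units, which a hierarchical
# encoder can use as attention weights to model inter-sentence and
# inter-document dependencies. Names and shapes are illustrative assumptions.
import numpy as np

def structural_attention(edge_scores, root_scores):
    """Marginal edge probabilities of a non-projective dependency tree
    over n units, computed via the Matrix-Tree Theorem.

    edge_scores: (n, n) array, f[i, j] = unnormalized score for unit i
                 being the parent of unit j (diagonal is ignored).
    root_scores: (n,) array, unnormalized score for unit i being the root.
    Returns: (edge_marginals of shape (n, n), root_marginals of shape (n,)).
    """
    n = edge_scores.shape[0]
    A = np.exp(edge_scores) * (1.0 - np.eye(n))  # edge weights, no self-loops
    r = np.exp(root_scores)

    # Graph Laplacian: L[j, j] = sum_i A[i, j]; off-diagonal L[i, j] = -A[i, j]
    L = -A.copy()
    np.fill_diagonal(L, A.sum(axis=0))

    # Koo et al. (2007) trick: replace the first row with the root weights
    L_bar = L.copy()
    L_bar[0, :] = r
    L_inv = np.linalg.inv(L_bar)

    not_first = np.ones(n)
    not_first[0] = 0.0
    # P(i -> j) = A_ij * ((1 - d(j,1)) [Lbar^-1]_jj - (1 - d(i,1)) [Lbar^-1]_ji)
    edge_marginals = A * (not_first[None, :] * np.diag(L_inv)[None, :]
                          - not_first[:, None] * L_inv.T)
    # P(i is root) = r_i * [Lbar^-1]_i1
    root_marginals = r * L_inv[:, 0]
    return edge_marginals, root_marginals

# Usage: soft attention over 4 sentences. Each column j of the edge marginals
# plus the root marginal for j sums to 1, since every unit has exactly one
# parent or is the root of the tree.
rng = np.random.default_rng(0)
E, R = structural_attention(rng.normal(size=(4, 4)), rng.normal(size=4))
print(E.sum(axis=0) + R)  # approximately [1. 1. 1. 1.]
```

In a full model, `edge_scores` and `root_scores` would come from learned bilinear scoring of the RNN states of sentences (and, one level up, of documents), and the resulting marginals would weight a context vector fed to the decoder.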
