Paper Title

Context-Based Quotation Recommendation

Authors

Ansel MacLaughlin, Tao Chen, Burcu Karagol Ayan, Dan Roth

Abstract

While composing a new document, anything from a news article to an email or essay, authors often utilize direct quotes from a variety of sources. Although an author may know what point they would like to make, selecting an appropriate quote for the specific context may be time-consuming and difficult. We therefore propose a novel context-aware quote recommendation system which utilizes the content an author has already written to generate a ranked list of quotable paragraphs and spans of tokens from a given source document. We approach quote recommendation as a variant of open-domain question answering and adapt the state-of-the-art BERT-based methods from open-QA to our task. We conduct experiments on a collection of speech transcripts and associated news articles, evaluating models' paragraph ranking and span prediction performances. Our experiments confirm the strong performance of BERT-based methods on this task, which outperform bag-of-words and neural ranking baselines by more than 30% relative across all ranking metrics. Qualitative analyses show the difficulty of the paragraph and span recommendation tasks and confirm the quotability of the best BERT model's predictions, even if they are not the true selected quotes from the original news articles.
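
To make the open-QA-style formulation concrete, the sketch below treats the author's draft as the "question" and each paragraph of the source document as a "context", using an off-the-shelf extractive-QA BERT model from HuggingFace Transformers to rank paragraphs and propose a quotable span. The checkpoint name, truncation settings, and additive span score are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the open-QA-style formulation described in the abstract:
# the author's draft acts as the query, source-document paragraphs as candidate
# contexts, and an extractive-QA BERT model scores each paragraph and proposes
# a quotable span. Model choice and scoring are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

MODEL = "deepset/bert-base-cased-squad2"  # assumed stand-in for a fine-tuned BERT QA model
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForQuestionAnswering.from_pretrained(MODEL)
model.eval()

def score_paragraph(draft: str, paragraph: str):
    """Return (score, span_text) for one candidate source paragraph."""
    inputs = tokenizer(draft, paragraph, truncation=True, max_length=384,
                       return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)
    start, end = out.start_logits[0], out.end_logits[0]
    # Greedy span selection: best start token, then best end at or after it.
    # A full implementation would restrict candidates to paragraph tokens only.
    s_idx = int(torch.argmax(start))
    e_idx = s_idx + int(torch.argmax(end[s_idx:]))
    span = tokenizer.decode(inputs["input_ids"][0][s_idx:e_idx + 1],
                            skip_special_tokens=True)
    return float(start[s_idx] + end[e_idx]), span

def rank_paragraphs(draft: str, source_paragraphs: list[str]):
    """Rank source paragraphs by predicted quotability for the given draft."""
    scored = [(p, *score_paragraph(draft, p)) for p in source_paragraphs]
    return sorted(scored, key=lambda t: t[1], reverse=True)
```

Per the abstract, the paper's models are trained and evaluated on a collection of speech transcripts and associated news articles; the SQuAD-trained checkpoint above is only a placeholder for such a fine-tuned model.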
