A Text Reassembling Approach to Natural Language Generation

Paper Title

A Text Reassembling Approach to Natural Language Generation

Authors

Xiao Li, Kees van Deemter, Chenghua Lin

Abstract

Recent years have seen a number of proposals for performing Natural Language Generation (NLG) based in large part on statistical techniques. Despite having many attractive features, we argue that these existing approaches nonetheless have some important drawbacks, sometimes because the approach in question is not fully statistical (i.e., relies on a certain amount of handcrafting), sometimes because the approach in question lacks transparency. Focussing on some of the key NLG tasks (namely Content Selection, Lexical Choice, and Linguistic Realisation), we propose a novel approach, called the Text Reassembling approach to NLG (TRG), which approaches the ideal of a purely statistical approach very closely, and which is at the same time highly transparent. We evaluate the TRG approach and discuss how TRG may be extended to deal with other NLG tasks, such as Document Structuring, and Aggregation. We discuss the strengths and limitations of TRG, concluding that the method may hold particular promise for domain experts who want to build an NLG system despite having little expertise in linguistics and NLG.
