Paper Title

A Generative Model for Joint Natural Language Understanding and Generation

Paper Authors

Bo-Hsiang Tseng, Jianpeng Cheng, Yimai Fang, David Vandyke

Paper Abstract

Natural language understanding (NLU) and natural language generation (NLG) are two fundamental and related tasks in building task-oriented dialogue systems with opposite objectives: NLU tackles the transformation from natural language to formal representations, whereas NLG does the reverse. A key to success in either task is parallel training data, which is expensive to obtain at a large scale. In this work, we propose a generative model which couples NLU and NLG through a shared latent variable. This approach allows us to explore both spaces of natural language and formal representations, and facilitates information sharing through the latent space to eventually benefit NLU and NLG. Our model achieves state-of-the-art performance on two dialogue datasets with both flat and tree-structured formal representations. We also show that the model can be trained in a semi-supervised fashion by utilising unlabelled data to boost its performance.
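
To make the coupling concrete, below is a minimal sketch of the general idea: two encoder-decoder pairs, one over natural language (NL) and one over the formal meaning representation (MR), tied together by a single VAE-style latent variable so that decoding in either direction flows through the shared space. This is an illustrative PyTorch sketch under that assumption, not the authors' implementation; the class name `SharedLatentNLUNLG` and all dimensions (`emb_dim`, `hid_dim`, `z_dim`) are hypothetical.

```python
# Illustrative sketch only: a shared-latent-variable model coupling NLU and
# NLG, in the spirit of the abstract. Not the paper's actual architecture.
import torch
import torch.nn as nn

class SharedLatentNLUNLG(nn.Module):
    def __init__(self, nl_vocab, mr_vocab, emb_dim=64, hid_dim=128, z_dim=32):
        super().__init__()
        # Encoders for natural language (NL) and the formal meaning
        # representation (MR); both map a token sequence into the shared space.
        self.nl_emb = nn.Embedding(nl_vocab, emb_dim)
        self.mr_emb = nn.Embedding(mr_vocab, emb_dim)
        self.nl_enc = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.mr_enc = nn.GRU(emb_dim, hid_dim, batch_first=True)
        # Posterior parameters of the shared latent z (VAE-style assumption).
        self.to_mu = nn.Linear(hid_dim, z_dim)
        self.to_logvar = nn.Linear(hid_dim, z_dim)
        # Decoders: z -> NL (the NLG direction) and z -> MR (the NLU direction).
        self.z_to_hid = nn.Linear(z_dim, hid_dim)
        self.nl_dec = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.mr_dec = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.nl_out = nn.Linear(hid_dim, nl_vocab)
        self.mr_out = nn.Linear(hid_dim, mr_vocab)

    def encode(self, tokens, emb, enc):
        _, h = enc(emb(tokens))                    # h: (1, batch, hid_dim)
        mu, logvar = self.to_mu(h[-1]), self.to_logvar(h[-1])
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
        return z, mu, logvar

    def decode(self, z, tokens, emb, dec, out):
        h0 = torch.tanh(self.z_to_hid(z)).unsqueeze(0)   # init decoder state from z
        states, _ = dec(emb(tokens), h0)                 # teacher forcing on shifted inputs
        return out(states)                               # per-step vocabulary logits

    def forward(self, nl, mr):
        # Encode from the NL side into the shared z, then decode into both
        # spaces; encoding from the MR side is symmetric via self.mr_enc.
        z, mu, logvar = self.encode(nl, self.nl_emb, self.nl_enc)
        nl_logits = self.decode(z, nl[:, :-1], self.nl_emb, self.nl_dec, self.nl_out)
        mr_logits = self.decode(z, mr[:, :-1], self.mr_emb, self.mr_dec, self.mr_out)
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return nl_logits, mr_logits, kl
```

In training, `nl_logits` and `mr_logits` would be scored with token-level cross-entropy against the shifted target sequences, plus the KL term. Note how unlabelled data from either space can still drive the reconstruction loss on its own side through the shared latent, which is the intuition behind the semi-supervised setting mentioned in the abstract.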
