Paper Title

Improving Chinese Story Generation via Awareness of Syntactic Dependencies and Semantics

Authors

Huang, Henglin; Tang, Chen; Loakman, Tyler; Guerin, Frank; Lin, Chenghua

Abstract

Story generation aims to generate a long narrative conditioned on a given input. In spite of the success of prior works with the application of pre-trained models, current neural models for Chinese stories still struggle to generate high-quality long text narratives. We hypothesise that this stems from ambiguity in syntactically parsing the Chinese language, which does not have explicit delimiters for word segmentation. Consequently, neural models suffer from the inefficient capturing of features in Chinese narratives. In this paper, we present a new generation framework that enhances the feature capturing mechanism by informing the generation model of dependencies between words and additionally augmenting the semantic representation learning through synonym denoising training. We conduct a range of experiments, and the results demonstrate that our framework outperforms the state-of-the-art Chinese generation models on all evaluation metrics, demonstrating the benefits of enhanced dependency and semantic representation learning.
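The abstract names synonym denoising training but gives no details, so the following is only a minimal sketch of what the noising step for such an objective might look like, not the authors' implementation. It assumes the text is already segmented into words, and the `SYNONYMS` table, the function names, and the `rate` parameter are all hypothetical, introduced here purely for illustration.

```python
import random

# Hypothetical toy synonym table (an assumption for illustration only);
# a real system would load a proper Chinese thesaurus.
SYNONYMS = {
    "高兴": ["开心", "快乐"],
    "漂亮": ["美丽"],
}


def synonym_noise(tokens, synonyms, rate, rng):
    """Swap each token that has known synonyms with probability `rate`."""
    noisy = []
    for tok in tokens:
        candidates = synonyms.get(tok)
        if candidates and rng.random() < rate:
            noisy.append(rng.choice(candidates))
        else:
            noisy.append(tok)
    return noisy


def make_denoising_pair(tokens, synonyms, rate=0.3, seed=0):
    """Return (noisy_input, clean_target) for a denoising-style objective."""
    rng = random.Random(seed)
    return synonym_noise(tokens, synonyms, rate, rng), list(tokens)
```

A model trained to reconstruct the clean target from the noisy input would be pushed to map synonymous words to similar representations, which is one plausible reading of the semantic augmentation the abstract describes.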
