Keyphrase Extraction with Dynamic Graph Convolutional Networks and Diversified Inference

Paper Title

Keyphrase Extraction with Dynamic Graph Convolutional Networks and Diversified Inference

Authors

Haoyu Zhang, Dingkun Long, Guangwei Xu, Pengjun Xie, Fei Huang, Ji Wang

Abstract

Keyphrase extraction (KE) aims to produce a set of phrases that accurately express the concepts or topics covered in a given document. Recently, Sequence-to-Sequence (Seq2Seq) based generative frameworks have been widely used for the KE task and have achieved competitive performance on various benchmarks. The main challenges for Seq2Seq methods lie in acquiring an informative latent document representation and in better modeling the compositionality of the target keyphrase set, both of which directly affect the quality of the generated keyphrases. In this paper, we propose to adopt Dynamic Graph Convolutional Networks (DGCN) to address these two problems simultaneously. Concretely, we explore integrating dependency trees with GCNs for latent representation learning. Moreover, the graph structure in our model is dynamically modified during the learning process according to the generated keyphrases. In this way, our approach is able to explicitly learn the relations within the keyphrase set and to guarantee information exchange between the encoder and the decoder in both directions. Extensive experiments on various KE benchmark datasets demonstrate the effectiveness of our approach.

Paper Title