Paper Title
Explanation Graph Generation via Pre-trained Language Models: An Empirical Study with Contrastive Learning
Paper Authors
Paper Abstract
Pre-trained sequence-to-sequence language models have led to widespread success in many natural language generation tasks. However, there has been relatively less work on analyzing their ability to generate structured outputs such as graphs. Unlike natural language, graphs have distinct structural and semantic properties in the context of a downstream NLP task, e.g., generating a graph that is connected and acyclic can be attributed to its structural constraints, while the semantics of a graph can refer to how meaningfully an edge represents the relation between two node concepts. In this work, we study pre-trained language models that generate explanation graphs in an end-to-end manner and analyze their ability to learn the structural constraints and semantics of such graphs. We first show that with limited supervision, pre-trained language models often generate graphs that either violate these constraints or are semantically incoherent. Since curating a large number of human-annotated graphs is expensive and tedious, we propose simple yet effective graph perturbations via node and edge edit operations that yield structurally and semantically positive and negative graphs. Next, we leverage these graphs in different contrastive learning models with Max-Margin and InfoNCE losses. Our methods lead to significant improvements in both structural and semantic accuracy of explanation graphs and also generalize to other similar graph generation tasks. Lastly, we show that human errors are the best negatives for contrastive learning and that automatically generating more such human-like negative graphs can lead to further improvements. Our code and models are publicly available at https://github.com/swarnaHub/ExplagraphGen.
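The abstract contrasts two training objectives, Max-Margin and InfoNCE, applied to positive and perturbed negative graphs. Below is a minimal PyTorch sketch of these two losses, not the authors' implementation: it assumes each graph has already been scored by the model as a single scalar (e.g., the sequence log-likelihood of its linearization), and the names `score_pos`, `score_negs`, `margin`, and `temperature` are hypothetical.

```python
# Illustrative sketch of the two contrastive objectives mentioned in the abstract,
# assuming scalar scores per graph (e.g., seq2seq log-likelihoods of linearized graphs).
import torch
import torch.nn.functional as F


def max_margin_loss(score_pos: torch.Tensor,
                    score_negs: torch.Tensor,
                    margin: float = 1.0) -> torch.Tensor:
    """Hinge loss pushing the positive graph's score above each negative's by `margin`.

    score_pos:  shape (batch,)          score of the gold / positive graph
    score_negs: shape (batch, n_negs)   scores of perturbed negative graphs
    """
    # max(0, margin - (s_pos - s_neg)), averaged over negatives and the batch
    gaps = margin - (score_pos.unsqueeze(1) - score_negs)
    return F.relu(gaps).mean()


def info_nce_loss(score_pos: torch.Tensor,
                  score_negs: torch.Tensor,
                  temperature: float = 1.0) -> torch.Tensor:
    """InfoNCE: softmax cross-entropy where the positive graph is the correct class."""
    # Logits with the positive score in column 0 and the negatives after it.
    logits = torch.cat([score_pos.unsqueeze(1), score_negs], dim=1) / temperature
    targets = torch.zeros(logits.size(0), dtype=torch.long, device=logits.device)
    return F.cross_entropy(logits, targets)


if __name__ == "__main__":
    # Toy usage: a batch of 4 graphs with 3 negatives each, using random scores.
    s_pos = torch.randn(4, requires_grad=True)
    s_neg = torch.randn(4, 3)
    print(max_margin_loss(s_pos, s_neg).item())
    print(info_nce_loss(s_pos, s_neg).item())
```

Both losses encourage the model to rank the gold explanation graph above its structurally or semantically corrupted variants; they differ in whether negatives are penalized individually (hinge) or jointly through a softmax (InfoNCE).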