Paper Title
Deep Bidirectional Language-Knowledge Graph Pretraining
Paper Authors
Paper Abstract
Pretraining a language model (LM) on text has been shown to help various downstream NLP tasks. Recent works show that a knowledge graph (KG) can complement text data, offering structured background knowledge that provides a useful scaffold for reasoning. However, these works are not pretrained to learn a deep fusion of the two modalities at scale, limiting the potential to acquire fully joint representations of text and KG. Here we propose DRAGON (Deep Bidirectional Language-Knowledge Graph Pretraining), a self-supervised approach to pretraining a deeply joint language-knowledge foundation model from text and KG at scale. Specifically, our model takes pairs of text segments and relevant KG subgraphs as input and bidirectionally fuses information from both modalities. We pretrain this model by unifying two self-supervised reasoning tasks, masked language modeling and KG link prediction. DRAGON outperforms existing LM and LM+KG models on diverse downstream tasks including question answering across general and biomedical domains, with +5% absolute gain on average. In particular, DRAGON achieves notable performance on complex reasoning about language and knowledge (+10% on questions involving long contexts or multi-step reasoning) and low-resource QA (+8% on OBQA and RiddleSense), and new state-of-the-art results on various BioNLP tasks. Our code and trained models are available at https://github.com/michiyasunaga/dragon.
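The abstract describes the method only at a high level: the model encodes a (text segment, KG subgraph) pair, fuses the two modalities bidirectionally, and is pretrained with a unified masked-language-modeling and KG link-prediction objective. Below is a minimal, illustrative sketch of what such a joint objective can look like, not the authors' implementation; the ToyJointEncoder module, the transformer-based fusion, the DistMult-style edge scoring, and all names and dimensions are assumptions made for illustration (the actual architecture and losses are specified in the paper and the released code at the GitHub link above).

```python
# Minimal sketch of DRAGON-style joint pretraining: one encoder over a
# (text, KG subgraph) pair, trained with a masked-language-modeling loss
# plus a KG link-prediction loss. All module names, the attention-based
# fusion, and the DistMult-style scoring are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyJointEncoder(nn.Module):
    """Embeds tokens and KG nodes, then lets the two modalities attend to
    each other so information flows bidirectionally between text and KG."""

    def __init__(self, vocab_size, num_entities, num_relations, dim=64):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, dim)
        self.ent_emb = nn.Embedding(num_entities, dim)
        self.rel_emb = nn.Embedding(num_relations, dim)
        # Bidirectional fusion via self-attention over the concatenated sequence.
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.fusion = nn.TransformerEncoder(layer, num_layers=2)
        self.mlm_head = nn.Linear(dim, vocab_size)

    def forward(self, token_ids, entity_ids):
        text = self.tok_emb(token_ids)      # (B, T, dim)
        nodes = self.ent_emb(entity_ids)    # (B, N, dim)
        fused = self.fusion(torch.cat([text, nodes], dim=1))
        T = token_ids.size(1)
        return fused[:, :T], fused[:, T:]   # contextual text states, node states


def joint_loss(model, token_ids, mlm_labels, entity_ids, heads, rels, tails):
    text_h, node_h = model(token_ids, entity_ids)
    # (1) Masked language modeling: predict original ids at masked positions
    #     (positions labeled -100 are ignored, following common practice).
    mlm_logits = model.mlm_head(text_h)
    mlm = F.cross_entropy(mlm_logits.view(-1, mlm_logits.size(-1)),
                          mlm_labels.view(-1), ignore_index=-100)
    # (2) KG link prediction on held-out edges with a DistMult-style score
    #     <h * r, t>, trained against corrupted tails as negatives.
    d = node_h.size(-1)
    h = node_h.gather(1, heads.unsqueeze(-1).expand(-1, -1, d))
    t = node_h.gather(1, tails.unsqueeze(-1).expand(-1, -1, d))
    r = model.rel_emb(rels)
    pos = (h * r * t).sum(-1)
    neg_t = t[:, torch.randperm(t.size(1))]   # corrupt tails by shuffling edges
    neg = (h * r * neg_t).sum(-1)
    link = F.binary_cross_entropy_with_logits(
        torch.cat([pos, neg]),
        torch.cat([torch.ones_like(pos), torch.zeros_like(neg)]))
    return mlm + link                         # unified self-supervised objective


if __name__ == "__main__":
    B, T, N, E = 2, 16, 8, 4                  # batch, tokens, KG nodes, held-out edges
    model = ToyJointEncoder(vocab_size=100, num_entities=50, num_relations=10)
    token_ids = torch.randint(0, 100, (B, T))
    mlm_labels = token_ids.clone()
    mlm_labels[:, 4:] = -100                  # supervise only a few "masked" positions
    entity_ids = torch.randint(0, 50, (B, N))
    heads = torch.randint(0, N, (B, E))
    rels = torch.randint(0, 10, (B, E))
    tails = torch.randint(0, N, (B, E))
    print(joint_loss(model, token_ids, mlm_labels, entity_ids, heads, rels, tails))
```

The point of the sketch is the unified loss: masked tokens must be recovered with help from the KG nodes, and held-out edges must be scored with help from the text, so gradients from both self-supervised tasks flow through the shared, bidirectionally fused representation.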