Title

Document-Level Relation Extraction with Adaptive Focal Loss and Knowledge Distillation

Authors

Qingyu Tan, Ruidan He, Lidong Bing, Hwee Tou Ng

Abstract

Document-level Relation Extraction (DocRE) is a more challenging task compared to its sentence-level counterpart. It aims to extract relations from multiple sentences at once. In this paper, we propose a semi-supervised framework for DocRE with three novel components. First, we use an axial attention module to learn the interdependency among entity pairs, which improves performance on two-hop relations. Second, we propose an adaptive focal loss to tackle the class imbalance problem of DocRE. Finally, we use knowledge distillation to overcome the differences between human-annotated data and distantly supervised data. We conducted experiments on two DocRE datasets. Our model consistently outperforms strong baselines, and its performance exceeds the previous SOTA by 1.36 F1 and 1.46 Ign_F1 points on the DocRED leaderboard. Our code and data will be released at https://github.com/tonytan48/KD-DocRE.
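The abstract names the three components but not their exact formulations, so the following is a minimal, hedged sketch of what they might look like in PyTorch. It assumes a generic axial attention over the n × n entity-pair feature map, a frequency-weighted focal loss as one plausible "adaptive" variant, and a standard soft-label distillation objective; the class and function names, the weighting scheme, and the temperature are illustrative assumptions rather than the paper's actual method (the authors' released code at https://github.com/tonytan48/KD-DocRE is the reference implementation).

```python
# Illustrative sketches only; not the authors' exact formulations.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AxialAttention(nn.Module):
    """Attend along rows, then columns, of a (batch, n, n, dim) entity-pair
    feature map, so each pair exchanges information with pairs that share
    an entity -- the intuition behind improving two-hop relations."""

    def __init__(self, dim, heads=8):
        super().__init__()
        self.row_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.col_attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):                          # x: (b, n, n, d)
        b, n, _, d = x.shape
        rows = x.reshape(b * n, n, d)              # each row is one sequence
        rows, _ = self.row_attn(rows, rows, rows)
        row_out = rows.reshape(b, n, n, d)
        cols = row_out.transpose(1, 2).reshape(b * n, n, d)
        cols, _ = self.col_attn(cols, cols, cols)
        col_out = cols.reshape(b, n, n, d).transpose(1, 2)
        return x + col_out                         # residual connection


def adaptive_focal_loss(logits, labels, class_freq, gamma=2.0, eps=1e-8):
    """Multi-label focal loss with per-class weights derived from label
    frequency (a hypothetical 'adaptive' scheme). `labels` is a float
    multi-hot tensor of shape (batch, num_classes)."""
    alpha = 1.0 / (class_freq.float() + eps)       # rarer classes weigh more
    alpha = alpha / alpha.sum() * class_freq.numel()
    p = torch.sigmoid(logits)
    p_t = p * labels + (1 - p) * (1 - labels)      # prob. of the true outcome
    bce = F.binary_cross_entropy_with_logits(logits, labels, reduction="none")
    # The (1 - p_t)^gamma factor down-weights easy, well-classified examples.
    return (alpha * (1 - p_t) ** gamma * bce).mean()


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Generic soft-label distillation: the student matches the teacher's
    temperature-softened probabilities on distantly supervised examples."""
    with torch.no_grad():
        soft = torch.sigmoid(teacher_logits / temperature)
    return F.binary_cross_entropy_with_logits(student_logits / temperature, soft)


# Toy usage: DocRED has 96 relation types plus the NA class.
logits = torch.randn(4, 97)
labels = torch.zeros(4, 97)
labels[0, 3] = 1.0
freq = torch.randint(1, 1000, (97,))
print(adaptive_focal_loss(logits, labels, freq))
```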
