Paper Title
Improving Distantly Supervised Relation Extraction by Natural Language Inference

Authors

Kang Zhou, Qiao Qiao, Yuepei Li, Qi Li

Abstract

To reduce human annotations for relation extraction (RE) tasks, distantly supervised approaches have been proposed, but they struggle with low performance. In this work, we propose a novel DSRE-NLI framework, which considers both distant supervision from existing knowledge bases and indirect supervision from pretrained language models for other tasks. DSRE-NLI energizes an off-the-shelf natural language inference (NLI) engine with a semi-automatic relation verbalization (SARV) mechanism to provide indirect supervision and further consolidates the distant annotations to benefit multi-classification RE models. The NLI-based indirect supervision acquires only one relation verbalization template from humans as a semantically general template for each relation, and then the template set is enriched by high-quality textual patterns automatically mined from the distantly annotated corpus. With two simple and effective data consolidation strategies, the quality of training data is substantially improved. Extensive experiments demonstrate that the proposed framework significantly improves the SOTA performance (up to 7.73% of F1) on distantly supervised RE benchmark datasets.
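The core idea described above can be sketched as follows: each candidate relation is verbalized into one or more template sentences, and an NLI engine checks whether the input sentence entails any of them. This is a minimal, self-contained sketch, not the authors' implementation: the relation names, templates, and `nli_entails` scorer (a toy word-overlap stand-in for a real pretrained NLI model) are all illustrative assumptions.

```python
# Sketch of NLI-based relation classification via verbalization templates.
# Hypothetical relations and templates; a real system would mine and enrich
# these from the distantly annotated corpus as the SARV mechanism describes.
RELATION_TEMPLATES = {
    "founded": ["{subj} founded {obj}", "{subj} is the founder of {obj}"],
    "born_in": ["{subj} was born in {obj}"],
}

def nli_entails(premise: str, hypothesis: str) -> float:
    # Toy entailment score: fraction of hypothesis tokens present in the
    # premise. A real implementation would query a pretrained NLI model.
    prem = set(premise.lower().split())
    hyp = hypothesis.lower().split()
    return sum(w in prem for w in hyp) / len(hyp)

def classify(sentence: str, subj: str, obj: str, threshold: float = 0.8) -> str:
    # Verbalize each candidate relation for the entity pair and keep the
    # relation whose template is most strongly entailed by the sentence.
    best_rel, best_score = "no_relation", threshold
    for rel, templates in RELATION_TEMPLATES.items():
        for t in templates:
            score = nli_entails(sentence, t.format(subj=subj, obj=obj))
            if score > best_score:
                best_rel, best_score = rel, score
    return best_rel

print(classify("Steve Jobs founded Apple in 1976.", "Steve Jobs", "Apple"))
```

The `no_relation` fallback below the threshold mirrors how NLI-based indirect supervision can abstain on noisy distant labels, which is what enables the data consolidation step.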
