Paper Title

An Overview of Distant Supervision for Relation Extraction with a Focus on Denoising and Pre-training Methods

Author

Hogan, William

Abstract

Relation Extraction (RE) is a foundational task of natural language processing. RE seeks to transform raw, unstructured text into structured knowledge by identifying relational information between entity pairs found in text. RE has numerous uses, such as knowledge graph completion, text summarization, question-answering, and search querying. The history of RE methods can be roughly organized into four phases: pattern-based RE, statistical-based RE, neural-based RE, and large language model-based RE. This survey begins with an overview of a few exemplary works in the earlier phases of RE, highlighting limitations and shortcomings to contextualize progress. Next, we review popular benchmarks and critically examine metrics used to assess RE performance. We then discuss distant supervision, a paradigm that has shaped the development of modern RE methods. Lastly, we review recent RE works focusing on denoising and pre-training methods.

Paper Title