Paper Title


Generative Biomedical Entity Linking via Knowledge Base-Guided Pre-training and Synonyms-Aware Fine-tuning

Authors

Hongyi Yuan, Zheng Yuan, Sheng Yu

Abstract


Entities lie at the heart of biomedical natural language understanding, and the biomedical entity linking (EL) task remains challenging due to fine-grained and diversiform concept names. Generative methods achieve remarkable performance in general-domain EL with less memory usage, but require expensive pre-training. Previous biomedical EL methods leverage synonyms from knowledge bases (KBs), which are not trivial to inject into a generative method. In this work, we model biomedical EL with a generative approach and propose to inject synonym knowledge into it. We propose KB-guided pre-training by constructing synthetic samples with synonyms and definitions from the KB and requiring the model to recover concept names. We also propose synonyms-aware fine-tuning to select concept names for training, and propose a decoder prompt and a multi-synonyms constrained prefix tree for inference. Our method achieves state-of-the-art results on several biomedical EL tasks without candidate selection, which demonstrates the effectiveness of the proposed pre-training and fine-tuning strategies.
