Paper Title


Generative Biomedical Entity Linking via Knowledge Base-Guided Pre-training and Synonyms-Aware Fine-tuning

Authors

Hongyi Yuan, Zheng Yuan, Sheng Yu

Abstract


Entities lie at the heart of biomedical natural language understanding, and the biomedical entity linking (EL) task remains challenging due to fine-grained and diversiform concept names. Generative methods achieve remarkable performance in general-domain EL with less memory usage, but require expensive pre-training. Previous biomedical EL methods leverage synonyms from knowledge bases (KBs), which are not trivial to inject into a generative method. In this work, we model biomedical EL with a generative approach and propose to inject synonym knowledge into it. We propose KB-guided pre-training by constructing synthetic samples with synonyms and definitions from the KB and requiring the model to recover concept names. We also propose synonyms-aware fine-tuning to select concept names for training, and propose a decoder prompt and a multi-synonyms constrained prefix tree for inference. Our method achieves state-of-the-art results on several biomedical EL tasks without candidate selection, which demonstrates the effectiveness of the proposed pre-training and fine-tuning strategies.
