Paper Title
GERNERMED++: Transfer Learning in German Medical NLP
Authors
Abstract
We present a statistical model for German medical natural language processing, trained for named entity recognition (NER) and released as an open, publicly available model. The work is a refined successor to our first GERNERMED model, which it substantially outperforms. We demonstrate the effectiveness of combining multiple techniques to achieve strong entity recognition performance by means of transfer learning on pretrained deep language models (LM), word alignment, and neural machine translation. Because open, public medical entity recognition models for German texts are scarce, this work benefits the German medical NLP research community as a baseline model. Since our model is based on public English data, its weights are provided without legal restrictions on usage and distribution. The sample code and the statistical model are available at: https://github.com/frankkramer-lab/GERNERMED-pp