Paper title
Neural Named Entity Recognition for Kazakh
Paper authors
Paper abstract
We present several neural networks for named entity recognition (NER) in morphologically complex languages (MCLs). Kazakh is one such language: each root/stem can produce hundreds or thousands of variant word forms. This property can cause a severe data sparsity problem, which may prevent deep learning models from being trained well for under-resourced MCLs. To model the words of MCLs effectively, we introduce root embeddings, entity tag embeddings, and a tensor layer into the neural networks. These additions contribute significantly to improving NER model performance for MCLs. The proposed models outperform state-of-the-art approaches, including character-based ones, and can potentially be applied to other morphologically complex languages.
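The sparsity argument in the abstract can be made concrete with a small sketch. It is not the authors' implementation: the lookup table `ROOT_OF`, the helper names `make_table` and `token_features`, the embedding sizes, and the choice to concatenate word and root vectors are all assumptions made for illustration. The point is only that several inflected forms of a place name can share one root vector, so evidence about the root is pooled across forms.

```python
import random

# Hypothetical toy analyses: surface form -> root. A real system would obtain
# these from a morphological analyzer; this table is an illustrative assumption.
ROOT_OF = {
    "Алматы": "Алматы",
    "Алматыда": "Алматы",    # locative
    "Алматыға": "Алматы",    # dative
    "Алматыдан": "Алматы",   # ablative
    "Астана": "Астана",
    "Астанада": "Астана",    # locative
    "Астанаға": "Астана",    # dative
    "Астанадан": "Астана",   # ablative
}

WORD_DIM, ROOT_DIM = 4, 3  # assumed sizes, chosen only to keep the example small


def make_table(keys, dim, seed):
    """Build a randomly initialised embedding table (one vector per key)."""
    rng = random.Random(seed)
    return {k: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for k in sorted(keys)}


word_emb = make_table(ROOT_OF.keys(), WORD_DIM, seed=0)
root_emb = make_table(set(ROOT_OF.values()), ROOT_DIM, seed=1)


def token_features(word):
    """Concatenate a word's own embedding with the embedding of its root.

    Concatenation is an assumption; the abstract does not say how the root
    embedding is combined with other inputs.
    """
    return word_emb[word] + root_emb[ROOT_OF[word]]


if __name__ == "__main__":
    print(len(word_emb), "surface forms share", len(root_emb), "roots")
    print("feature size:", len(token_features("Алматыда")))
```

In this toy setting, every inflected form of the same name carries an identical root segment in its feature vector, even if that particular surface form is rare in the training data.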