Paper Title
UNER: Universal Named-Entity Recognition Framework
Authors
Abstract
We introduce the Universal Named-Entity Recognition (UNER) framework, a 4-level classification hierarchy, and the methodology that is being adopted to create the first multilingual UNER corpus: the SETimes parallel corpus annotated for named entities. First, the English SETimes corpus will be annotated using existing tools and knowledge bases. After the resulting annotations are evaluated through crowdsourcing campaigns, they will be propagated automatically to the other languages within the SETimes corpora. Finally, as an extrinsic evaluation, the UNER multilingual dataset will be used to train and test available NER tools. As part of future research directions, we aim to increase the number of languages in the UNER corpus and to investigate possible ways of integrating UNER with available knowledge graphs to improve named-entity recognition.