Paper Title
Duplication Detection in Knowledge Graphs: Literature and Tools
Authors
Abstract
In recent years, an increasing number of knowledge graphs (KGs) have been created to store cross-domain knowledge and billions of facts, which form the basis of customer applications such as search engines. However, KGs inevitably contain inconsistencies, such as duplicates that may generate conflicting property values. Duplication detection (DD) aims to identify duplicated entities and resolve their conflicting property values effectively and efficiently. In this paper, we review the literature on DD methods and tools and evaluate them. Our main contributions are a performance evaluation of DD tools in KGs, suggestions for improvement, and a DD workflow to support the future development of DD tools, which are based on desirable features identified through this study.