Paper Title
OntoMerger: An Ontology Integration Library for Deduplicating and Connecting Knowledge Graph Nodes
Paper Authors
Paper Abstract
Node duplication is a common problem when building knowledge graphs (KGs) from heterogeneous datasets, where it is crucial to be able to merge nodes that have the same meaning. OntoMerger is a Python ontology integration library that deduplicates KG nodes. Our approach takes a set of KG nodes, mappings, and disconnected hierarchies, and generates a set of merged nodes together with a connected hierarchy. In addition, the library provides analytic and data testing functionalities that can be used to fine-tune the inputs, further reduce duplication, and increase the connectivity of the output graph. OntoMerger can be applied to a wide variety of ontologies and KGs. In this paper, we introduce OntoMerger and illustrate its functionality on a real-world biomedical KG.
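The input/output contract described in the abstract (nodes, mappings, and disconnected hierarchies in; merged nodes and a connected hierarchy out) can be illustrated with a small sketch. This is not OntoMerger's API or its actual algorithm: the function names `merge_nodes` and `rewrite_hierarchy`, the union-find grouping, the prefix-based priority rule, and the toy identifiers such as `A:1` are all assumptions made for illustration.

```python
from collections import defaultdict


def merge_nodes(nodes, mappings, priority):
    """Group nodes linked by equivalence mappings (union-find) and
    return a dict mapping every node ID to its group's canonical ID.

    Hypothetical helper: the canonical ID is the member whose prefix
    (the text before ':') appears earliest in `priority`.
    """
    parent = {n: n for n in nodes}

    def find(x):
        parent.setdefault(x, x)  # tolerate IDs that appear only in mappings
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in mappings:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = defaultdict(list)
    for n in list(parent):
        groups[find(n)].append(n)

    def rank(node_id):
        prefix = node_id.split(":", 1)[0]
        position = priority.index(prefix) if prefix in priority else len(priority)
        return (position, node_id)

    canonical = {}
    for members in groups.values():
        winner = min(members, key=rank)
        for member in members:
            canonical[member] = winner
    return canonical


def rewrite_hierarchy(edges, canonical):
    """Rewrite (child, parent) edges onto canonical IDs, dropping
    duplicates and self-loops created by merging."""
    rewritten = {(canonical.get(c, c), canonical.get(p, p)) for c, p in edges}
    return {(c, p) for c, p in rewritten if c != p}


# Toy data (made up): three sources A, B, C describing two concepts.
nodes = ["A:1", "B:1", "C:1", "A:2", "B:2"]
mappings = [("B:1", "A:1"), ("C:1", "B:1"), ("B:2", "A:2")]
edges = [("B:1", "B:2"), ("C:1", "A:2")]  # (child, parent) from separate hierarchies

canonical = merge_nodes(nodes, mappings, priority=["A", "B", "C"])
merged_edges = rewrite_hierarchy(edges, canonical)
print(sorted(set(canonical.values())))  # canonical node IDs after merging
print(sorted(merged_edges))             # hierarchy edges on canonical IDs
```

In this toy setup, five input nodes collapse into two merged nodes, and the two hierarchy fragments that previously referred to different identifiers become a single edge between the canonical IDs, which is the kind of deduplication and connectivity gain the abstract describes.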