Paper title
Aligning geographic entities from historical maps for building knowledge graphs
Paper authors
Paper abstract
Historical maps contain rich geographic information about the past of a region. They are sometimes the only source of information from the period before digital maps became available. Despite their valuable content, the information in historical maps is often difficult to access and use, because the maps exist only as paper documents or scanned images. An analysis that requires synthesizing information from multiple historical maps is even more time-consuming and labor-intensive. One way to facilitate the use of the geographic information contained in historical maps is to build a geographic knowledge graph (GKG) from them. This paper proposes a general workflow for completing one important step in building such a GKG, namely aligning the same geographic entities across different maps. We present this workflow and the related methods for implementing it, and systematically evaluate their performance using two different datasets of historical maps. The evaluation results show that machine learning and deep learning models for matching place names are sensitive to the thresholds learned from the training data, and that a combination of measures based on string similarity, spatial distance, and approximate topological relation achieves the best performance, with an average F-score of 0.89.
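To make the idea of combining measures concrete, the following is a minimal Python sketch of aligning entities from two maps using a weighted sum of name similarity and spatial proximity. It is an illustration only, not the paper's implementation: the function names, the weights, the 5 km cutoff, the 0.7 threshold, and the sample place names and coordinates are all assumptions introduced here, and the approximate topological relation mentioned in the abstract is omitted for brevity.

```python
import math
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Case-insensitive string similarity in [0, 1] (difflib ratio)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(h))


def spatial_score(e1, e2, max_km=5.0):
    """Map distance to [0, 1]: 1 at zero distance, 0 at or beyond max_km (assumed cutoff)."""
    d = haversine_km(e1["lat"], e1["lon"], e2["lat"], e2["lon"])
    return max(0.0, 1.0 - d / max_km)


def match_score(e1, e2, w_name=0.6, w_space=0.4):
    """Weighted combination of the two measures; the weights are illustrative assumptions."""
    return w_name * name_similarity(e1["name"], e2["name"]) + w_space * spatial_score(e1, e2)


def align(map_a, map_b, threshold=0.7):
    """For each entity in map_a, keep its best-scoring candidate in map_b if it clears the threshold."""
    pairs = []
    for a in map_a:
        best = max(map_b, key=lambda b: match_score(a, b))
        if match_score(a, best) >= threshold:
            pairs.append((a["name"], best["name"]))
    return pairs


# Hypothetical entities from two map editions (names and coordinates are made up).
map_a = [
    {"name": "Springfield", "lat": 40.0, "lon": -105.0},
    {"name": "Oak Creek", "lat": 40.2, "lon": -105.2},
]
map_b = [
    {"name": "Springfeild", "lat": 40.001, "lon": -105.001},
    {"name": "Pine Ridge", "lat": 41.0, "lon": -106.0},
]

print(align(map_a, map_b))
```

In this sketch the name weight is kept below the threshold, so a pair can never match on name alone when the two locations are farther apart than the cutoff. A fixed threshold like this is exactly the kind of parameter the abstract reports learned models to be sensitive to, so in practice it would need to be tuned per dataset.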