Paper title
Enhancing spatial and textual analysis with EUPEG: an extensible and unified platform for evaluating geoparsers
Paper authors
Paper abstract
A wealth of geographic information exists in unstructured texts, such as Web pages, social media posts, housing advertisements, and historical archives. Geoparsers are useful tools that extract structured geographic information from unstructured texts, thereby enabling spatial analysis on textual data. While a number of geoparsers have been developed, they were tested on different datasets using different metrics. Consequently, it is difficult to compare existing geoparsers or to compare a new geoparser with existing ones. In recent years, researchers have created open and annotated corpora for testing geoparsers. While these corpora are extremely valuable, much effort is still needed for a researcher to prepare these datasets and deploy geoparsers for comparative experiments. This paper presents EUPEG: an Extensible and Unified Platform for Evaluating Geoparsers. EUPEG is an open-source, Web-based benchmarking platform that hosts a majority of the open corpora, geoparsers, and performance metrics reported in the literature. It enables direct comparison of the hosted geoparsers, and a new geoparser can be connected to EUPEG and compared with the others. The main objective of EUPEG is to reduce the time and effort that researchers spend preparing datasets and baselines, thereby increasing the efficiency and effectiveness of comparative experiments.