Paper Title

KPI-EDGAR: A Novel Dataset and Accompanying Metric for Relation Extraction from Financial Documents

Authors

Tobias Deußer, Syed Musharraf Ali, Lars Hillebrand, Desiana Nurchalifah, Basil Jacob, Christian Bauckhage, Rafet Sifa

Abstract

We introduce KPI-EDGAR, a novel dataset for Joint Named Entity Recognition and Relation Extraction building on financial reports uploaded to the Electronic Data Gathering, Analysis, and Retrieval (EDGAR) system, where the main objective is to extract Key Performance Indicators (KPIs) from financial documents and link them to their numerical values and other attributes. We further provide four accompanying baselines for benchmarking potential future research. Additionally, we propose a new way of measuring the success of said extraction process by incorporating a word-level weighting scheme into the conventional F1 score to better model the inherently fuzzy borders of the entity pairs of a relation in this domain.
