Paper Title

Interpretable Multi-dataset Evaluation for Named Entity Recognition

Authors

Jinlan Fu, Pengfei Liu, Graham Neubig

Abstract

With the proliferation of models for natural language processing tasks, it is even harder to understand the differences between models and their relative merits. Simply looking at differences between holistic metrics such as accuracy, BLEU, or F1 does not tell us why or how particular methods perform differently and how diverse datasets influence the model design choices. In this paper, we present a general methodology for interpretable evaluation for the named entity recognition (NER) task. The proposed evaluation method enables us to interpret the differences in models and datasets, as well as the interplay between them, identifying the strengths and weaknesses of current systems. By making our analysis tool available, we make it easy for future researchers to run similar analyses and drive progress in this area: https://github.com/neulab/InterpretEval.
